Zikang Tian | AI Researcher

Selected Publications

Code Driven Planning with Domain-Adaptive Selector (First Author)

Zikang Tian, Shaohui Peng, Di Huang, Yewen Pu, Xing Hu, Yunji Chen, et al.

ICLR 2026 (CCF-A)

Designed a code-generation-based agent planning framework supporting ALFWorld, NetHack, and StarCraft II. Proposed a Domain-Adaptive Selector via RL fine-tuning. +19% success rate, -79% token consumption vs. SOTA.

Decompose a task into generalizable subtasks in multi-agent reinforcement learning (First Author, Cited: 27)

Zikang Tian, Ruizhi Chen, Xing Hu, Ling Li, Rui Zhang, Yunji Chen, et al.

NeurIPS 2023 (CCF-A)

Proposed task-agnostic skill extraction with adaptive semantic construction for zero-shot multi-agent generalization. +22% to +34% success rate on unseen tasks.


Boosting VLM's Spatial Intelligence via Latent Space World Model (First Author)

Zikang Tian, Di Huang, Shaohui Peng, Xing Hu, Yunji Chen, et al.

Nature Machine Intelligence (Under Review)

Proposed a multi-expert Latent Space World Model that gives VLMs a "mental simulation" capability for 3D spatial reasoning. +51% success rate on MRT, 3D origami, and block assembly benchmarks.

HYVIN: Grounding Large Language Models with Self-Driven Skill Learning (Co-author, Cited: 11)

Shaohui Peng, Xing Hu, ..., Zikang Tian, Yunji Chen, et al.

AAAI 2024

Designed a "Hypothesis-Verification-Induction-Deduction" framework for automated skill learning. +81.32% success rate on BabyAI over LLM baselines.

Luban: Building open-ended creative agents via autonomous embodied verification (Co-author, Cited: 5)

Yuxuan Guo, ..., Zikang Tian, et al.

Nature Machine Intelligence (Under Review)

Proposed an embodied agent with automated verification for open-ended functional design tasks with open goals and abstract criteria.

Transfer-Controllable Policy for Model Protection in Deep Reinforcement Learning (Co-author)

Ruizhi Chen, ..., Zikang Tian, Yunji Chen, et al.

Under Review

Proposed Transfer-Controllable RL (TCRL) framework to protect deep RL policy models from unauthorized transfer.

Experience

Cambricon — Embodied Large Model Algorithm Intern 2024.10 – 2025.04

Advisor: Prof. Xing Hu · Beijing

Built teleoperation data collection platform with dexterous hand, UR5e arm, and Apple Vision Pro for VLA training
Fine-tuned Qwen2.5-VL-7B for robotic task planning & control, achieving 87% avg success rate on tabletop tasks

Cambricon — Virtual World & RL Algorithm Intern 2021.05 – 2021.10

Advisor: Prof. Zidong Du · Beijing

Designed virtual survival environment Eden with diverse task types for agent training and validation
Implemented DQN, PPO, QMIX, and OpenAI Five with stable convergence across survival tasks

Technical Skills

Programming & Tools

Python, C/C++, Linux, Shell, Git

LLM & Agent

Transformer, LLaMA, Qwen, LLaMA-Factory (SFT/DPO/PPO/GRPO/DAPO), LoRA/QLoRA, ReAct, Tool Use, Code Generation

Reinforcement Learning

DQN, PPO, SAC, A3C, Reward Shaping, Multi-Agent, Robosuite, MimicGen, Robomimic

English

CET-6: 579

Academic Service

NeurIPS 2025 Reviewer, AAAI 2026 Reviewer

Merit Student, University of Chinese Academy of Sciences (2024)
Outstanding Student, State Key Lab of Processors, ICT, CAS (2024)
Second-class Scholarship, Hebei University of Technology (2019, 2020)
Third-class Scholarship, Hebei University of Technology (2017, 2018)

Zikang Tian | 田子康