Zikang Tian | 田子康

Ph.D. Candidate, Institute of Computing Technology, Chinese Academy of Sciences (ICT, CAS)

Advisors: Prof. Xing Hu & Prof. Zidong Du

My research focuses on building intelligent agents that can plan, reason, and act in complex environments. I am particularly interested in Reinforcement Learning, LLM-based Agents, Planning, and Reasoning via World Models.

Selected Publications

Zikang Tian, Shaohui Peng, Di Huang, Yewen Pu, Xing Hu, Yunji Chen, et al.
ICLR 2026 (CCF-A)
Designed a code-generation-based agent planning framework supporting ALFWorld, NetHack, and StarCraft II. Proposed a Domain-Adaptive Selector trained via RL fine-tuning. +19% success rate, -79% token consumption vs. SOTA.
Zikang Tian, Ruizhi Chen, Xing Hu, Ling Li, Rui Zhang, Yunji Chen, et al.
NeurIPS 2023 (CCF-A)
Proposed task-agnostic skill extraction with adaptive semantic construction for zero-shot multi-agent generalization. +22% to +34% success rate on unseen tasks.
Boosting VLM's Spatial Intelligence via Latent Space World Model (First Author)
Zikang Tian, Di Huang, Shaohui Peng, Xing Hu, Yunji Chen, et al.
Nature Machine Intelligence (Under Review)
Proposed a multi-expert Latent Space World Model that gives VLMs a "mental simulation" capability for 3D spatial reasoning. +51% success rate on MRT, 3D origami, and block assembly benchmarks.
Shaohui Peng, Xing Hu, ..., Zikang Tian, Yunji Chen, et al.
AAAI 2024
Designed a "Hypothesis-Verification-Induction-Deduction" framework for automated skill learning. +81.32% success rate on BabyAI over LLM baselines.
Yuxuan Guo, ..., Zikang Tian, et al.
Nature Machine Intelligence (Under Review)
Proposed an embodied agent with automated verification for open-ended functional design tasks with open goals and abstract criteria.
Ruizhi Chen, ..., Zikang Tian, Yunji Chen, et al.
Under Review
Proposed the Transfer-Controllable RL (TCRL) framework to protect deep RL policy models from unauthorized transfer.

Experience

Cambricon — Embodied Large Model Algorithm Intern 2024.10 – 2025.04
Advisor: Prof. Xing Hu · Beijing
  • Built teleoperation data collection platform with dexterous hand, UR5e arm, and Apple Vision Pro for VLA training
  • Fine-tuned Qwen2.5-VL-7B for robotic task planning & control, achieving 87% avg success rate on tabletop tasks
Cambricon — Virtual World & RL Algorithm Intern 2021.05 – 2021.10
Advisor: Prof. Zidong Du · Beijing
  • Designed virtual survival environment Eden with diverse task types for agent training and validation
  • Implemented DQN, PPO, QMIX, and OpenAI Five, with stable convergence across survival tasks

Technical Skills

Programming & Tools

Python, C/C++, Linux, Shell, Git

LLM & Agent

Transformer, LLaMA, Qwen, LLaMA-Factory (SFT/DPO/PPO/GRPO/DAPO), LoRA/QLoRA, ReAct, Tool Use, Code Generation

Reinforcement Learning

DQN, PPO, SAC, A3C, Reward Shaping, Multi-Agent, Robosuite, MimicGen, Robomimic

English

CET-6: 579

Honors & Service

NeurIPS 2025 Reviewer, AAAI 2026 Reviewer (Academic Service)
Merit Student, University of Chinese Academy of Sciences, 2024
Outstanding Student, State Key Lab of Processors, ICT, CAS, 2024
Second-class Scholarship, Hebei University of Technology, 2019, 2020
Third-class Scholarship, Hebei University of Technology, 2017, 2018