About Me
I am Huatong Song (宋华彤), a first-year Master’s student at the Gaoling School of Artificial Intelligence (GSAI), Renmin University of China (RUC), supervised by Prof. Xin Zhao. I am expected to graduate in June 2028.
I enrolled in GSAI at RUC in September 2021 and received a Bachelor of Engineering degree in Artificial Intelligence in July 2025. In parallel, I also obtained a Bachelor’s degree in Finance from the Gaoli Institute of RUC. My research interests broadly cover Natural Language Processing, Large Language Models, and Agents.
Experience
- 2026.04 – Present · IQuest Research · LLM Research Intern
- 2025.11 – 2026.04 · Boss Zhipin Nanbeige · LLM Research Intern
- 2025.05 – 2025.11 · ByteDance Seed-Edge · LLM Research Intern
Publications
(* indicates equal contribution, † indicates corresponding author)
ClawGym: A Scalable Framework for Building Effective Claw Agents
Fei Bai*, Huatong Song*, Shuang Sun*, Daixuan Cheng, Yike Yang, Chuan Hao, Renyuan Li, Feng Chang, Yuan Wei, Ran Tao, Bryan Dai, Jian Yang, Wayne Xin Zhao† (authors marked with * are ordered alphabetically)
Technical Report

SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training
Huatong Song*, Lisheng Huang*, Shuang Sun*, Jinhao Jiang*, Ran Le, Daixuan Cheng, Guoxin Chen, Yiwen Hu, Zongchao Chen, Wayne Xin Zhao†, Yang Song†, Tao Zhang, Ji-Rong Wen
Technical Report

SWE-World: Building Software Engineering Agents in Docker-Free Environments
Shuang Sun*, Huatong Song*, Lisheng Huang*, Jinhao Jiang*, Ran Le, Zhihao Lv, Zongchao Chen, Yiwen Hu, Wenyang Luo, Wayne Xin Zhao†, Yang Song†, Hongteng Xu, Tao Zhang, Ji-Rong Wen
Preprint

Seed1.8 Model Card: Towards Generalized Real-World Agency
Technical Report

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
Haoming Wang, Haoyang Zou, Huatong Song, Jiazhan Feng, Junjie Fang… (in alphabetical order)
Technical Report

R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning
Huatong Song*, Jinhao Jiang*, Wenqing Tian, Zhipeng Chen, Yuhuan Wu, Jiahao Zhao, Yingqian Min, Wayne Xin Zhao†, Lei Fang, Ji-Rong Wen
EMNLP-Findings, 2025

SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis
Shuang Sun*, Huatong Song*, Yuhao Wang, Ruiyang Ren, Jinhao Jiang, Junjie Zhang, Fei Bai, Jia Deng, Wayne Xin Zhao†, Zheng Liu, Lei Fang, Zhongyuan Wang, Ji-Rong Wen
EMNLP-Findings, 2025

R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
Huatong Song*, Jinhao Jiang*, Yingqian Min, Jie Chen, Zhipeng Chen, Wayne Xin Zhao, Lei Fang, Ji-Rong Wen
Technical Report

YuLan-Mini: An Open Data-efficient Language Model
Hu Yiwen*, Huatong Song*, Jie Chen, Jia Deng, Jiapeng Wang, Kun Zhou, Yutao Zhu, Jinhao Jiang, Zican Dong, Yang Lu, Xu Miao, Wayne Xin Zhao†, Ji-Rong Wen
Technical Report

YuLan-Mini: Pushing the Limits of Open Data-efficient Language Model
Hu Yiwen*, Huatong Song*, Jie Chen, Jia Deng, Jiapeng Wang, Kun Zhou, Yutao Zhu, Jinhao Jiang, Zican Dong, Yang Lu, Xu Miao, Wayne Xin Zhao†, Ji-Rong Wen
ACL-Main-Oral, 2025

Imitate, Explore, and Self-Improve: A Reproduction Report on Slow-thinking Reasoning Systems
Yingqian Min, Zhipeng Chen, Jinhao Jiang, Jie Chen, Jia Deng, Yiwen Hu, Yiru Tang, Jiapeng Wang, Xiaoxue Cheng, Huatong Song, Wayne Xin Zhao†, Zheng Liu, Zhongyuan Wang, Ji-Rong Wen
Technical Report

LLM-in-Sandbox Elicits General Agentic Intelligence
Daixuan Cheng, Shaohan Huang, Yuxian Gu, Huatong Song, Guoxin Chen, Li Dong, Wayne Xin Zhao†, Ji-Rong Wen, Furu Wei†
Preprint

BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing?
Guoxin Chen, Fanzhe Meng, Jiale Zhao, Minghao Li, Daixuan Cheng, Huatong Song, Jie Chen, Yuzhi Lin, Hui Chen, Xin Zhao, Ruihua Song, Chang Liu, Cheng Chen, Kai Jia, Ji-Rong Wen†
Preprint
Awards
- 2025 Outstanding Undergraduate Graduation Thesis (top 5% in GSAI)
- 2024 China National Scholarship (top 1.5%)
- 2024 Linghang Dean’s Scholarship, GSAI
- 2022 First-Class Academic Scholarship, RUC
