I am a fourth-year Ph.D. student in Computer Science at Zhejiang University’s DCD Lab, jointly advised by Prof. Yang Yang and Prof. Jiarong Xu. I also collaborate closely with Prof. Carl Yang at Emory University. Prior to my doctoral studies, I obtained my Bachelor’s degree in Computer Science from Zhejiang University. I have also been a visiting student at Fudan University and the National University of Singapore.
My research centers on foundation models for structured data and LLM agents. I am particularly interested in multi-agent systems, agentic reinforcement learning, agent benchmarks, and environment-grounded agents. My current work studies how agents interact in social systems, explore and improve through reinforcement learning, and behave in long-horizon tasks with realistic tool use and execution feedback. I also work on preference modeling for alignment, and I am currently exploring embodied agents and humanoid vision-language-action (VLA) models. More broadly, I have a strong background in graph machine learning, including transfer learning, robustness, and privacy.
🔥 News
- 2026.01: 🎉 One paper accepted to WWW 2026 as an Oral.
- 2025.11: 🎉 One paper accepted to AAAI 2026.
- 2025.09: 🎉 One paper accepted to NeurIPS 2025.
- 2024.09: 🎉 Two papers accepted to NeurIPS 2024.
- 2024.05: 🎉 One paper accepted to KDD 2024.
- 2023.12: 🎉 One paper accepted to AAAI 2024.
- 2023.05: 🎉 One paper accepted to NeurIPS 2023.
🍉 Publications
My earlier work focused on graph learning foundations and foundation models for structured data, especially transfer, robustness, and privacy. My current research centers on LLM agents, including multi-agent interaction, agentic RL, agent grounding and evaluation, and preference modeling for alignment. I am also exploring embodied agents and humanoid VLA.
🤖 Agentic RL, Multi-Agent Systems, and Agent Benchmarks
This line of work studies agents as complete systems: how they interact in social environments, how they improve through agentic RL, and how they are grounded and evaluated in realistic long-horizon tasks.

Multi-Agent Social Simulation for Proactive Policy Optimization
PolicySim: An LLM-Based Agent Social Simulation Sandbox for Proactive Policy Optimization
Renhong Huang, Ning Tang, Jiarong Xu, Yuxuan Cao, Qingqian Tu, Sheng Guo, Bo Zheng, Huiyuan Liu, Yang Yang.
Oral presentation. In Proceedings of the ACM Web Conference 2026 (WWW’26).

Agentic RL for Stronger Exploration in LLM Agents
RAPO: Expanding Exploration for LLM Agents via Retrieval-Augmented Policy Optimization
Siwei Zhang, Yun Xiong, Xi Chen, Zi’an Jia, Renhong Huang, Jiarong Xu, Jiawei Zhang.
In Proceedings of the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD’26).

Environment-Grounded LLM Agents via Automated Configuration
RAT: RunAnyThing via Fully Automated Environment Configuration
Renhong Huang, Dongdong Hua, Yifei Sun, Sitao Ding, Hanyang Yuan, Daixin Wang, Yang Yang.
arXiv preprint.

Agent Benchmarking for Long-Horizon Strategic Decision Making
PTCG-Bench: Can LLM Agents Master Pokémon Trading Card Game?
Dongdong Hua, Yifei Sun, Renhong Huang, Feng Gao, Chunping Wang, Yang Yang.
arXiv preprint.
🌿 Preference Learning and Alignment
Beyond acting well, agents also need to align with nuanced and diverse preferences. This work studies how preference structures can be organized explicitly instead of treated as a flat signal.

Structured Preference Modeling for Diversified Recommendation
Tree of Preferences for Diversified Recommendation
Hanyang Yuan, Ning Tang, Tongya Zheng, Jiarong Xu, Xintong Hu, Renhong Huang, Shunyu Liu, Jiacong Hu, Jiawei Chen, Mingli Song.
In Proceedings of Advances in Neural Information Processing Systems (NeurIPS’25).
📚 Selected Graph Learning Foundations
My earlier work in graph machine learning focused on transfer, data-centric learning, and privacy/safety. I keep a concise selection here.

Graph Domain Adaptation from Data-centric Perspective
Can Modifying Data Address Graph Domain Adaptation?
Renhong Huang, Jiarong Xu, Xin Jiang, Ruichuan An, Yang Yang.
In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD’24).

Graph Pre-training from Data-centric Perspective
Better with Less: A Data-Centric Perspective on Pre-Training Graph Neural Networks
Jiarong Xu, Renhong Huang, Xin Jiang, Yuxuan Cao, Carl Yang, Chunping Wang, Yang Yang.
In Proceedings of the 36th Advances in Neural Information Processing Systems (NeurIPS’23).

Graph Extraction Attack
Extracting Training Data from Molecular Pre-trained Models
Renhong Huang, Jiarong Xu, Zhiming Yang, Xiang Si, Xin Jiang, Hanyang Yuan, Chunping Wang, Yang Yang.
In Proceedings of the 37th Advances in Neural Information Processing Systems (NeurIPS’24).
🎖 Honors and Awards
- National Scholarship (Top 1%)
- Meritorious Winner for American Mathematical Competition (Top 7%)
⚡ Demo - Some fun little web pages I made
📖 Education
- 2022.09 - 2025.03, M.Phil student, Zhejiang University, Hangzhou.
- 2018.09 - 2022.06, Undergraduate, Chu Kochen Honors College, Zhejiang University, Hangzhou.
🗞️ Academic Services
- Conference Reviewer: WWW’22/23, IJCAI’23, AAAI’23/25/26, WSDM’23, SMP’23, ICLR’23, ICML’24/26, KDD’23/24/25/26, NeurIPS’24/25.
- Teaching Assistant: Artificial Intelligence Algorithms and Systems, Zhejiang University, Fall 2023/Fall 2024.
💻 Research Topics
- Graph Transfer Learning
- Graph Adversarial Attack
- LLM Agent
- LLM Reasoning