I am a fourth-year Ph.D. student in Computer Science at Zhejiang University’s DCD Lab, jointly advised by Prof. Yang Yang and Prof. Jiarong Xu. I also collaborate closely with Prof. Carl Yang at Emory University. I received my bachelor’s degree in Computer Science from Chu Kochen Honors College, and I have been a visiting student at Fudan University and the National University of Singapore.

My research mainly focuses on structured data and large language models, including LLM agents, reasoning, alignment, and embodied intelligence. I am particularly interested in multi-agent systems, agentic reinforcement learning, agent benchmarks, environment-grounded agents, and embodied agents such as humanoid VLA. Furthermore, I have a strong background in graph machine learning, including transfer learning, robustness, and privacy.

🔥 News

2026.06: 🎉 One paper was accepted to KDD 2026 (Oral).
2026.01: 🎉 One paper was accepted to WWW 2026 (Oral).
2025.11: 🎉 One paper was accepted to AAAI 2026.
2025.09: 🎉 One paper was accepted to NeurIPS 2025.
2024.09: 🎉 Two papers were accepted to NeurIPS 2024.
2024.05: 🎉 One paper was accepted to KDD 2024.
2023.12: 🎉 One paper was accepted to AAAI 2024.
2023.05: 🎉 One paper was accepted to NeurIPS 2023.

📄 Publications

My earlier work focused on graph learning and foundation models for structured data, with an emphasis on transfer, robustness, and privacy. My current research centers on large language models, including LLM agents, multi-agent interaction, and agentic RL; reasoning, grounding, and evaluation; preference modeling for alignment; and embodied agents such as humanoid VLA systems.

Agentic RL, Multi-Agent Systems, and Agent Benchmarks

This line of work studies agents as complete systems: how they interact in social environments, how they improve through agentic RL, and how they are grounded and evaluated in realistic long-horizon tasks.

WWW 2026 Oral

Multi-Agent Social Simulation for Proactive Policy Optimization
PolicySim: An LLM-Based Agent Social Simulation Sandbox for Proactive Policy Optimization
Renhong Huang, Ning Tang, Jiarong Xu, Yuxuan Cao, Qingqian Tu, Sheng Guo, Bo Zheng, Huiyuan Liu, Yang Yang.

Oral presentation. In Proceedings of the ACM Web Conference 2026 (WWW’26). [arXiv] [PDF]

KDD 2026 Oral

Agentic RL for Stronger Exploration in LLM Agents
RAPO: Expanding Exploration for LLM Agents via Retrieval-Augmented Policy Optimization
Siwei Zhang, Yun Xiong, Xi Chen, Zi’an Jia, Renhong Huang, Jiarong Xu, Jiawei Zhang.

Oral presentation. In Proceedings of the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD’26).

arXiv 2026

Environment-Grounded LLM Agents via Automated Configuration
RAT: RunAnyThing via Fully Automated Environment Configuration
Renhong Huang, Dongdong Hua, Yifei Sun, Sitao Ding, Hanyang Yuan, Daixin Wang, Yang Yang.

arXiv preprint. [arXiv]

arXiv 2026

Agent Benchmarking for Long-Horizon Strategic Decision Making
PTCG-Bench: Can LLM Agents Master Pokémon Trading Card Game?
Dongdong Hua, Yifei Sun, Renhong Huang, Feng Gao, Chunping Wang, Yang Yang.

arXiv preprint. [arXiv]

Preference Learning and Alignment

Beyond acting well, agents also need to be aligned with nuanced and diverse preferences. This work studies how preference structures can be explicitly organized rather than treated as a flat signal.

NeurIPS 2025

Structured Preference Modeling for Diversified Recommendation
Tree of Preferences for Diversified Recommendation
Hanyang Yuan, Ning Tang, Tongya Zheng, Jiarong Xu, Xintong Hu, Renhong Huang, Shunyu Liu, Jiacong Hu, Jiawei Chen, Mingli Song.

In Advances in Neural Information Processing Systems 38 (NeurIPS’25).

Selected Graph Learning Foundations

My earlier work in graph machine learning focused on transfer, data-centric learning, and privacy/safety. I keep a concise selection here.

KDD 2024 Oral

Graph Domain Adaptation from a Data-Centric Perspective
Can Modifying Data Address Graph Domain Adaptation?
Renhong Huang, Jiarong Xu, Xin Jiang, Ruichuan An, Yang Yang.

Oral presentation. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD’24).

NeurIPS 2023

Graph Pre-training from a Data-Centric Perspective
Better with Less: A Data-Centric Perspective on Pre-Training Graph Neural Networks
Jiarong Xu, Renhong Huang, Xin Jiang, Yuxuan Cao, Carl Yang, Chunping Wang, Yang Yang.

In Advances in Neural Information Processing Systems 36 (NeurIPS’23).

NeurIPS 2024

Graph Extraction Attack
Extracting Training Data from Molecular Pre-trained Models
Renhong Huang, Jiarong Xu, Zhiming Yang, Xiang Si, Xin Jiang, Hanyang Yuan, Chunping Wang, Yang Yang.

In Advances in Neural Information Processing Systems 37 (NeurIPS’24).

🎖 Honors and Awards

National Scholarship (Top 1%)
Meritorious Winner in the American Mathematical Competition (Top 7%)

⚡ Demo - Some fun little web pages I made

📖 Education

2022.09 - 2025.03, M.Phil student, Zhejiang University, Hangzhou.
2018.09 - 2022.06, Undergraduate, Chu Kochen Honors College, Zhejiang University, Hangzhou.

🗞️ Academic Service

Conference Reviewer: WWW’22/23, IJCAI’23, AAAI’23/25/26, WSDM’23, SMP’23, ICLR’23, ICML’24/26, KDD’23/24/25/26, NeurIPS’24/25.
Teaching Assistant: Artificial Intelligence Algorithms and Systems, Zhejiang University, Fall 2023/Fall 2024.

💻 Research Topics

Large Language Models
LLM Agents
LLM Reasoning and Alignment
Embodied AI and Humanoid VLA