Quanlu Zhang
Vice President of Technology at Unanswered Chip Horizon
Quanlu Zhang is Vice President of Technology at Unanswered Chip Horizon and a former Principal Researcher in the Systems Research Group at Microsoft Research Asia, with extensive research and practical experience in intelligent computing. His research interests include distributed training systems for large models, mixed training across heterogeneous hardware, large-scale GPU cluster management and job scheduling, compilation acceleration for quantized and sparse models, and the design and development of automated machine learning systems. He has published multiple papers at top systems conferences such as OSDI, SOSP, EuroSys, ATC, and FAST.
Topic
An Open-Source Reinforcement Learning Framework for Embodied Intelligence: RLinf with Integrated Rendering, Training, and Inference
Reinforcement learning (RL) has shown tremendous potential for advancing general embodied intelligence. However, because RL workflows are inherently heterogeneous and dynamic, existing systems commonly suffer from limited scenario support, low hardware utilization, and slow training. This talk gives a systematic introduction to RLinf, our high-performance RL training framework. Built on the Macro-Micro Flow Transformation (M2Flow) system design paradigm, RLinf offers an easy-to-use interface for constructing training pipelines and allows training components to be scheduled with a high degree of flexibility. RLinf provides comprehensive support for embodied intelligence scenarios: embodied models such as pi0, pi0.5, and OpenVLA; simulation environments such as ManiSkill, LIBERO, and RoboTwin; and real-world reinforcement learning on physical robots. In embodied tasks, RLinf achieves a 120% system speedup and a 40%–60% improvement in model performance, enabling efficient training and rapid iteration for agents.