Ma Xiaoteng
Chief Scientist at Macaron AI
Ma Xiaoteng is Chief Scientist at Macaron AI and Director of Mind Lab. His work focuses on training personalized intelligent agents driven by real interactive experience, as well as building reinforcement learning infrastructure platforms. He earned his Ph.D. in the Department of Automation at Tsinghua University, where he also completed his postdoctoral research. He has published more than 30 papers in the field of reinforcement learning, with over 1,400 citations on Google Scholar.
Topic
Towards Experiential Intelligence: From Context Engineering to Context Learning
With the continuous enhancement of large model capabilities, agents are becoming a core form of applied intelligence. However, achieving true experiential intelligence requires more than manually curated context (Context Engineering); it demands that models learn, accumulate, and reuse experience from real interaction data, moving toward Context Learning. This talk will present Mind Lab's engineering practice in this area. We developed **MinT**, a LoRA-RL training foundation for post-training trillion-parameter models, enabling high-throughput, low-cost reinforcement learning iteration. Using the training of the Macaron model as a case study, we will demonstrate how Context Learning teaches models to operate dynamic UIs, turning interaction experience into reusable model capabilities and training pipelines.

Project: [https://github.com/MindLab-Research/mindlab-toolkit](https://github.com/MindLab-Research/mindlab-toolkit)

Key topics include:

**Large Model Post-Training Evolution: From Context Engineering to Context Learning**

* Compare the two capability-enhancement paths and explain why, in agent scenarios, learning from interaction is critical for achieving experiential intelligence.

**Core Challenges of Context Learning**

* Discuss the difficulties of learning from real interactions: constructing effective training signals, modeling long-term memory, and ensuring experience reusability.

**MinT: A LoRA-RL Training Foundation for Trillion-Parameter Models**

* Present MinT's core design and engineering implementation, showing how it achieves high-throughput, low-cost post-training and supports rapid iteration in agent and tool-use scenarios.

**Macaron Case Study: Training Dynamic UI Interaction via Context Learning**

* Using the Macaron model as an example, illustrate how experience trajectories are extracted from user interaction data, embedding multi-step UI-interaction capabilities into model parameters and forming an iterative training pipeline.
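The LoRA-RL idea behind MinT can be illustrated with a toy sketch (hypothetical code, not MinT's implementation): the base weights stay frozen, and only small low-rank factors receive REINFORCE-style policy-gradient updates from interaction rewards. The dimensions, the fixed "context" vector, and the one-action reward are all illustrative assumptions; at trillion-parameter scale, updating only the low-rank factors is what keeps reinforcement learning iteration cheap.

```python
# Toy LoRA-RL sketch (illustrative only, not the MinT implementation).
# Policy logits use an effective weight W + A @ B; W is frozen, and only
# the low-rank factors A (D x R) and B (R x K) get REINFORCE updates.
import math
import random

random.seed(0)

D, K, R = 3, 2, 1            # feature dim, action count, LoRA rank
LR = 0.5                     # learning rate (toy setting)

W = [[0.0] * K for _ in range(D)]     # frozen base weights (never updated)
A = [[0.3], [0.2], [0.1]]             # low-rank factor, small fixed init
B = [[0.0] * K for _ in range(R)]     # B starts at zero, so delta A@B = 0

def logits(x):
    # effective weight (W + A @ B) applied to the input features x
    out = []
    for k in range(K):
        s = sum(x[d] * W[d][k] for d in range(D))
        s += sum(x[d] * A[d][r] * B[r][k] for d in range(D) for r in range(R))
        out.append(s)
    return out

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    t = sum(e)
    return [v / t for v in e]

def reward(action):
    return 1.0 if action == 1 else 0.0    # toy task: action 1 is correct

x = [1.0, 0.5, -0.5]                      # a fixed "context" feature vector
for step in range(300):
    p = softmax(logits(x))
    a = random.choices(range(K), weights=p)[0]
    r_val = reward(a)
    # REINFORCE: d log pi / d logit_k = 1[k == a] - p_k, scaled by reward;
    # chain the gradient through A and B only, never through W.
    h = [sum(x[d] * A[d][rr] for d in range(D)) for rr in range(R)]
    dA = [[0.0] * R for _ in range(D)]
    dB = [[0.0] * K for _ in range(R)]
    for k in range(K):
        g = ((1.0 if k == a else 0.0) - p[k]) * r_val
        for rr in range(R):
            dB[rr][k] += h[rr] * g
            for d in range(D):
                dA[d][rr] += x[d] * B[rr][k] * g
    for d in range(D):
        for rr in range(R):
            A[d][rr] += LR * dA[d][rr]
    for rr in range(R):
        for k in range(K):
            B[rr][k] += LR * dB[rr][k]
```

After training, the policy strongly prefers the rewarded action while `W` is still all zeros: the learned behavior lives entirely in the low-rank adapter, which is the property a LoRA-RL stack exploits for cheap post-training and swap-in adapters.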