Wanqing He
Vice President of Qingcheng.ai
Dr. He Wanqing is currently the Vice President of Qingcheng.ai. He previously served as Senior Director at Biren Technology, where he was responsible for Turnkey systems and application optimization. His past roles include Chief Engineer at Intel DCAI, Head of High-Performance Computing and Senior Technical Expert at Alibaba Cloud, CTO of 360 Cloud, and R&D Manager at Motorola and Guodian Power. Dr. He graduated from Shanghai Jiao Tong University in 1999 and has devoted the past 25 years to HPC parallel optimization, cloud computing, and AI application performance tuning. He has also invested significant time in fostering industry-academia-research collaboration within the China Computer Federation (CCF). He has served as a CCF Executive Committee Member, Standing Committee Member of the CCF High Performance Computing Technical Committee, Vice Chair of CCF YOCSEF Headquarters, Vice Chair of ACM Hangzhou, and Chair or Committee Member of the Enterprise Track of the CNCC 2022, 2023, and 2024 Technical Forums. His honors include CCF Honorary Member, CCF Distinguished Speaker, and the CCF Outstanding Contribution Award. He has authored three books on parallel development and cloud computing, translated and published five books on internet technology, popular science, and engineering, and received the 40th Anniversary Outstanding Contribution Award from the Publishing House of Electronics Industry as well as the 2024 Best Translator Award from Cheers Publishing.
Topic
Optimization Techniques for Large-Model Training and Inference and Turnkey Performance Delivery
This talk introduces the optimization techniques behind Qingcheng Jizhi's Chitu inference engine and the Bagualu training-optimization toolkit, together with the Turnkey (Taiji) performance-delivery platform. It breaks down how Chitu achieves joint optimization across algorithms, the inference engine, and operators, going beyond the optimization of individual operator sets, and discusses how such engineering optimizations can be delivered as a PaaS product. It also shares practical applications that combine the Bagualu module principles: fine-tuning optimizations, graph compilation, hybrid quantization, memory management, and heterogeneous training. Finally, it provides end-to-end (E2E) optimization templates for deploying inference via the Turnkey platform (including, but not limited to, affinity tuning, load balancing, and cache optimizations) with real-world examples, and describes the practice of prefill/decode (PD) separation in Kubernetes clusters. Illustrative sketches of hybrid quantization, affinity tuning, and PD separation follow the outline below.

Outline:
1. Problem statement and analysis: mathematical models from scientific computing to AI inference, and their requirements for precision and algorithms
2. The origin and evolution of the Chitu inference engine; technical roadmap
3. Principles of the Bagualu training-optimization modules
4. Principles and implementation of the Turnkey (Taiji) performance-delivery engine
5. Optimization case studies
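To make the hybrid-quantization idea concrete, here is a minimal PyTorch sketch of mixed-precision weight quantization: most linear layers are stored as int8 with a per-tensor scale, while precision-sensitive layers stay in full precision. The layer keep-list and the sensitivity heuristic are hypothetical illustrations, not taken from Bagualu itself.

```python
# Hedged sketch: hybrid (mixed-precision) weight quantization.
# Assumption: "lm_head" as a precision-sensitive layer is hypothetical;
# a real toolkit would derive the keep-list from a sensitivity analysis.
import torch
import torch.nn as nn


def quantize_int8(weight: torch.Tensor):
    """Symmetric per-tensor int8 quantization: returns (q, scale)."""
    scale = weight.abs().max().clamp(min=1e-8) / 127.0
    q = torch.clamp((weight / scale).round(), -127, 127).to(torch.int8)
    return q, scale


class HybridLinear(nn.Module):
    """Linear layer storing int8 weights, dequantized on the fly."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        q, scale = quantize_int8(linear.weight.data)
        self.register_buffer("q_weight", q)
        self.register_buffer("scale", scale)
        self.bias = linear.bias

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Dequantize to the activation dtype before the matmul.
        w = self.q_weight.to(x.dtype) * self.scale.to(x.dtype)
        b = self.bias.to(x.dtype) if self.bias is not None else None
        return nn.functional.linear(x, w, b)


def apply_hybrid_quantization(model: nn.Module, keep_fp=("lm_head",)):
    """Replace each nn.Linear with an int8 version, except layers whose
    name matches the keep-list (a stand-in for real sensitivity analysis)."""
    for name, module in model.named_children():
        if isinstance(module, nn.Linear) and not any(k in name for k in keep_fp):
            setattr(model, name, HybridLinear(module))
        else:
            apply_hybrid_quantization(module, keep_fp)
    return model
```

The design point the sketch illustrates is that quantization is applied per layer rather than uniformly, so memory savings can be traded against accuracy on a layer-by-layer basis.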
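Affinity tuning of the kind a delivery template might encode can be as simple as pinning each inference worker to CPU cores local to its GPU. The sketch below uses the standard Linux-only os.sched_setaffinity call; the core-to-GPU map is a hypothetical stand-in for topology discovery (e.g. via hwloc or nvidia-smi topo).

```python
# Hedged sketch: CPU-affinity tuning for GPU inference workers.
# Assumption: the "8 cores per GPU" topology below is hypothetical.
import os
import multiprocessing as mp

CORES_PER_GPU = 8  # hypothetical: GPU i is NUMA-local to cores 8i..8i+7


def worker(gpu_id: int) -> None:
    cores = set(range(gpu_id * CORES_PER_GPU, (gpu_id + 1) * CORES_PER_GPU))
    os.sched_setaffinity(0, cores)  # 0 = current process (Linux only)
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    print(f"worker {gpu_id} pinned to cores {sorted(cores)}")
    # ... load the model and serve requests here ...


if __name__ == "__main__":
    procs = [mp.Process(target=worker, args=(i,)) for i in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```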
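Prefill/decode separation, in its simplest form, means routing the compute-bound prefill phase and the memory-bound decode phase to two distinct replica pools; in Kubernetes these would typically be two Deployments exposed through separate Services. The following is a minimal routing sketch, with hypothetical service names and a plain round-robin balancer standing in for a production load balancer.

```python
# Hedged sketch: a front-end router for PD-separated inference.
# Assumptions: the service endpoints and the "kv_cache_ready" request
# field are hypothetical; KV-cache migration between pools is elided.
import itertools

PREFILL_POOL = ["prefill-0.prefill-svc:8000", "prefill-1.prefill-svc:8000"]
DECODE_POOL = ["decode-0.decode-svc:8000", "decode-1.decode-svc:8000",
               "decode-2.decode-svc:8000"]

_prefill_rr = itertools.cycle(PREFILL_POOL)  # round-robin load balancing
_decode_rr = itertools.cycle(DECODE_POOL)


def route(request: dict) -> str:
    """New requests go to a prefill replica; once the KV cache has been
    built (and migrated), subsequent decode steps go to a decode replica."""
    if request.get("kv_cache_ready"):
        return next(_decode_rr)
    return next(_prefill_rr)


print(route({"prompt": "hello"}))        # -> a prefill replica
print(route({"kv_cache_ready": True}))   # -> a decode replica
```

Because the two pools scale independently, the decode pool can be sized for memory bandwidth and the prefill pool for compute, which is the core motivation for separating the phases.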