Shenggui Li

Core Developer of SGLang, Ph.D. Student at Nanyang Technological University, Singapore

Shenggui Li is a core developer of SGLang and leads projects such as SpecForge. He is currently a second-year Ph.D. student at Nanyang Technological University, Singapore, supervised by Professor Tianwei Zhang. His research focuses on machine learning and high-performance computing, and he has published multiple papers at international venues such as SC, ACL, and TPDS. In 2021, he was the first to propose the concept of sequence parallelism for large model training. Beyond research, he is active in open-source and industry projects: he was a founding core member of Luchen Technology and was deeply involved in the Colossal-AI and Open-Sora projects.

Topic

SpecForge: An Open-Source Framework for Training Speculative Sampling Models

SpecForge is an open-source framework developed by the SGLang team, designed to simplify the training workflow for speculative decoding models and to integrate seamlessly with the SGLang inference framework. As a project under LMSYS, SpecForge addresses the shortcomings of existing open-source speculative decoding tools, such as poor maintenance and weak compatibility with inference frameworks. It offers out-of-the-box training capabilities and supports online and offline training as well as distributed strategies such as Tensor Parallelism and FSDP, enabling developers to efficiently build high-performance speculative sampling models and accelerate LLM inference deployment.

Outline

I. Background and Motivation
   a. The value of speculative decoding
   b. Limitations of existing tools
   c. The birth of SpecForge
II. Core Features of SpecForge
   a. Out-of-the-box training capabilities
   b. Deep compatibility with SGLang
   c. High performance and flexibility
III. Technical Architecture and Core Modules
   a. Overall architecture
   b. Core module breakdown
   c. Example of a training workflow
IV. Applications and Case Studies
   a. Eagle3 model training
   b. Community use cases
V. Conclusion and Outlook
   a. Project value
   b. Future directions
   c. Community involvement
VI. Q&A
