Tech Lead of Multimodal Large Models at Kunlun Wanwei
Responsible for multimodal reasoning, multimodal reward models, and unified understanding-and-generation tasks. The **Skywork-R1V** series of multimodal reasoning models has accumulated nearly 100K downloads on Hugging Face within a single month.
Topic
Multimodal Reasoning and Unified Models
**Skywork-R1V**: The world's first industrial multimodal chain-of-thought reasoning model, designed to transfer textual reasoning capabilities to visual tasks. Its architecture connects a vision encoder to a text reasoning model through a lightweight visual projector (see the projector sketch below), while a hybrid optimization framework (iterative SFT + GRPO) strengthens cross-modal alignment and adaptive chain-of-thought distillation improves efficiency. With 38 billion parameters, it achieves 69.0 on MMMU and 67.5 on MathVista, retains leading textual-reasoning performance, and lays the foundation for unified multimodal reasoning.

**Skywork-R1V2**: Improves hybrid reinforcement learning: the SSB (Selective Sample Buffer) mechanism mitigates "advantage vanishing" in GRPO (see the advantage sketch below), the MPO strategy integrates reward-model signals with rule-based constraints, and calibrated reward thresholds reduce hallucinations. Overall performance rises to 73.6 on MMMU and 74.0 on MathVista, narrowing the gap with closed-source models while balancing specialized and general capabilities.

**Skywork-R1V3**: Upgrades cross-modal fusion and reinforcement learning with cold-start RL, key-reasoning entropy discrimination (see the entropy sketch below), an optimized visual connector, and cross-modal causal modeling. Trained on 25,000+ samples, it speeds up inference sixfold and compresses reasoning chains to one sixth of their original length. It achieves 76.0 on MMMU, surpassing some closed-source models, approaching the level of junior human experts, and ranking first among open-source models on multiple metrics.
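To make the projector idea concrete, below is a minimal sketch of a lightweight visual projector as a small MLP adapter between a frozen vision encoder and a frozen LLM. The dimensions, class name, and two-layer design are illustrative assumptions, not details taken from Skywork-R1V.

```python
import torch
import torch.nn as nn

class VisualProjector(nn.Module):
    """Two-layer MLP mapping vision-encoder features into the LLM's
    embedding space. Dimensions here are assumptions for illustration."""

    def __init__(self, vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, vision_feats: torch.Tensor) -> torch.Tensor:
        # vision_feats: (batch, num_patches, vision_dim) from a frozen encoder
        # returns:      (batch, num_patches, llm_dim) visual tokens that are
        #               concatenated with text embeddings before the LLM
        return self.proj(vision_feats)

# Only the projector is trained; the encoder and LLM stay frozen,
# which is what makes the adapter "lightweight".
projector = VisualProjector()
fake_feats = torch.randn(2, 256, 1024)   # stand-in for encoder output
visual_tokens = projector(fake_feats)    # shape: (2, 256, 4096)
```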
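The GRPO "advantage vanishing" issue can be shown in a few lines: GRPO normalizes each reward against its sampled group, so when every response in a group receives the same reward (all correct or all wrong), the advantages collapse to zero and the prompt contributes no gradient. The buffer class below is a hypothetical illustration of the SSB idea (retaining groups that still carry signal for replay), not the paper's implementation.

```python
import torch

def group_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """GRPO-style advantage: normalize each reward against its group.
    rewards: (G,) rewards for G responses sampled from one prompt."""
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# "Advantage vanishing": identical rewards give all-zero advantages.
uniform = group_advantages(torch.ones(8))                               # all ~0
mixed = group_advantages(torch.tensor([1., 1., 0., 0., 0., 0., 0., 0.]))  # informative

# A minimal Selective Sample Buffer sketch: keep groups whose advantages
# are non-zero and replay them when fresh batches carry little signal.
# (Selection and replay policies here are assumptions.)
class SelectiveSampleBuffer:
    def __init__(self, capacity: int = 1024):
        self.capacity = capacity
        self.items: list = []

    def maybe_add(self, prompt, responses, advantages: torch.Tensor):
        if advantages.abs().max() > 0:          # group is still informative
            self.items.append((prompt, responses, advantages))
            self.items = self.items[-self.capacity:]

    def replay(self, k: int):
        return self.items[-k:]                  # re-inject high-signal groups
```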
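For the entropy signal in Skywork-R1V3, here is a sketch of per-token predictive entropy computed from language-model logits; how "key reasoning" tokens are identified and how the resulting score is used are assumptions made purely for illustration.

```python
import torch
import torch.nn.functional as F

def token_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Per-token predictive entropy from LM logits: (seq, vocab) -> (seq,)."""
    logp = F.log_softmax(logits, dim=-1)
    return -(logp.exp() * logp).sum(dim=-1)

# Hypothetical usage: average entropy over tokens flagged as belonging to
# critical reasoning steps (the flagging heuristic is an assumption), and
# use the scalar, e.g., to compare candidate checkpoints.
logits = torch.randn(32, 50_000)          # stand-in for model output
is_key_step = torch.zeros(32, dtype=torch.bool)
is_key_step[10:20] = True                 # pretend these tokens are "key"
key_entropy = token_entropy(logits)[is_key_step].mean()
```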