Kimi-Dev-72B sets a new standard among open-source coding models, reaching a record 60.4% on the SWE-bench Verified benchmark. Trained with reinforcement learning (RL), the model can autonomously fix bugs in real repositories and verify its own solutions.
1. Kimi-Dev-72B scores 60.4% on SWE-bench Verified. It surpasses the runner-up, setting a new state-of-the-art result among open-source models.
2. Kimi-Dev-72B is optimized via large-scale reinforcement learning. It autonomously patches real repositories in Docker and receives a reward only when the entire test suite passes. This favors correct and robust solutions and aligns training with real-world development standards.
3. Kimi-Dev-72B is available for download and deployment on Hugging Face and GitHub. Developers and researchers are welcome to explore its capabilities and contribute to its development.
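
The all-or-nothing reward described in point 2 can be illustrated with a minimal sketch. This is not the Kimi-Dev-72B training code: the `CaseResult` structure and the `patch_reward` function are hypothetical names introduced here, and treating an empty test run as a failure is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class CaseResult:
    """Outcome of one test case (hypothetical structure, for illustration only)."""
    name: str
    passed: bool


def patch_reward(results: Iterable[CaseResult]) -> float:
    """Return 1.0 only if every test case passed, otherwise 0.0.

    Assumption: an empty result list earns no reward, so a patch that
    stops the suite from running is not mistaken for a success.
    """
    results = list(results)
    if not results:
        return 0.0
    return 1.0 if all(r.passed for r in results) else 0.0


all_green = [CaseResult("test_parse", True), CaseResult("test_io", True)]
one_red = [CaseResult("test_parse", True), CaseResult("test_io", False)]

print(patch_reward(all_green))  # 1.0
print(patch_reward(one_red))    # 0.0
```

Under this scheme a patch that fixes most tests but breaks one earns the same reward as a patch that fixes nothing, which is what pushes the policy toward complete fixes rather than partial ones.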