OpenDiLoCo 150M: decentralized AI training

Q: Who developed OpenDiLoCo 150M?

OpenDiLoCo 150M was developed by Prime Intellect (United States of America).

Q: What tasks does OpenDiLoCo 150M handle?

Text generation, question answering

// description

OpenDiLoCo 150M is an implementation of a distributed AI training method with low communication overhead. The model demonstrates that neural networks can be trained in a decentralized way by pooling compute located on different continents.

// abstract

OpenDiLoCo is an open-source implementation and replication of the Distributed Low-Communication (DiLoCo) training method for large language models. We provide a reproducible implementation of the DiLoCo experiments within a scalable, decentralized training framework built on the Hivemind library. We demonstrate its effectiveness by training a model across two continents and three countries while maintaining 90-95% compute utilization. Additionally, we conduct ablation studies on the algorithm's compute efficiency and its scalability in the number of workers, and show that its gradients can be all-reduced in FP16 without any performance degradation. Furthermore, we scale OpenDiLoCo to 3x the size of the original work, demonstrating its effectiveness for billion-parameter models.
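To make the low-communication idea concrete, here is a minimal toy sketch of a DiLoCo-style loop in plain Python. It is not the OpenDiLoCo code and does not use Hivemind. Everything in it is an assumption for illustration: the function names (`diloco`, `local_grad`, `to_fp16`), the quadratic toy loss, and the hyperparameters. The inner optimizer is plain SGD for brevity, whereas DiLoCo uses AdamW for the inner steps and Nesterov-momentum SGD for the outer step. Workers run sequentially here, and the FP16 transfer is only simulated by rounding values through half precision.

```python
import struct


def to_fp16(x: float) -> float:
    """Round-trip a float through IEEE half precision (simulates an FP16 transfer)."""
    return struct.unpack("e", struct.pack("e", x))[0]


def local_grad(theta, target):
    """Gradient of the toy loss 0.5 * (theta - target)^2, per coordinate."""
    return [t - y for t, y in zip(theta, target)]


def diloco(targets, rounds=30, inner_steps=20, inner_lr=0.1,
           outer_lr=0.7, momentum=0.9):
    """Toy DiLoCo-style loop: one worker per entry in `targets` (hypothetical setup)."""
    dim = len(targets[0])
    theta = [0.0] * dim      # shared parameters, identical on every worker
    velocity = [0.0] * dim   # momentum buffer of the outer optimizer

    for _ in range(rounds):
        pseudo_grads = []
        for target in targets:  # workers are simulated one after another
            local = list(theta)
            # Many local steps with no communication at all.
            for _ in range(inner_steps):
                g = local_grad(local, target)
                local = [w - inner_lr * gi for w, gi in zip(local, g)]
            # Pseudo-gradient: how far this worker moved from the shared weights,
            # rounded to half precision before it is "sent".
            pseudo_grads.append([to_fp16(t - w) for t, w in zip(theta, local)])

        # The only communication step per round: average the pseudo-gradients.
        avg = [sum(col) / len(col) for col in zip(*pseudo_grads)]

        # Outer update with Nesterov-style momentum SGD.
        velocity = [momentum * v + g for v, g in zip(velocity, avg)]
        theta = [t - outer_lr * (g + momentum * v)
                 for t, g, v in zip(theta, avg, velocity)]
    return theta
```

Calling `diloco([[1.0, -2.0], [3.0, 0.0], [2.0, 4.0]])` should return parameters close to the per-coordinate mean of the three targets. The point of the sketch is that workers exchange data once per round of `inner_steps` local updates rather than after every step.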
