JIUTIAN-139MoE is a highly efficient language model from China Mobile built on a Mixture-of-Experts (MoE) architecture. The model uses a system of specialized "experts" to tackle complex industrial tasks, code generation, and in-depth analytics.
We report the development of JIUTIAN-139MoE, a 13-billion active parameter language model designed to be an efficient foundation model for industrial use. It adopts a decoder-only Transformer-based Mixture-of-Experts (MoE) architecture, employing a pair of large twin experts and six small experts to capture the intelligence associated with diverse industries. In terms of training, we support training on clusters of various GPUs and NPUs, as well as lossless switching between two heterogeneous clusters. In addition, JIUTIAN-139MoE-Chat, a fine-tuned version of JIUTIAN-139MoE, surpasses state-of-the-art large language models on both open and self-built industry-standard benchmarks. Specifically, it exhibits outstanding performance on 10 industrial benchmarks and leading performance on 10 benchmarks of general knowledge understanding, reasoning, math, and coding capabilities. JIUTIAN-139MoE is released under the Apache 2.0 license and the JIUTIAN Large Model Community License Agreement. It is publicly available at https://jiutian.10086.cn/qdlake/qdh-web/#/model/detail/1070.
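To make the "pair of large twin experts and six small experts" concrete, the following is a minimal, hypothetical sketch of how such a mixed MoE layer could combine expert outputs. The abstract does not specify the routing scheme, so this sketch assumes the two twin experts are always active while a softmax router selects the top-k small experts; the experts themselves are toy scalar functions, and all names (`moe_layer`, `twin_experts`, `small_experts`) are illustrative, not the paper's actual implementation.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_layer(x, twin_experts, small_experts, router_logits, k=2):
    # ASSUMPTION: the two large twin experts are always active
    # (the abstract does not confirm this routing design).
    out = sum(f(x) for f in twin_experts)
    # Route the token to the top-k small experts, re-normalizing
    # their softmax gate scores so the selected weights sum to 1.
    gates = softmax(router_logits)
    topk = sorted(range(len(small_experts)),
                  key=lambda i: gates[i], reverse=True)[:k]
    norm = sum(gates[i] for i in topk)
    for i in topk:
        out += (gates[i] / norm) * small_experts[i](x)
    return out

# Toy configuration: two identical "twin" experts and six small experts.
twins = [lambda x: 2.0 * x, lambda x: 2.0 * x]
smalls = [lambda x, c=c: c * x for c in range(6)]  # expert i multiplies by i
logits = [0.1, 3.0, 0.2, 2.0, 0.0, 0.1]           # router favors experts 1 and 3

y = moe_layer(1.0, twins, smalls, logits)
```

With these toy values, the twins contribute 4.0 and the router blends small experts 1 and 3, so the output lands a little above 5.5; in the real model each expert would be a full feed-forward sub-network operating on hidden-state vectors rather than a scalar function.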