Language model

NAS+ESS (156M)

Northeastern University (China), Chinese Academy of Sciences, NiuTrans Research, Kingsoft
Neural Architecture Search (NAS) · Language Modeling

The NAS+ESS (156M) model pushes automatic neural architecture search beyond standard cells. The algorithm optimizes connections both within and between blocks, achieving strong performance on language modeling tasks.

Neural architecture search (NAS) has advanced significantly in recent years, but most NAS systems restrict the search to learning the architecture of a recurrent or convolutional cell. In this paper, we extend the NAS search space. In particular, we present a general approach, which we call ESS, to learn both intra-cell and inter-cell architectures. To obtain better search results, we design a joint learning method that performs intra-cell and inter-cell NAS simultaneously. We implement our model in a differentiable architecture search system. For recurrent neural language modeling, it outperforms a strong baseline significantly on the PTB and WikiText datasets, setting a new state of the art on PTB. Moreover, the learned architectures show good transferability to other systems. For example, they improve state-of-the-art systems on the CoNLL and WNUT named entity recognition (NER) tasks and the CoNLL chunking task, indicating a promising line of research on large-scale pre-learned architectures.
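To make the core mechanism concrete, below is a minimal sketch of the differentiable architecture search relaxation (in the style of DARTS) that such systems build on: each edge mixes candidate operations with softmax-weighted architecture parameters, and a second set of weights mixes the connections between cells. This is not the authors' ESS code; the candidate operation set, the `MixedOp` and `TwoCellModel` classes, and the `inter_alpha` parameter are illustrative assumptions.

```python
# Illustrative sketch of differentiable NAS with intra-cell and inter-cell
# architecture weights. Assumptions: the candidate op set, class names, and
# the inter_alpha parameter are hypothetical, not the paper's ESS code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MixedOp(nn.Module):
    """Intra-cell edge: softmax-weighted sum over candidate operations."""

    def __init__(self, dim: int):
        super().__init__()
        # Hypothetical intra-cell candidate set.
        self.ops = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, dim), nn.Tanh()),
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU()),
            nn.Identity(),  # skip connection
        ])
        # One architecture weight per candidate op, learned jointly
        # with the model weights (continuous relaxation).
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = F.softmax(self.alpha, dim=-1)
        return sum(wi * op(x) for wi, op in zip(w, self.ops))


class TwoCellModel(nn.Module):
    """Two cells whose connection pattern is itself searched (inter-cell)."""

    def __init__(self, dim: int):
        super().__init__()
        self.cell1 = MixedOp(dim)
        self.cell2 = MixedOp(dim)
        # Hypothetical inter-cell weights: how much of the raw input vs.
        # cell1's output feeds cell2, i.e. the "between blocks" connections.
        self.inter_alpha = nn.Parameter(torch.zeros(2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h1 = self.cell1(x)
        b = F.softmax(self.inter_alpha, dim=-1)
        return self.cell2(b[0] * x + b[1] * h1)


model = TwoCellModel(dim=16)
out = model(torch.randn(4, 16))
print(out.shape)  # torch.Size([4, 16])
```

Because both `alpha` and `inter_alpha` are ordinary parameters, intra-cell and inter-cell choices can be optimized simultaneously by gradient descent, which is the joint-learning idea the abstract describes; a discrete architecture is then read off by taking the highest-weighted op on each edge.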
