AWD-LSTM: Regularization and the Power of AI Language Models

Q: Who developed AWD-LSTM?

AWD-LSTM was developed by DeepMind and the University of Oxford (United Kingdom).

// tasks

Language modeling

// description

AWD-LSTM is a reference point for regularization in NLP, created by teams at DeepMind and Oxford. The model uses DropConnect, and it set new state-of-the-art (SOTA) results for recurrent neural networks on text-processing tasks.
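To make the DropConnect idea in the description concrete, here is a minimal sketch in pure Python. DropConnect zeroes individual weights rather than activations. The sketch applies it to the hidden-to-hidden matrix of a plain tanh recurrent step, which is a simplification and not a full LSTM cell. The names `dropconnect`, `rnn_step`, `w_in`, and `w_rec`, the drop probability of 0.5, and the rescaling by 1 / (1 - p) are all assumptions made for illustration, not details taken from this page.

```python
import math
import random


def dropconnect(weights, drop_prob, rng):
    """Return a copy of `weights` with individual entries zeroed at random.

    Surviving entries are divided by (1 - drop_prob) so that the expected
    value of each weight is unchanged (an assumed scaling convention).
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    keep_prob = 1.0 - drop_prob
    return [
        [w / keep_prob if rng.random() < keep_prob else 0.0 for w in row]
        for row in weights
    ]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rnn_step(x, h, w_in, w_rec):
    """One tanh recurrent step: h_new = tanh(w_in @ x + w_rec @ h)."""
    a = matvec(w_in, x)
    b = matvec(w_rec, h)
    return [math.tanh(ai + bi) for ai, bi in zip(a, b)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
hidden, inputs = 4, 3   # hypothetical toy sizes
w_in = [[rng.uniform(-0.5, 0.5) for _ in range(inputs)] for _ in range(hidden)]
w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(hidden)]

# Sample the weight mask once and reuse it at every time step of the sequence.
w_rec_dropped = dropconnect(w_rec, drop_prob=0.5, rng=rng)

h = [0.0] * hidden
sequence = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
for x in sequence:
    h = rnn_step(x, h, w_in, w_rec_dropped)
```

Because the mask is applied to the weight matrix rather than to the hidden state, the recurrent computation itself is untouched. In this sketch the same dropped weights stay dropped for the whole sequence, and at evaluation time the original `w_rec` would be used unchanged.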

// abstract

Ongoing innovations in recurrent neural network architectures have provided a steady influx of apparently state-of-the-art results on language modelling benchmarks. However, these have been evaluated using differing code bases and limited computational resources, which represent uncontrolled sources of experimental variation. We reevaluate several popular architectures and regularisation methods with large-scale automatic black-box hyperparameter tuning and arrive at the somewhat surprising conclusion that standard LSTM architectures, when properly regularised, outperform more recent models. We establish a new state of the art on the Penn Treebank and Wikitext-2 corpora, as well as strong baselines on the Hutter Prize dataset.

// faq

What is AWD-LSTM?

Who developed AWD-LSTM?

What tasks does AWD-LSTM solve?

// similar models