The QRNN architecture combines the strengths of recurrent and convolutional networks for very fast language modeling. This approach addresses the slow training of LSTMs, making it possible to process huge text corpora efficiently.
Word-level language modeling (WLM) is one of the foundational tasks of unsupervised natural language processing. Most modern architectures for WLM use several LSTM layers followed by a softmax layer. Even with larger batch sizes and a multi-GPU setup, training these networks on large-vocabulary corpora is slow, due to the increased computation in the softmax and the high cost of the recurrence. We propose a model architecture and training strategy that achieves state-of-the-art performance on the WikiText-103 dataset using a single GPU, while being substantially faster than an NVIDIA cuDNN LSTM-based model, by combining the Quasi-Recurrent Neural Network (QRNN), an adaptive softmax with weight tying, and longer sequences within batches.
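To make the speed claim concrete: in a QRNN, the gate pre-activations for all timesteps are computed in parallel (as convolutions over the sequence), and only a cheap element-wise "fo-pooling" recurrence remains sequential, in contrast to an LSTM whose full matrix multiplies sit inside the recurrence. Below is a minimal NumPy sketch of a single QRNN layer; for simplicity it uses a convolution window of 1 (a per-timestep matmul) rather than the wider windows used in practice, and all names and shapes are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def qrnn_layer(x, W_z, W_f, W_o):
    """Single QRNN layer, window size 1 (illustrative sketch).

    x: (T, d_in) input sequence; each weight matrix: (d_in, d_h).
    All T timesteps' gates come from one big matmul (the parallel,
    convolution-like part); only the element-wise fo-pooling loop
    below is sequential, and it involves no matrix products.
    """
    z = np.tanh(x @ W_z)                   # candidate values, (T, d_h)
    f = 1.0 / (1.0 + np.exp(-(x @ W_f)))   # forget gates in (0, 1)
    o = 1.0 / (1.0 + np.exp(-(x @ W_o)))   # output gates in (0, 1)

    c = np.zeros(z.shape[1])               # cell state
    h = np.empty_like(z)
    for t in range(z.shape[0]):            # fo-pooling recurrence
        c = f[t] * c + (1.0 - f[t]) * z[t]
        h[t] = o[t] * c
    return h

T, d_in, d_h = 5, 4, 3
x = rng.standard_normal((T, d_in))
W_z, W_f, W_o = (rng.standard_normal((d_in, d_h)) * 0.1 for _ in range(3))
h = qrnn_layer(x, W_z, W_f, W_o)
print(h.shape)  # (5, 3): one hidden vector per timestep
```

Because `c` is a convex combination of `tanh` outputs, the hidden states stay bounded in (-1, 1), and the sequential loop touches only vectors, which is what makes the recurrence cheap relative to an LSTM.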