Language model

Fine-tuned-AWD-LSTM-DOC (fin)

Samsung R&D Institute Russia
Language modeling

The Samsung R&D Russia team presented an improved version of AWD-LSTM, fine-tuned with the reverse Kullback-Leibler divergence. This fine-tuning lets the model achieve superior text-generation quality, reducing loss and improving linguistic coherence.

Cross-entropy loss is a common choice for multiclass classification tasks, and for language modeling in particular. Minimizing this loss yields language models of very good quality. We show that such models can be made even better if they are fine-tuned with the sum of the cross-entropy loss and the reverse Kullback-Leibler divergence. The latter is estimated using a discriminator network that we train in advance. During fine-tuning, the probabilities of rare words, which are usually underestimated by language models, increase. The novel approach we propose reaches state-of-the-art quality on the Penn Treebank: perplexity decreases from 52.4 to 52.1. Our fine-tuning algorithm is fast, scales well to different architectures and datasets, and requires almost no hyperparameter tuning: the only hyperparameter that needs tuning is the learning rate.
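To make the fine-tuning objective concrete, here is a minimal NumPy sketch of the combined loss described in the abstract: token-level cross-entropy plus a reverse-KL term estimated from a pre-trained discriminator. All function names, the density-ratio form of the estimate, and the weighting coefficient `lam` are illustrative assumptions, not the authors' actual code.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the last axis
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, targets):
    # standard token-level cross-entropy: -mean log p_model(target)
    p = softmax(logits)
    return -np.mean(np.log(p[np.arange(len(targets)), targets]))

def reverse_kl_estimate(model_probs, disc_logits):
    # Density-ratio trick (an assumption about the estimator): a discriminator
    # trained to tell data from model samples gives
    #   disc_logit(x) ≈ log(p_data(x) / p_model(x)),
    # so the reverse KL
    #   KL(p_model || p_data) = E_{x ~ p_model}[log p_model(x) - log p_data(x)]
    # can be approximated as -E_{x ~ p_model}[disc_logit(x)].
    return -np.sum(model_probs * disc_logits, axis=-1).mean()

def fine_tune_loss(logits, targets, disc_logits, lam=1.0):
    # combined objective: cross-entropy + (weighted) reverse KL estimate
    model_probs = softmax(logits)
    return cross_entropy(logits, targets) + lam * reverse_kl_estimate(model_probs, disc_logits)
```

With a neutral discriminator (all logits zero), the reverse-KL term vanishes and the loss reduces to plain cross-entropy; a discriminator that assigns low data-likelihood logits to words the model over-produces pushes probability mass back toward underestimated rare words.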
