An LSTM-based language model implemented in TensorFlow for the WikiText-2 dataset. The toolkit provides ready-made scripts that considerably simplify training neural networks and make it possible to reach competitive language modeling results quickly.
Recently, an abundance of deep learning toolkits has been made freely available. These toolkits typically offer the building blocks and sometimes simple example scripts, but designing and training a model still takes a considerable amount of time and knowledge. We present language modeling scripts based on TensorFlow that allow one to train and test competitive models directly, either with a predefined configuration or with one adapted to their needs. There are several options for input features (words, characters, words combined with characters, character n-grams) and for batching (sentence- or discourse-level). The models can be used to measure perplexity, predict the next word(s), re-score hypotheses, or generate debugging files for interpolation with n-gram models. Additionally, we make available LSTM language models trained on a variety of Dutch texts and English benchmarks, which can be used immediately, avoiding the time-consuming and computationally expensive training process. The toolkit is open source and can be found at https://github.com/lverwimp/tf-lm.
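To make the perplexity and n-gram interpolation use cases concrete, the following is a minimal sketch in plain Python. It does not use the toolkit's own scripts or file formats; the function names `perplexity` and `interpolate`, the weight `lam`, and the per-word probabilities are hypothetical and chosen only for illustration. It assumes that both models assign a probability to each word of the same test sequence.

```python
import math


def perplexity(word_probs):
    """Perplexity of a sequence from the probability assigned to each word."""
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


def interpolate(lstm_probs, ngram_probs, lam=0.5):
    """Linear interpolation: lam * P_lstm + (1 - lam) * P_ngram, per word."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(lstm_probs, ngram_probs)]


# Hypothetical per-word probabilities for a three-word test sequence.
lstm_probs = [0.2, 0.1, 0.4]
ngram_probs = [0.1, 0.3, 0.2]

mixed = interpolate(lstm_probs, ngram_probs, lam=0.5)
print("LSTM perplexity:        ", perplexity(lstm_probs))
print("n-gram perplexity:      ", perplexity(ngram_probs))
print("interpolated perplexity:", perplexity(mixed))
```

In practice the interpolation weight would be tuned on held-out data rather than fixed at 0.5.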