TCN (P-MNIST): digit recognition with convolutions

Q: Who developed TCN (P-MNIST)?

TCN (P-MNIST) was developed by Carnegie Mellon University (CMU) and Intel Labs (both in the United States of America).

Q: What tasks does TCN (P-MNIST) solve?

Image classification, digit recognition.

// tasks

Image classification, digit recognition

// description

TCN (P-MNIST) applies convolutional networks to handwritten digit recognition, with each image presented as a sequence. The model shows that a convolutional architecture can be faster and more accurate than the usual recurrent methods on classification tasks.
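To make the "digits as sequences" idea concrete, here is a minimal sketch of how a permuted MNIST (P-MNIST) input is typically built: a 28x28 image is flattened into a 784-step pixel sequence and reordered by one fixed permutation shared by all images. The helper names `make_permutation` and `image_to_sequence`, the seed, and the toy image are assumptions for illustration, not part of the original model code.

```python
import random


def make_permutation(length, seed=0):
    """Return a fixed pseudo-random ordering of the indices 0..length-1."""
    rng = random.Random(seed)  # seeded so every image gets the same ordering
    order = list(range(length))
    rng.shuffle(order)
    return order


def image_to_sequence(image, permutation=None):
    """Flatten a 2D image (list of rows) into a 1D pixel sequence.

    If a permutation is given, the pixels are reordered by it.
    """
    flat = [pixel for row in image for pixel in row]
    if permutation is None:
        return flat
    if len(permutation) != len(flat):
        raise ValueError("permutation length must match the number of pixels")
    return [flat[i] for i in permutation]


# Toy 28x28 "image" with synthetic pixel values (hypothetical data, not MNIST).
image = [[(r * 28 + c) % 256 for c in range(28)] for r in range(28)]
perm = make_permutation(28 * 28, seed=0)
sequence = image_to_sequence(image, perm)
print(len(sequence))  # 784
```

The permutation removes the spatial locality of neighbouring pixels, so the classifier has to rely on long-range dependencies across the whole sequence.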

// abstract

This paper revisits the problem of sequence modeling using convolutional architectures. Although both convolutional and recurrent architectures have a long history in sequence prediction, the current "default" mindset in much of the deep learning community is that generic sequence modeling is best handled using recurrent networks. The goal of this paper is to question this assumption. Specifically, we consider a simple generic temporal convolution network (TCN), which adopts features from modern ConvNet architectures such as dilations and residual connections. We show that on a variety of sequence modeling tasks, including many frequently used as benchmarks for evaluating recurrent networks, the TCN outperforms baseline RNN methods (LSTMs, GRUs, and vanilla RNNs) and sometimes even highly specialized approaches. We further show that the potential "infinite memory" advantage that RNNs have over TCNs is largely absent in practice: TCNs indeed exhibit longer effective history sizes than their recurrent counterparts. As a whole, we argue that it may be time to (re)consider ConvNets as the default "go to" architecture for sequence modeling.
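The two ingredients named in the abstract, dilations and residual connections, can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses a single channel, one convolution per block, and hypothetical names (`causal_dilated_conv1d`, `residual_block`, `receptive_field`). The kernel size and dilation schedule in the example are illustrative values only.

```python
def causal_dilated_conv1d(x, weights, dilation=1):
    """Causal dilated convolution: y[t] = sum_i weights[i] * x[t - i * dilation].

    Positions before the start of the sequence are treated as zero,
    so y[t] never depends on inputs later than t.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(weights):
            j = t - i * dilation
            if j >= 0:
                acc += w * x[j]
        out.append(acc)
    return out


def residual_block(x, weights, dilation):
    """Simplified residual block: output = x + ReLU(causal_conv(x))."""
    conv = causal_dilated_conv1d(x, weights, dilation)
    activated = [max(0.0, v) for v in conv]
    return [a + b for a, b in zip(x, activated)]


def receptive_field(kernel_size, dilations):
    """History length seen by a stack with one causal convolution per level."""
    return 1 + (kernel_size - 1) * sum(dilations)


x = [1.0, 2.0, 3.0, 4.0]
print(causal_dilated_conv1d(x, [1.0, 1.0], dilation=2))  # [1.0, 2.0, 4.0, 6.0]
print(receptive_field(kernel_size=2, dilations=[1, 2, 4, 8]))  # 16
```

Doubling the dilation at each level makes the receptive field grow exponentially with depth, which is how a stack of short kernels can cover a long history such as a 784-step pixel sequence.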

// faq

What is TCN (P-MNIST)?

Who developed TCN (P-MNIST)?

What tasks does TCN (P-MNIST) solve?

// similar models