[Deep Learning Training (Advanced)] Sequence Data Modeling (RNN / LSTM / Transformer), Part 7: "Transformer"

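The [Deep Learning Training (Advanced)] playlist ( https://www.youtube.com/playlist?list=PLbtqZvaoOVPA-keirzqx2wzpujxE-fzyt ) is a series of training videos that introduces a wide range of advanced topics in deep learning and machine learning. The Neural Network Console channel ( https://www.youtube.com/c/NeuralNetworkConsole/ ) also publishes videos that explain more basic deep learning material, so please take a look there as well.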

This video is Part 7 of "Sequence Data Modeling". Following the previous video on Attention, it explains the Transformer, a method that has had a major impact on the field of deep learning.

[Slide 5] Attention Is All You Need
https://arxiv.org/abs/1706.03762

[Slide 11, residual connection] Deep Residual Learning for Image Recognition
https://arxiv.org/abs/1512.03385

[Slide 11, layer normalization] Layer Normalization
https://arxiv.org/abs/1607.06450

[Slide 12] The Annotated Transformer
https://nlp.seas.harvard.edu/2018/04/03/attention.html

[Slide 13] The Illustrated Transformer
http://jalammar.github.io/illustrated-transformer/

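[Slide 13] 【世界一分かりやすい解説】イラストでみるTransformer ("[The World's Easiest-to-Understand Explanation] The Transformer in Illustrations", in Japanese)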
https://tips-memo.com/translation-jayalmmar-transformer

[Slide 17] Learning Deep Transformer Models for Machine Translation
https://arxiv.org/abs/1906.01787

[Slide 17] Understanding the Difficulty of Training Transformers
https://arxiv.org/abs/2004.08249

[Slide 18] Self-Attention with Relative Position Representations
https://arxiv.org/abs/1803.02155

[Slide 18] Position Information in Transformers: An Overview
https://arxiv.org/abs/2102.11090

[Slide 19, Gaussian Error Linear Units] Gaussian Error Linear Units
https://arxiv.org/abs/1606.08415

[Slide 19, Gated Linear Units] Language Modeling with Gated Convolutional Networks
https://arxiv.org/abs/1612.08083

[Slide 19] GLU Variants Improve Transformer
https://arxiv.org/abs/2002.05202

[Slide 20, Sparse Transformer] Generating Long Sequences with Sparse Transformers
https://arxiv.org/abs/1904.10509

[Slide 20, Reformer] Reformer: The Efficient Transformer
https://arxiv.org/abs/2001.04451

[Slide 20, Big Bird] Big Bird: Transformers for Longer Sequences
https://arxiv.org/abs/2007.14062

[Slide 21] Transformer メタサーベイ ("Transformer Meta-Survey", slides in Japanese)
https://www.slideshare.net/cvpaperchallenge/transformer-247407256

[Slide 21] Do Transformer Modifications Transfer Across Implementations and Applications?
https://arxiv.org/abs/2102.11972

--
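We have opened a video channel ( https://www.youtube.com/c/nnabla ) that shares information related to Neural Network Libraries ( https://nnabla.org/, https://github.com/sony/nnabla/ ), the open-source deep learning framework software provided by Sony. In addition to tutorials and tips for Neural Network Libraries, it will cover cutting-edge deep learning technology (lectures and introductions to the latest papers). Please subscribe and support the channel!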

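The popular YouTube channel ( https://www.youtube.com/c/NeuralNetworkConsole/ ) of Neural Network Console ( https://dl.sony.com/ ), the intuitive GUI-based deep learning development environment also provided by Sony, likewise publishes many deep learning technical courses and tool tutorials. Please subscribe to and support that channel as well.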
