Zero-Shot Reading Comprehension (00:15:21 - 00:16:44)
Stanford CS25: V1 | Transformers in Language: The development of GPT Models, GPT3
Published July 12, 2022

While the Transformer architecture is used in a variety of applications across a number of domains, it first found success in natural language. Today, Transformers remain the de facto model in language - they achieve state-of-the-art results on most natural language benchmarks, and can generate text coherent enough to deceive human readers. In this talk, we will review recent progress in neural language modeling, discuss the link between generating text and solving downstream tasks, and explore how this led to the development of GPT models at OpenAI. Next, we’ll see how the same approach can be used to produce generative models and strong representations in other domains like images, text-to-image, and code. Finally, we will dive into the recently released code generating model, Codex, and examine this particularly interesting domain of study.

Mark Chen is a research scientist at OpenAI, where he manages the Algorithms Team. His research interests include generative modeling and representation learning, especially in the image and multimodal domains. Prior to OpenAI, Mark worked in high-frequency trading and graduated from MIT. Mark is also a coach for the USA Computing Olympiad team.

View the entire CS25 Transformers United playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNiJRchCzutFw5ItR_Z27CM

0:00 Introduction
0:08 3-Gram Model (Shannon 1951)
0:27 Recurrent Neural Nets (Sutskever et al 2011)
1:12 Big LSTM (Jozefowicz et al 2016)
1:52 Transformer (Liu and Saleh et al 2018)
2:33 GPT-2: Big Transformer (Radford et al 2019)
3:38 GPT-3: Very Big Transformer (Brown et al 2020)
5:12 GPT-3: Can Humans Detect Generated News Articles?
9:09 Why Unsupervised Learning?
10:38 Is There a Big Trove of Unlabeled Data?
11:11 Why Use Autoregressive Generative Models for Unsupervised Learning?
13:00 Unsupervised Sentiment Neuron (Radford et al 2017)
14:11 GPT-1 (Radford et al 2018)
15:21 Zero-Shot Reading Comprehension
16:44 GPT-2: Zero-Shot Translation
18:15 Language Model Metalearning
19:23 GPT-3: Few-Shot Arithmetic
20:14 GPT-3: Few-Shot Word Unscrambling
20:36 GPT-3: General Few-Shot Learning
23:42 iGPT (Chen et al 2020): Can We Apply GPT to Images?
25:31 iGPT: Completions
26:24 iGPT: Feature Learning
32:20 Isn't Code Just Another Modality?
33:33 The HumanEval Dataset
36:00 The pass@k Metric (see the estimator sketch after this list)
36:59 Codex: Training Details
38:03 An Easy HumanEval Problem (pass@1 ≈ 0.9)
38:36 A Medium HumanEval Problem (pass@1 ≈ 0.17)
39:00 A Hard HumanEval Problem (pass@1 ≈ 0.005)
41:26 Calibrating Sampling Temperature for pass@k (see the sampling sketch after this list)
42:19 The Unreasonable Effectiveness of Sampling
43:17 Can We Approximate Sampling Against an Oracle?
45:52 Main Figure
46:53 Limitations
47:38 Conclusion
48:19 Acknowledgements
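
A note on the pass@k figures quoted in the chapter titles above: the Codex paper (Chen et al 2021) estimates pass@k by generating n samples per problem, counting the c samples that pass the unit tests, and computing the unbiased estimator 1 - C(n-c, k)/C(n, k). A minimal sketch in Python; the function name and the toy numbers are illustrative, not taken from the talk:

    import numpy as np

    def pass_at_k(n: int, c: int, k: int) -> float:
        # Unbiased pass@k estimator from Chen et al 2021:
        #   n = samples generated per problem
        #   c = samples that pass the unit tests
        #   k = samples we are allowed to submit
        # Computes 1 - C(n-c, k) / C(n, k) in a numerically stable form.
        if n - c < k:
            return 1.0  # every size-k subset contains a passing sample
        return float(1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

    # Toy usage: 200 samples per problem, 18 of which pass the tests.
    print(pass_at_k(n=200, c=18, k=1))   # 0.09
    print(pass_at_k(n=200, c=18, k=10))  # much higher with 10 tries

The stable product form avoids the huge factorials in the binomial coefficients; the two calls show why pass@k rises with k even when pass@1 is low.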
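The temperature-calibration chapter refers to a related Codex-paper observation: low temperatures concentrate probability mass on the model's top completion and work best for pass@1, while higher temperatures produce more diverse samples that win once k is large, so the best temperature grows with k. A minimal sketch of temperature-scaled softmax sampling, with illustrative names and numbers:

    import numpy as np

    def sample_with_temperature(logits, temperature, rng):
        # Divide logits by the temperature before the softmax:
        # T -> 0 approaches greedy decoding; larger T flattens the
        # distribution and increases sample diversity.
        scaled = np.asarray(logits) / max(temperature, 1e-8)
        probs = np.exp(scaled - scaled.max())  # numerically stable softmax
        probs /= probs.sum()
        return int(rng.choice(len(probs), p=probs))

    rng = np.random.default_rng(0)
    logits = [2.0, 1.0, 0.5, -1.0]
    print(sample_with_temperature(logits, 0.2, rng))  # near-greedy: best for pass@1
    print(sample_with_temperature(logits, 1.0, rng))  # more diverse: better at large k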


Stanford Online
