
Stanford Seminar - Transformers in Language: The Development of GPT Models Including GPT-3

Stanford University via YouTube

Overview

This course introduces unsupervised learning as an approach to natural language processing. We will examine a progression of generative models for text and language understanding, including 3-gram models, recurrent neural nets, the big LSTM, the Transformer, GPT-2, and GPT-3, along with findings such as the Unsupervised Sentiment Neuron and zero-shot reading comprehension. We will also explore applying GPT to images through IGPT, and turn to code generation with Codex, examining the HumanEval dataset and the pass@k metric for measuring the functional correctness of generated code. Finally, we will cover techniques for approximating sampling against an oracle, while considering the limitations and broader implications of these models.
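To make the starting point of this progression concrete, here is a minimal character-level 3-gram sampler in the spirit of Shannon's 1951 experiments. This is an illustrative sketch rather than code from the seminar: it estimates the distribution of the next character given the previous two from raw counts, then generates text by repeated sampling from a seed of at least two characters.

```python
import random
from collections import Counter, defaultdict

def train_trigram(text):
    """Count, for each two-character context, how often each next character follows."""
    counts = defaultdict(Counter)
    for a, b, c in zip(text, text[1:], text[2:]):
        counts[a + b][c] += 1
    return counts

def sample_trigram(counts, seed, length=200):
    """Generate text by repeatedly sampling the next character given the last two."""
    out = seed
    for _ in range(length):
        dist = counts.get(out[-2:])
        if not dist:
            break  # unseen context; a real model would back off or smooth
        chars, weights = zip(*dist.items())
        out += random.choices(chars, weights=weights)[0]
    return out
```

Text generated this way is locally plausible but incoherent beyond a few characters, which is exactly the gap the increasingly large neural models in the syllabus progressively close.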
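The pass@k metric also deserves a precise definition: it is the probability that at least one of k generated samples for a problem passes all of its unit tests. Computing it naively from exactly k samples has high variance, so the Codex paper (Chen et al 2021) draws n ≥ k samples per problem, counts the number c that pass, and uses the unbiased estimator 1 - C(n-c, k)/C(n, k). A minimal numerically stable sketch of that estimator:

```python
import numpy as np

def pass_at_k(n, c, k):
    """Unbiased estimator of pass@k: 1 - C(n-c, k) / C(n, k),
    where n samples were drawn and c of them passed all unit tests.
    Computed as a running product to avoid huge binomial coefficients."""
    if n - c < k:
        return 1.0  # every size-k subset must contain at least one passing sample
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))
```

For example, pass_at_k(n=200, c=20, k=1) evaluates to 0.1, matching the intuitive estimate c/n for k = 1; the estimator only diverges from naive counting when k > 1.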

Syllabus

Introduction.
3-Gram Model (Shannon 1951).
Recurrent Neural Nets (Sutskever et al 2011).
Big LSTM (Jozefowicz et al 2016).
Transformer (Liu and Saleh et al 2018).
GPT-2: Big Transformer (Radford et al 2019).
GPT-3: Very Big Transformer (Brown et al 2020).
GPT-3: Can Humans Detect Generated News Articles?
Why Unsupervised Learning?
Is there a Big Trove of Unlabeled Data?
Why Use Autoregressive Generative Models for Unsupervised Learning?
Unsupervised Sentiment Neuron (Radford et al 2017).
GPT-1 (Radford et al 2018).
Zero-Shot Reading Comprehension.
GPT-2: Zero-Shot Translation.
Language Model Metalearning.
GPT-3: Few Shot Arithmetic.
GPT-3: Few Shot Word Unscrambling.
GPT-3: General Few Shot Learning.
IGPT (Chen et al 2020): Can we apply GPT to images?
IGPT: Completions.
IGPT: Feature Learning.
Isn't Code Just Another Modality?
The HumanEval Dataset.
The Pass@k Metric.
Codex: Training Details.
An Easy HumanEval Problem (pass@1 ≈ 0.9).
A Medium HumanEval Problem (pass@1 ≈ 0.17).
A Hard HumanEval Problem (pass@1 ≈ 0.005).
Calibrating Sampling Temperature for Pass@k.
The Unreasonable Effectiveness of Sampling.
Can We Approximate Sampling Against an Oracle?
Main Figure.
Limitations.
Conclusion.
Acknowledgements.

Taught by

Stanford Online
