
"Transformer" - Recommended Video Picks

Note: The channel and video information shown on this site is retrieved and displayed using the official YouTube API.

Videos

Video List

Number of videos: 135

Create a Large Language Model from Scratch with Python – Tutorial

Create a Large Language Model from Scratch with Python – Tutorial

Learn how to build your own large language model, from scratch. This course goes into the data handling, math, and transformers behind large language models. You will use Python. ✏️ Course developed by @elliotarledge 💻 Code and course resources: https://github.com/Infatoshi/fcc-intro-to-llms Join Elliot's Discord server: https://discord.gg/pV7ByF9VNm Elliot on X: https://twitter.com/elliotarledge ⭐️ Contents ⭐️ (0:00:00) Intro (0:03:25) Install Libraries (0:06:24) Pylzma build tools (0:08:58) Jupyter Notebook (0:12:11) Download wizard of oz (0:14:51) Experimenting with text file (0:17:58) Character-level tokenizer (0:19:44) Types of tokenizers (0:20:58) Tensors instead of Arrays (0:22:37) Linear Algebra heads up (0:23:29) Train and validation splits (0:25:30) Premise of Bigram Model (0:26:41) Inputs and Targets (0:29:29) Inputs and Targets Implementation (0:30:10) Batch size hyperparameter (0:32:13) Switching from CPU to CUDA (0:33:28) PyTorch Overview (0:42:49) CPU vs GPU performance in PyTorch (0:47:49) More PyTorch Functions (1:06:03) Embedding Vectors (1:11:33) Embedding Implementation (1:13:06) Dot Product and Matrix Multiplication (1:25:42) Matmul Implementation (1:26:56) Int vs Float (1:29:52) Recap and get_batch (1:35:07) nnModule subclass (1:37:05) Gradient Descent (1:50:53) Logits and Reshaping (1:59:28) Generate function and giving the model some context (2:03:58) Logits Dimensionality (2:05:17) Training loop + Optimizer + Zerograd explanation (2:13:56) Optimizers Overview (2:17:04) Applications of Optimizers (2:18:11) Loss reporting + Train VS Eval mode (2:32:54) Normalization Overview (2:35:45) ReLU, Sigmoid, Tanh Activations (2:45:15) Transformer and Self-Attention (2:46:55) Transformer Architecture (3:17:54) Building a GPT, not Transformer model (3:19:46) Self-Attention Deep Dive (3:25:05) GPT architecture (3:27:07) Switching to Macbook (3:31:42) Implementing Positional Encoding (3:36:57) GPTLanguageModel initalization (3:40:52) GPTLanguageModel forward pass (3:46:56) Standard Deviation for model parameters (4:00:50) Transformer Blocks (4:04:54) FeedForward network (4:07:53) Multi-head Attention (4:12:49) Dot product attention (4:19:43) Why we scale by 1/sqrt(dk) (4:26:45) Sequential VS ModuleList Processing (4:30:47) Overview Hyperparameters (4:32:14) Fixing errors, refining (4:34:01) Begin training (4:35:46) OpenWebText download and Survey of LLMs paper (4:37:56) How the dataloader/batch getter will have to change (4:41:20) Extract corpus with winrar (4:43:44) Python data extractor (4:49:23) Adjusting for train and val splits (4:57:55) Adding dataloader (4:59:04) Training on OpenWebText (5:02:22) Training works well, model loading/saving (5:04:18) Pickling (5:05:32) Fixing errors + GPU Memory in task manager (5:14:05) Command line argument parsing (5:18:11) Porting code to script (5:22:04) Prompt: Completion feature + more errors (5:24:23) nnModule inheritance + generation cropping (5:27:54) Pretraining vs Finetuning (5:33:07) R&D pointers (5:44:38) Outro 🎉 Thanks to our Champion and Sponsor supporters: 👾 davthecoder 👾 jedi-or-sith 👾 南宮千影 👾 Agustín Kussrow 👾 Nattira Maneerat 👾 Heather Wcislo 👾 Serhiy Kalinets 👾 Justin Hual 👾 Otis Morgan -- Learn to code for free and get a developer job: https://www.freecodecamp.org Read hundreds of articles on programming: https://freecodecamp.org/news
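
The course builds a character-level bigram model first (0:25:30) and only later grows it into a GPT. As a rough companion sketch, not the course's exact code, here is a minimal bigram language model in PyTorch; the vocabulary size and the random batch are placeholders:

```python
# Minimal bigram language model sketch: each token's embedding row is read
# directly as the logits for the next token. vocab_size=65 and the random
# batch are dummy values for a smoke test, not the course's dataset.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BigramLanguageModel(nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        self.token_embedding = nn.Embedding(vocab_size, vocab_size)

    def forward(self, idx, targets=None):
        logits = self.token_embedding(idx)              # (B, T, vocab_size)
        loss = None
        if targets is not None:
            B, T, C = logits.shape
            loss = F.cross_entropy(logits.view(B * T, C), targets.view(B * T))
        return logits, loss

    @torch.no_grad()
    def generate(self, idx, max_new_tokens):
        for _ in range(max_new_tokens):
            logits, _ = self(idx)
            probs = F.softmax(logits[:, -1, :], dim=-1)  # distribution for the last position
            idx_next = torch.multinomial(probs, num_samples=1)
            idx = torch.cat((idx, idx_next), dim=1)
        return idx

model = BigramLanguageModel(vocab_size=65)
x = torch.randint(0, 65, (4, 8))                        # dummy batch of token ids
logits, loss = model(x, targets=x)
print(logits.shape, loss.item())
```
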
2023-08-26
00:00:00 - 05:43:41
Stanford CS25: V2 I Introduction to Transformers w/ Andrej Karpathy

Stanford CS25: V2 I Introduction to Transformers w/ Andrej Karpathy

January 10, 2023 Introduction to Transformers Andrej Karpathy: https://karpathy.ai/ Since their introduction in 2017, transformers have revolutionized Natural Language Processing (NLP). Now, transformers are finding applications all over Deep Learning, be it computer vision (CV), reinforcement learning (RL), Generative Adversarial Networks (GANs), Speech or even Biology. Among other things, transformers have enabled the creation of powerful language models like GPT-3 and were instrumental in DeepMind's recent AlphaFold2, that tackles protein folding. In this speaker series, we examine the details of how transformers work, and dive deep into the different kinds of transformers and how they're applied in different fields. We do this by inviting people at the forefront of transformers research across different domains for guest lectures. More about the course can be found here: https://web.stanford.edu/class/cs25/ View the entire CS25 Transformers United playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNiJRchCzutFw5ItR_Z27CM 0:00 Introduction 0:47 Introducing the Course 3:19 Basics of Transformers 3:35 The Attention Timeline 5:01 Prehistoric Era 6:10 Where we were in 2021 7:30 The Future 10:15 Transformers - Andrej Karpathy 10:39 Historical context 1:00:30 Thank you - Go forth and transform #Stanford #Stanford Online
2023-05-20
00:00:00 - 01:11:41
Deep Learning for Computer Vision with Python and TensorFlow – Complete Course

Deep Learning for Computer Vision with Python and TensorFlow – Complete Course

Learn the basics of computer vision with deep learning and how to implement the algorithms using Tensorflow. Author: Folefac Martins from Neuralearn.ai More Courses: www.neuralearn.ai Link to Code: https://colab.research.google.com/drive/18u1KDx-9683iZNPxSDZ6dOv9319ZuEC_ YouTube Channel: https://www.youtube.com/@neuralearn ⭐️ Contents ⭐️ Introduction ⌨️ (0:00:00) Welcome ⌨️ (0:05:54) Prerequisite ⌨️ (0:06:11) What we shall Learn Tensors and Variables ⌨️ (0:12:12) Basics ⌨️ (0:19:26) Initialization and Casting ⌨️ (1:07:31) Indexing ⌨️ (1:16:15) Maths Operations ⌨️ (1:55:02) Linear Algebra Operations ⌨️ (2:56:21) Common TensorFlow Functions ⌨️ (3:50:15) Ragged Tensors ⌨️ (4:01:41) Sparse Tensors ⌨️ (4:04:23) String Tensors ⌨️ (4:07:45) Variables Building Neural Networks with TensorFlow [Car Price Prediction] ⌨️ (4:14:52) Task Understanding ⌨️ (4:19:47) Data Preparation ⌨️ (4:54:47) Linear Regression Model ⌨️ (5:10:18) Error Sanctioning ⌨️ (5:24:53) Training and Optimization ⌨️ (5:41:22) Performance Measurement ⌨️ (5:44:18) Validation and Testing ⌨️ (6:04:30) Corrective Measures Building Convolutional Neural Networks with TensorFlow [Malaria Diagnosis] ⌨️ (6:28:50) Task Understanding ⌨️ (6:37:40) Data Preparation ⌨️ (6:57:40) Data Visualization ⌨️ (7:00:20) Data Processing ⌨️ (7:08:50) How and Why ConvNets Work ⌨️ (7:56:15) Building Convnets with TensorFlow ⌨️ (8:02:39) Binary Crossentropy Loss ⌨️ (8:10:15) Training Convnets ⌨️ (8:23:33) Model Evaluation and Testing ⌨️ (8:29:15) Loading and Saving Models to Google Drive Building More Advanced Models in Teno Convolutional Neural Networks with TensorFlow [Malaria Diagnosis] ⌨️ (8:47:10) Functional API ⌨️ (9:03:48) Model Subclassing ⌨️ (9:19:05) Custom Layers Evaluating Classification Models [Malaria Diagnosis] ⌨️ (9:36:45) Precision, Recall and Accuracy ⌨️ (10:00:35) Confusion Matrix ⌨️ (10:10:10) ROC Plots Improving Model Performance [Malaria Diagnosis] ⌨️ (10:18:10) TensorFlow Callbacks ⌨️ (10:43:55) Learning Rate Scheduling ⌨️ (11:01:25) Model Checkpointing ⌨️ (11:09:25) Mitigating Overfitting and Underfitting Data Augmentation [Malaria Diagnosis] ⌨️ (11:38:50) Augmentation with tf.image and Keras Layers ⌨️ (12:38:00) Mixup Augmentation ⌨️ (12:56:35) Cutmix Augmentation ⌨️ (13:38:30) Data Augmentation with Albumentations Advanced TensorFlow Topics [Malaria Diagnosis] ⌨️ (13:58:35) Custom Loss and Metrics ⌨️ (14:18:30) Eager and Graph Modes ⌨️ (14:31:23) Custom Training Loops Tensorboard Integration [Malaria Diagnosis] ⌨️ (14:57:00) Data Logging ⌨️ (15:29:00) View Model Graphs ⌨️ (15:31:45) Hyperparameter Tuning ⌨️ (15:52:40) Profiling and Visualizations MLOps with Weights and Biases [Malaria Diagnosis] ⌨️ (16:00:35) Experiment Tracking ⌨️ (16:55:02) Hyperparameter Tuning ⌨️ (17:17:15) Dataset Versioning ⌨️ (18:00:23) Model Versioning Human Emotions Detection ⌨️ (18:16:55) Data Preparation ⌨️ (18:45:38) Modeling and Training ⌨️ (19:36:42) Data Augmentation ⌨️ (19:54:30) TensorFlow Records Modern Convolutional Neural Networks [Human Emotions Detection] ⌨️ (20:31:25) AlexNet ⌨️ (20:48:35) VGGNet ⌨️ (20:59:50) ResNet ⌨️ (21:34:07) Coding ResNet from Scratch ⌨️ (21:56:17) MobileNet ⌨️ (22:20:43) EfficientNet Transfer Learning [Human Emotions Detection] ⌨️ (22:38:15) Feature Extraction ⌨️ (23:02:25) Finetuning Understanding the Blackbox [Human Emotions Detection] ⌨️ (23:15:33) Visualizing Intermediate Layers ⌨️ (23:36:20) Gradcam method Transformers in Vision [Human Emotions Detection] ⌨️ (23:57:35) Understanding ViTs ⌨️ (24:51:17) Building ViTs 
from Scratch ⌨️ (25:42:39) FineTuning Huggingface ViT ⌨️ (26:05:52) Model Evaluation with Wandb Model Deployment [Human Emotions Detection] ⌨️ (26:27:13) Converting TensorFlow Model to Onnx format ⌨️ (26:52:26) Understanding Quantization ⌨️ (27:13:08) Practical Quantization of Onnx Model ⌨️ (27:22:01) Quantization Aware Training ⌨️ (27:39:55) Conversion to TensorFlow Lite ⌨️ (27:58:28) How APIs work ⌨️ (28:18:28) Building an API with FastAPI ⌨️ (29:39:10) Deploying API to the Cloud ⌨️ (29:51:35) Load Testing with Locust Object Detection with YOLO ⌨️ (30:05:29) Introduction to Object Detection ⌨️ (30:11:39) Understanding YOLO Algorithm ⌨️ (31:15:17) Dataset Preparation ⌨️ (31:58:27) YOLO Loss ⌨️ (33:02:58) Data Augmentation ⌨️ (33:27:33) Testing Image Generation ⌨️ (33:59:28) Introduction to Image Generation ⌨️ (34:03:18) Understanding Variational Autoencoders ⌨️ (34:20:46) VAE Training and Digit Generation ⌨️ (35:06:05) Latent Space Visualization ⌨️ (35:21:36) How GANs work ⌨️ (35:43:30) The GAN Loss ⌨️ (36:01:38) Improving GAN Training ⌨️ (36:25:02) Face Generation with GANs Conclusion ⌨️ (37:15:45) What's Next
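
Several chapters above (binary crossentropy loss, training convnets) revolve around a small binary image classifier. A minimal tf.keras sketch of that kind of model; the input shape and layer sizes are assumptions, not the course's exact architecture:

```python
# A small binary-classification convnet in the spirit of the malaria-diagnosis
# model; shapes and layer widths are placeholders.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(224, 224, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # one output: infected vs. not
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",             # the loss covered at 8:02:39
              metrics=["accuracy"])
model.summary()
```
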
2023-06-06
00:00:00 - 37:16:41
Azure AI Fundamentals Certification 2024 (AI-900) - Full Course to PASS the Exam

Azure AI Fundamentals Certification 2024 (AI-900) - Full Course to PASS the Exam

Prepare for the Azure AI Fundamentals Certification (AI-900) and pass! This course has been updated for 2024. ✏️ Course developed by Andrew Brown of ExamPro. @ExamProChannel 🔗 ExamPro Cloud Obsessed Certification Training: https://www.exampro.co ⭐️ Contents ⭐️ ☁️ Introduction 🎤 (00:00:00) Introduction to AI-900 🎤 (00:08:18) Exam Guide Breakdown ☁️ ML Introduction 🎤 (00:12:51) Layers of Machine Learning 🎤 (00:13:59) Key Elements of AI 🎤 (00:14:57) DataSets 🎤 (00:16:37) Labeling 🎤 (00:17:43) Supervised and Unsupervised Reinforcement 🎤 (00:19:09) Netural Networks and Deep Learning 🎤 (00:21:25) GPU 🎤 (00:22:21) CUDA 🎤 (00:23:29) Simple ML Pipeline 🎤 (00:25:39) Forecast vs Prediction 🎤 (00:26:24) Metrics 🎤 (00:27:58) Juypter Notebooks 🎤 (00:29:13) Regression 🎤 (00:30:50) Classification 🎤 (00:31:44) Clustering 🎤 (00:32:29) Confusion Matrix ☁️ Common AI Workloads 🎤 (00:34:06) Anomaly Detection AI 🎤 (00:34:59) Computer Vision AI 🎤 (00:37:05) Natural Language Processing AI 🎤 (00:38:42) Conversational AI ☁️ Responsible AI 🎤 (00:40:16) Responsible AI 🎤 (00:41:09) Fairness 🎤 (00:42:08) Reliability and safety 🎤 (00:43:00) Privacy and security 🎤 (00:43:45) Inclusiveness 🎤 (00:44:24) Transparency 🎤 (00:45:00) Accountability 🎤 (00:45:45) Guidelines for Human AI Interaction 🎤 (00:46:04) Follow Along Guidelines for Human AI Interaction ☁️ Congitive Services 🎤 (00:57:33) Azure Cognitive Services 🎤 (00:59:41) Congitive API Key and Endpoint 🎤 (01:00:08) Knowledge Mining 🎤 (01:04:42) Face Service 🎤 (01:06:30) Speech and Translate Service 🎤 (01:08:04) Text Analytics 🎤 (01:11:02) OCR Computer Vision 🎤 (01:12:22) Form Recognizer 🎤 (01:14:48) Form Recognizer Custom Models 🎤 (01:15:34) Form Recognizer Prebuilt Models 🎤 (01:17:33) LUIS 🎤 (01:19:58) QnA Maker 🎤 (01:24:19) Azure Bot Service ☁️ ML Studio 🎤 (01:26:45) Azure Machine Learning Service 🎤 (01:28:10) Studio Overview 🎤 (01:29:39) Studio Compute 🎤 (01:30:48) Studio Data Labeling 🎤 (01:31:45) Data Stores 🎤 (01:32:34) Datasets 🎤 (01:33:44) Experiments 🎤 (01:34:16) Pipelines 🎤 (01:35:23) ML Designer 🎤 (01:36:07) Model Registry 🎤 (01:36:34) Endpoints 🎤 (01:37:50) Notebooks ☁️ AutoML 🎤 (01:38:41) Introduction to AutoML 🎤 (01:41:15) Data Guard Rails 🎤 (01:42:01) Automatic Featurization 🎤 (01:43:53) Model Selection 🎤 (01:44:57) Explanation 🎤 (01:45:51) Primary Metrics 🎤 (01:47:43) Validation Type ☁️ Custom Vision 🎤 (01:48:14) Introduction to Custom Vision 🎤 (01:48:58) Project Types and Domains 🎤 (01:51:54) Custom Vision Features ☁️ Features of generative AI solutions 🎤 (01:54:32) AI vs Generative AI 🎤 (01:57:17) What is a LLM Large Language Model 🎤 (01:58:58) Transformer models 🎤 (02:00:14) Tokenization 🎤 (02:01:26) Embeddings 🎤 (02:02:46) Positional encoding 🎤 (02:04:27) Attention ☁️ Capabilities of Azure OpenAI Service 🎤 (02:08:01) Introduction to Azure OpenAI Service 🎤 (02:10:29) Azure OpenAI Studio 🎤 (02:11:44) Azure OpenAI service pricing 🎤 (02:13:14) What are Copilots 🎤 (02:15:43) Prompt engineering 🎤 (02:18:51) Grounding 🎤 (02:20:36) Copilot demo ☁️ Follow Alongs 🎤 (02:24:04) Setup 🎤 (02:35:02) Computer Vision 🎤 (02:38:44) Custom Vision Classification 🎤 (02:45:22) Custom Vision Object Detection 🎤 (02:51:18) Face Service 🎤 (02:54:30) Form Recognizer 🎤 (02:58:01) OCR 🎤 (03:02:54) Text Analysis 🎤 (03:06:37) QnAMaker 🎤 (03:25:11) LUIS 🎤 (03:30:56) AutoML 🎤 (03:48:13) Designer 🎤 (03:58:31) MNIST 🎤 (04:18:10) Data Labeling 🎤 (04:22:38) Clean up
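
The generative-AI section (tokenization, embeddings, positional encoding, attention) is covered conceptually for the exam; as an illustration only, a small NumPy sketch of sinusoidal positional encoding, with assumed sizes:

```python
# Sinusoidal positional encoding sketch: even dimensions get sine, odd get
# cosine, at frequencies that decay with dimension. Sizes are arbitrary.
import numpy as np

def positional_encoding(seq_len, d_model):
    pos = np.arange(seq_len)[:, None]                  # (seq_len, 1)
    i = np.arange(d_model)[None, :]                    # (1, d_model)
    angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])
    pe[:, 1::2] = np.cos(angle[:, 1::2])
    return pe

print(positional_encoding(seq_len=8, d_model=16).shape)  # (8, 16)
```
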
2024-02-21
00:00:00 - 04:23:51
Stanford CS224N: NLP with Deep Learning | Winter 2021 | Lecture 1 - Intro & Word Vectors

Stanford CS224N: NLP with Deep Learning | Winter 2021 | Lecture 1 - Intro & Word Vectors

For more information about Stanford's Artificial Intelligence professional and graduate programs visit: https://stanford.io/3w46jar This lecture covers: 1. The course (10 min) 2. Human language and word meaning (15 min) 3. Word2vec algorithm introduction (15 min) 4. Word2vec objective function gradients (25 min) 5. Optimization basics (5 min) 6. Looking at word vectors (10 min or less) Key learning: The (really surprising!) result that word meaning can be represented rather well by a large vector of real numbers. This course will teach: 1. The foundations of the effective modern methods for deep learning applied to NLP. Basics first, then key methods used in NLP: recurrent networks, attention, transformers, etc. 2. A big picture understanding of human languages and the difficulties in understanding and producing them 3. An understanding of and ability to build systems (in PyTorch) for some of the major problems in NLP: word meaning, dependency parsing, machine translation, question answering. To learn more about this course visit: https://online.stanford.edu/courses/cs224n-natural-language-processing-deep-learning To follow along with the course schedule and syllabus visit: http://web.stanford.edu/class/cs224n/ Professor Christopher Manning Thomas M. Siebel Professor in Machine Learning, Professor of Linguistics and of Computer Science Director, Stanford Artificial Intelligence Laboratory (SAIL) 0:00 Introduction 1:43 Goals 3:10 Human Language 10:07 Google Translate 10:43 GPT 14:13 Meaning 16:19 Wordnet 19:11 Word Relationships 20:27 Distributional Semantics 23:33 Word Embeddings 27:31 Word2vec 37:55 How to minimize loss 39:55 Interactive whiteboard 41:10 Gradient 48:50 Chain Rule #Natural language #Natural Language Processing #Deep Learning #Stanford AI Lectures #Stanford Graduate courses #Computer science #language understanding #Stanford #Stanford Online
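
For reference, the Word2vec skip-gram objective whose gradients the lecture derives can be stated as follows (a standard formulation, not copied from the slides): minimize the average negative log-likelihood of context words within a window of size m, with the softmax probability built from "outside" vectors u and "center" vectors v.

```latex
J(\theta) = -\frac{1}{T}\sum_{t=1}^{T}\ \sum_{\substack{-m \le j \le m \\ j \neq 0}} \log P(w_{t+j} \mid w_t;\ \theta),
\qquad
P(o \mid c) = \frac{\exp(u_o^{\top} v_c)}{\sum_{w \in V} \exp(u_w^{\top} v_c)}
```
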
2021-10-29
00:00:00 - 01:24:27
Stanford CS25: V3 I Retrieval Augmented Language Models

Stanford CS25: V3 I Retrieval Augmented Language Models

December 5, 2023 Douwe Kiela, Contextual AI Language models have led to amazing progress, but they also have important shortcomings. One solution for many of these shortcomings is retrieval augmentation. I will introduce the topic, survey recent literature on retrieval augmented language models and finish with some of the main open questions. More about the course can be found here: https://web.stanford.edu/class/cs25/ View the entire CS25 Transformers United playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNiJRchCzutFw5ItR_Z27CM #Stanford #Stanford Online
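
A minimal sketch of the retrieval-augmentation loop the talk surveys: embed the question, pull the most similar stored passages, and condition the language model on them. The `embed` and `generate` callables are stand-ins for a real embedding model and LLM, not any specific library:

```python
# Toy retrieval-augmented generation: cosine-similarity retrieval over
# precomputed passage vectors, then prompt assembly. embed() and generate()
# are assumed, user-supplied functions.
import numpy as np

def retrieve(query_vec, passage_vecs, passages, k=3):
    sims = passage_vecs @ query_vec / (
        np.linalg.norm(passage_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9)
    top = np.argsort(-sims)[:k]
    return [passages[i] for i in top]

def answer(question, passages, passage_vecs, embed, generate):
    context = "\n".join(retrieve(embed(question), passage_vecs, passages))
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return generate(prompt)
```
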
2024-01-26
00:00:00 - 01:19:27
Stanford CS25: V1 I Transformers United: DL Models that have revolutionized NLP, CV, RL

Stanford CS25: V1 I Transformers United: DL Models that have revolutionized NLP, CV, RL

Since their introduction in 2017, transformers have revolutionized Natural Language Processing (NLP). Now, transformers are finding applications all over Deep Learning, be it computer vision (CV), reinforcement learning (RL), Generative Adversarial Networks (GANs), Speech or even Biology. Among other things, transformers have enabled the creation of powerful language models like GPT-3 and were instrumental in DeepMind's recent AlphaFold2, which tackles protein folding. In this speaker series, we examine the details of how transformers work, and dive deep into the different kinds of transformers and how they're applied in different fields. We do this by inviting people at the forefront of transformers research across different domains for guest lectures. More about the course can be found here: https://web.stanford.edu/class/cs25/ View the entire CS25 Transformers United playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNiJRchCzutFw5ItR_Z27CM 0:00 Introduction 2:43 Overview of Transformers 6:03 Attention mechanisms 7:53 Self attention 11:38 Other necessary ingredients 13:32 Encoder Decoder Architecture 16:02 Advantages & Disadvantages 18:04 Applications of Transformers #Stanford #Stanford Online #Transformers #Deep Learning #Reinforcement Learning #Computer Vision #AI #Artificial Intelligence
2022-07-09
00:00:00 - 00:22:44
Stanford CS224N NLP with Deep Learning | 2023 | Lecture 8 - Self-Attention and Transformers

Stanford CS224N NLP with Deep Learning | 2023 | Lecture 8 - Self-Attention and Transformers

For more information about Stanford's Artificial Intelligence professional and graduate programs visit: https://stanford.io/ai This lecture covers: 1. From recurrence (RNN) to attention-based NLP models 2. The Transformer model 3. Great results with Transformers 4. Drawbacks and variants of Transformers To learn more about this course visit: https://online.stanford.edu/courses/c... To follow along with the course schedule and syllabus visit: http://web.stanford.edu/class/cs224n/ John Hewitt https://nlp.stanford.edu/~johnhew/ Professor Christopher Manning Thomas M. Siebel Professor in Machine Learning, Professor of Linguistics and of Computer Science Director, Stanford Artificial Intelligence Laboratory (SAIL) #naturallanguageprocessing #deeplearning #Stanford #Stanford Online
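
A minimal single-head self-attention function of the kind the lecture derives, written in PyTorch; masking and multiple heads are omitted for brevity, and the shapes are placeholders:

```python
# Single-head scaled dot-product self-attention over one sequence.
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """x: (T, d_model); w_q/w_k/w_v: (d_model, d_k) projection matrices."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / (k.shape[-1] ** 0.5)   # scaled dot products, (T, T)
    weights = F.softmax(scores, dim=-1)       # one attention distribution per position
    return weights @ v                        # (T, d_k)

T, d_model, d_k = 5, 16, 8
x = torch.randn(T, d_model)
out = self_attention(x, torch.randn(d_model, d_k),
                     torch.randn(d_model, d_k), torch.randn(d_model, d_k))
print(out.shape)  # torch.Size([5, 8])
```
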
2023-09-20
00:00:00 - 01:17:04
Generative Python Transformer p.1 - Acquiring Raw Data

Generative Python Transformer p.1 - Acquiring Raw Data

A quest to teach neural networks, via transformers, to write Python code. Project name: Generative Python Transformers! Neural Networks from Scratch book: https://nnfs.io Channel membership: https://www.youtube.com/channel/UCfzlCWGWYyIQ0aLC5w48gBQ/join Discord: https://discord.gg/sentdex Reddit: https://www.reddit.com/r/sentdex/ Support the content: https://pythonprogramming.net/support-donate/ Twitter: https://twitter.com/sentdex Instagram: https://instagram.com/sentdex Facebook: https://www.facebook.com/pythonprogramming.net/ Twitch: https://www.twitch.tv/sentdex #python #programming
2021-05-03
00:00:00 - 00:36:10
Complete Natural Language Processing (NLP) Tutorial in Python! (with examples)

Complete Natural Language Processing (NLP) Tutorial in Python! (with examples)

In this video we go through the major concepts in natural language processing using Python libraries! We use examples to help drill down the concepts. There is content in this video for all skill levels (beginners to experts). I originally recorded this video for the PyCon Conference. GitHub repo: https://github.com/KeithGalli/pycon2020 Patreon: https://www.patreon.com/keithgalli YT Membership: https://www.youtube.com/c/KGMIT/membership Some of the topics we cover: - Bag-of-words - Word vectors - Stemming/Lemmatization - Spell correction - Transformer Architecture (Attention is all you need) - State of the art models (OpenAI GPT, BERT) Some of the libraries used: - sklearn - spaCy - NLTK - TextBlob Hope you enjoy & let me know if you have any questions! Make sure to subscribe if you haven't already :). ------------------------- Follow me on social media! Instagram | https://www.instagram.com/keithgalli/ Twitter | https://twitter.com/keithgalli If you are curious to learn how I make my tutorials, check out this video: https://youtu.be/LEO4igyXbLs Practice your Python Pandas data science skills with problems on StrataScratch! https://stratascratch.com/?via=keith Join the Python Army to get access to perks! YouTube - https://www.youtube.com/channel/UCq6XkhO5SZ66N04IcPbqNcw/join Patreon - https://www.patreon.com/keithgalli *I use affiliate links on the products that I recommend. I may earn a purchase commission or a referral bonus from the usage of these links. ------------------------- Song at the end good morning by Amine Maxwell https://soundcloud.com/aminemaxwell Creative Commons — Attribution 3.0 Unported — CC BY 3.0 Free Download / Stream: http://bit.ly/2vpruoY Music promoted by Audio Library https://youtu.be/SQWFdnbzlgI ------------------------- Video Timeline! ~~ NLP Fundamentals ~~ 0:00 - Announcements! 1:12 - Video overview & timeline 3:06 - Bag of words (BOW) overview 4:42 - Bag of words example code! (sklearn | CountVectorizer, fit_transform) 11:20 - Building a text classification model using bag-of-words (SVM) 14:07 - Predicting new utterances classes using our model (transform) 16:02 - Unigram, bigram, ngrams (using consecutive words in your model) 19:28 - Word vectors overview 23:27 - Word vectors example code! (Using spaCy library) 28:10 - Building a text classification model using word vectors 34:04 - Predicting new utterances using our model ~~ Miscellaneous NLP Techniques ~~ 40:42 - Regexes (pattern matching) in Python. 52:30 - Stemming/Lemmatization in Python (text normalization w/ NLTK library) 1:01:17 - Stopwords Removal (removing most common words from sentences) 1:05:56 - Various other techniques (spell correction, sentiment analysis, part-of-speech tagging). ~~ State-of-the-art Models ~~ 1:12:45 - Recurrent Neural Networks (RNNs) for text classification 1:17:00 - Transformer architectures (attention is all you need) 1:21:00 - Writing Python code to leverage transformers (BERT | spacy-transformers) 1:25:00 - Writing a classification model using transformers/BERT 1:29:37 - Fine-tuning transformer models 1:31:16 - Bring it all together and build a high performance model to classify the categories of Amazon reviews! 
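
A condensed sketch of the bag-of-words + SVM workflow from the first part of the video (CountVectorizer, fit_transform, then transform on new utterances); the toy training texts below are placeholders, not the video's data:

```python
# Bag-of-words text classification with scikit-learn: count features + linear SVM.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

train_texts = ["i love this book", "this book is great",
               "horrible acting", "terrible movie"]
train_labels = ["books", "books", "movies", "movies"]

vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)   # learn vocabulary, build count matrix

clf = LinearSVC()
clf.fit(X_train, train_labels)

X_new = vectorizer.transform(["an awful film", "a wonderful story to read"])
print(clf.predict(X_new))                          # e.g. ['movies' 'books']
```
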
#Keith Galli #python #programming #python 3 #data science #data analysis #python programming #NLP #machine learning #ML #AI #artificial intelligence #natural language processing #hugging face #huggingface #pytorch #spell correction #stemming #lemmatization #openai gpt #gpt-2 #BERT #transformer architecture #attention is all you need #sklearn #scikit-learn #python3 #NLP in python #text analysis #text generation #state of the art #sota #data engineering #software development #data #datasets
2022-03-17
00:00:00 - 01:37:46
[Deep Learning] Transformer - Understanding Multi-Head Attention Once and for All [The World of Deep Learning vol.28] #106 #VRアカデミア #DeepLearning

[Deep Learning] Transformer - Understanding Multi-Head Attention Once and for All [The World of Deep Learning vol.28] #106 #VRアカデミア #DeepLearning

A complete explanation of the Transformer's model structure and the math behind it. I covered it at a level of detail you probably won't find anywhere else. In the end, it all comes down to matrices and dot products. Isn't that amazing? Note: in this video, the matrix transpose is written with a small "t" at the upper left. ☆Announcement☆ The official AIcia Solid Project website is now live!!! https://sites.google.com/view/aicia-official/top The site introduces us and our video content and publishes the whiteboard data, so please make use of it!! ▼Related videos The short version for busy people is here → https://www.youtube.com/watch?v=FFoLqib6u-0 Multi-Head Attention starts at 15:27! The World of Deep Learning https://www.youtube.com/playlist?list=PLhDAH9aTfnxKXf__soUoAEOrbLAOnVHCP Natural Language Processing series https://www.youtube.com/playlist?list=PLhDAH9aTfnxL4XdCRjUCC0_flR00A6tJR ▼Table of contents To be added after release! ▼References Vaswani, Ashish, et al. "Attention is all you need." arXiv preprint arXiv:1706.03762 (2017). https://arxiv.org/abs/1706.03762 The original paper! The math is somewhat dense, but you should be able to read it after watching this video. It describes the concerns of the time and the Transformer's selling points. (Personally, this AI thinks a paper title should be a summary of the content, not an opinion or impression.) [2019 edition] A chronological summary of representative NLP models and algorithms - Qiita https://qiita.com/LeftLetter/items/14b8f10b0ee98aa181b7 I use this as a reference for many of these videos. ▼Closing Thanks for watching! If you enjoyed it, please like and subscribe. Questions and comments about the video are welcome in the comment section or on Twitter! For work or collaboration requests, please use Twitter DM. Video production: AIcia Solid (Twitter: https://twitter.com/AIcia_Solid/ ) Video editing: AIris Solid (little sister) (Twitter: https://twitter.com/AIris_Solid/ ) ======= Logo: TEICA ( https://twitter.com/T_E_I_C_A ) Model: http://3d.nicovideo.jp/works/td44519 Model by: W01fa ( https://twitter.com/W01fa )
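
For reference, the Multi-Head Attention mechanism the video walks through is defined in the cited paper ("Attention is all you need") as:

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right) V
\qquad
\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\, W^{O},
\quad \mathrm{head}_i = \mathrm{Attention}(Q W_i^{Q},\ K W_i^{K},\ V W_i^{V})
```
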
2021-07-02
00:00:00 - 00:39:35
The Battle of Giants: Causal AI vs NLP

The Battle of Giants: Causal AI vs NLP

With over a dozen new papers accepted at NeurIPS 2023, causal inference has exploded in popularity, attracting a large amount of talent and interest from top researchers and institutions including industry giants like Amazon and Microsoft. Text data, with its high complexity, poses an exciting challenge for the causal inference community. In the presentation, we'll review the latest advances in Causal NLP and implement a causal Transformer model to demonstrate how to translate these developments into a practical solution that can bring real business value. All in Python! Key Takeaways: 1. Find out more about the most recent breakthroughs in the world of Causal NLP 2. Learn how to de-confound text using BERT, a neural network-based technique for language processing. 3. Understand the benefits, the applications, and the fundamentals of Causal AI
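
A minimal sketch of the first step such a causal-NLP pipeline needs: turning raw text into BERT representations (the de-confounding model itself is not shown). This uses the Hugging Face transformers library; the model name and mean pooling are assumptions:

```python
# Encode text with BERT and pool token vectors into one embedding per document.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

texts = ["The product arrived late but works well."]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state   # (batch, tokens, 768)
embeddings = hidden.mean(dim=1)                  # simple mean pooling per text
print(embeddings.shape)                          # torch.Size([1, 768])
```
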
2024-04-18
00:00:00 - 01:07:03
Computer Vision and Perception for Self-Driving Cars (Deep Learning Course)

Computer Vision and Perception for Self-Driving Cars (Deep Learning Course)

Learn about Computer Vision and Perception for Self Driving Cars. This series focuses on the different tasks that a Self Driving Car Perception unit would be required to do. ✏️ Course by Robotics with Sakshay. https://www.youtube.com/channel/UC57lEMTXZzXYu_y0FKdW6xA ⭐️ Course Contents and Links ⭐️ ⌨️ (0:00:00) Introduction ⌨️ (0:02:16) Fully Convolutional Network | Road Segmentation 🔗 Kaggle Dataset: https://www.kaggle.com/sakshaymahna/kittiroadsegmentation 🔗 Kaggle Notebook: https://www.kaggle.com/sakshaymahna/fully-convolutional-network 🔗 KITTI Dataset: http://www.cvlibs.net/datasets/kitti/ 🔗 Fully Convolutional Network Paper: https://arxiv.org/abs/1411.4038 🔗 Hand Crafted Road Segmentation: https://www.youtube.com/watch?v=hrin-qTn4L4 🔗 Deep Learning and CNNs: https://www.youtube.com/watch?v=aircAruvnKk ⌨️ (0:20:45) YOLO | 2D Object Detection 🔗 Kaggle Competition/Dataset: https://www.kaggle.com/c/3d-object-detection-for-autonomous-vehicles 🔗 Visualization Notebook: https://www.kaggle.com/sakshaymahna/lyft-3d-object-detection-eda 🔗 YOLO Notebook: https://www.kaggle.com/sakshaymahna/yolov3-keras-2d-object-detection 🔗 Playlist on Fundamentals of Object Detection: https://www.youtube.com/playlist?list=PL_IHmaMAvkVxdDOBRg2CbcJBq9SY7ZUvs 🔗 Blog on YOLO: https://www.section.io/engineering-education/introduction-to-yolo-algorithm-for-object-detection/ 🔗 YOLO Paper: https://arxiv.org/abs/1506.02640 ⌨️ (0:35:51) Deep SORT | Object Tracking 🔗 Dataset: https://www.kaggle.com/sakshaymahna/kittiroadsegmentation 🔗 Notebook/Code: https://www.kaggle.com/sakshaymahna/deepsort/notebook 🔗 Blog on Deep SORT: https://medium.com/analytics-vidhya/object-tracking-using-deepsort-in-tensorflow-2-ec013a2eeb4f 🔗 Deep SORT Paper: https://arxiv.org/abs/1703.07402 🔗 Kalman Filter: https://www.youtube.com/playlist?list=PLn8PRpmsu08pzi6EMiYnR-076Mh-q3tWr 🔗 Hungarian Algorithm: https://www.geeksforgeeks.org/hungarian-algorithm-assignment-problem-set-1-introduction/ 🔗 Cosine Distance Metric: https://www.machinelearningplus.com/nlp/cosine-similarity/ 🔗 Mahalanobis Distance: https://www.machinelearningplus.com/statistics/mahalanobis-distance/ 🔗 YOLO Algorithm: https://youtu.be/C3qmhPVUXiE ⌨️ (0:52:37) KITTI 3D Data Visualization | Homogenous Transformations 🔗 Dataset: https://www.kaggle.com/garymk/kitti-3d-object-detection-dataset 🔗 Notebook/Code: https://www.kaggle.com/sakshaymahna/lidar-data-visualization/notebook 🔗 LIDAR: https://geoslam.com/what-is-lidar/ 🔗 Tesla doesn't use LIDAR: https://towardsdatascience.com/why-tesla-wont-use-lidar-57c325ae2ed5 ⌨️ (1:06:45) Multi Task Attention Network (MTAN) | Multi Task Learning 🔗 Dataset: https://www.kaggle.com/sakshaymahna/cityscapes-depth-and-segmentation 🔗 Notebook/Code: https://www.kaggle.com/sakshaymahna/mtan-multi-task-attention-network 🔗 Data Visualization: https://www.kaggle.com/sakshaymahna/exploratory-data-analysis 🔗 MTAN Paper: https://arxiv.org/abs/1803.10704 🔗 Blog on Multi Task Learning: https://ruder.io/multi-task/ 🔗 Image Segmentation and FCN: https://youtu.be/U_v0Tovp4XQ ⌨️ (1:20:58) SFA 3D | 3D Object Detection 🔗 Dataset: https://www.kaggle.com/garymk/kitti-3d-object-detection-dataset 🔗 Notebook/Code: https://www.kaggle.com/sakshaymahna/sfa3d 🔗 Data Visualization: https://www.kaggle.com/sakshaymahna/l... 
🔗 Data Visualization Video: https://youtu.be/tb1H42kE0eE 🔗 SFA3D GitHub Repository: https://github.com/maudzung/SFA3D 🔗 Feature Pyramid Networks: https://jonathan-hui.medium.com/understanding-feature-pyramid-networks-for-object-detection-fpn-45b227b9106c 🔗 Keypoint Feature Pyramid Network: https://arxiv.org/pdf/2001.03343.pdf 🔗 Heat Maps: https://en.wikipedia.org/wiki/Heat_map 🔗 Focal Loss: https://medium.com/visionwizard/understanding-focal-loss-a-quick-read-b914422913e7 🔗 L1 Loss: https://afteracademy.com/blog/what-are-l1-and-l2-loss-functions 🔗 Balanced L1 Loss: https://paperswithcode.com/method/balanced-l1-loss 🔗 Learning Rate Decay: https://medium.com/analytics-vidhya/learning-rate-decay-and-methods-in-deep-learning-2cee564f910b 🔗 Cosine Annealing: https://paperswithcode.com/method/cosine-annealing ⌨️ (1:40:24) UNetXST | Camera to Bird's Eye View 🔗 Dataset: https://www.kaggle.com/sakshaymahna/semantic-segmentation-bev 🔗 Dataset Visualization: https://www.kaggle.com/sakshaymahna/data-visualization 🔗 Notebook/Code: https://www.kaggle.com/sakshaymahna/unetxst 🔗 UNetXST Paper: https://arxiv.org/pdf/2005.04078.pdf 🔗 UNetXST Github Repository: https://github.com/ika-rwth-aachen/Cam2BEV 🔗 UNet: https://towardsdatascience.com/understanding-semantic-segmentation-with-unet-6be4f42d4b47 🔗 Image Transformations: https://kevinzakka.github.io/2017/01/10/stn-part1/ 🔗 Spatial Transformer Networks: https://kevinzakka.github.io/2017/01/18/stn-part2/
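
Most of the detection and tracking sections above lean on intersection over union (IoU) to match predicted boxes with ground truth; a minimal helper, assuming boxes in (x1, y1, x2, y2) format:

```python
# Intersection over union for two axis-aligned boxes given as (x1, y1, x2, y2).
def iou(box_a, box_b):
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # ~0.143
```
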
2022-01-27
00:00:00 - 01:59:38
Stanford CS25: V1 I Mixture of Experts (MoE) paradigm and the Switch Transformer

Stanford CS25: V1 I Mixture of Experts (MoE) paradigm and the Switch Transformer

In deep learning, models typically reuse the same parameters for all inputs. Mixture of Experts (MoE) defies this and instead selects different parameters for each incoming example. The result is a sparsely-activated model -- with outrageous numbers of parameters -- but a constant computational cost. However, despite several notable successes of MoE, widespread adoption has been hindered by complexity, communication costs and training instability -- we address these with the Switch Transformer. We simplify the MoE routing algorithm and design intuitive improved models with reduced communication and computational costs. Our proposed training techniques help wrangle the instabilities and we show large sparse models may be trained, for the first time, with lower precision formats. We design models based off T5-Base and T5-Large to obtain up to 7x increases in pre-training speed with the same computational resources. These improvements extend into multilingual settings where we measure gains over the mT5-Base version across all 101 languages. Finally, we advance the current scale of language models by pre-training up to trillion parameter models on the "Colossal Clean Crawled Corpus" and achieve a 4x speedup over the T5-XXL model. Barret Zoph is a research scientist on the Google Brain team. He has worked on a variety of deep learning research topics ranging from neural architecture search (NAS), data augmentation, semi-supervised learning for computer vision and model sparsity. Prior to Google Brain he worked at the Information Sciences Institute working on machine translation. Irwan Bello is a research scientist on the Google Brain team. His research interests primarily lie in modeling, scaling and designing layers that process structured information while trading off scalability and inductive biases. View the entire CS25 Transformers United playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNiJRchCzutFw5ItR_Z27CM #Stanford #Stanford Online #MoE #Mixture of Experts #Switch Transformer
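
A toy sketch of the top-1 ("switch") routing idea described above: a learned router sends each token to a single expert, so parameter count grows while per-token compute stays roughly constant. Sizes are arbitrary and the capacity limits and load-balancing loss from the paper are omitted:

```python
# Toy top-1 mixture-of-experts layer: route each token to its highest-probability
# expert and scale the expert output by the routing probability.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top1MoE(nn.Module):
    def __init__(self, d_model, n_experts):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            [nn.Linear(d_model, d_model) for _ in range(n_experts)])

    def forward(self, x):                              # x: (tokens, d_model)
        probs = F.softmax(self.router(x), dim=-1)      # routing probabilities
        top_p, top_idx = probs.max(dim=-1)             # one expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e
            if mask.any():
                # Scaling by the router probability keeps routing differentiable.
                out[mask] = expert(x[mask]) * top_p[mask].unsqueeze(-1)
        return out

moe = Top1MoE(d_model=16, n_experts=4)
print(moe(torch.randn(10, 16)).shape)                  # torch.Size([10, 16])
```
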
2022-07-15
00:00:00 - 01:05:44
Stanford CS224N: NLP with Deep Learning | Winter 2019 | Lecture 14 – Transformers and Self-Attention

Stanford CS224N: NLP with Deep Learning | Winter 2019 | Lecture 14 – Transformers and Self-Attention

For more information about Stanford’s Artificial Intelligence professional and graduate programs, visit: https://stanford.io/3niIw41 Professor Christopher Manning, Stanford University, Ashish Vaswani & Anna Huang, Google http://onlinehub.stanford.edu/ Professor Christopher Manning Thomas M. Siebel Professor in Machine Learning, Professor of Linguistics and of Computer Science Director, Stanford Artificial Intelligence Laboratory (SAIL) To follow along with the course schedule and syllabus, visit: http://web.stanford.edu/class/cs224n/index.html#schedule 0:00 Introduction 2:07 Learning Representations of Variable Length Data 2:28 Recurrent Neural Networks 4:51 Convolutional Neural Networks? 14:06 Attention is Cheap! 16:05 Attention head: Who 16:26 Attention head: Did What? 16:35 Multihead Attention 17:34 Machine Translation: WMT-2014 BLEU 19:07 Frameworks 19:31 Importance of Residuals 23:26 Non-local Means 26:18 Image Transformer Layer 30:56 Raw representations in music and language 37:52 Attention: a weighted average 40:08 Closer look at relative attention 42:41 A Jazz sample from Music Transformer 44:42 Convolutions and Translational Equivariance 45:12 Relative positions Translational Equivariance 50:21 Sequential generation breaks modes. 50:32 Active Research Area #naturallanguageprocessing #deeplearning #Stanford #Stanford Online #NLP #Deep Learning #AI #CS224N
2019-03-22
00:00:00 - 00:53:48
[Deep Learning] Attention - How the Attention Mechanism, Applied Across Every Domain, Delivers Top Accuracy [The World of Deep Learning vol. 24] #095 #VRアカデミア #DeepLearning

[Deep Learning] Attention - How the Attention Mechanism, Applied Across Every Domain, Delivers Top Accuracy [The World of Deep Learning vol. 24] #095 #VRアカデミア #DeepLearning

▼Theme This video explains RNNsearch, the network that is the ancestor of the Attention mechanism, which went on to see explosive use in the Transformer and BERT. Attention not only produced GPT-3's monstrous accuracy in natural language, it is also applied across extremely broad areas such as images and generative models. An element you can't leave out when talking about today's Deep Learning. Check it out! ▼Related playlists The World of Deep Learning https://www.youtube.com/playlist?list=PLhDAH9aTfnxKXf__soUoAEOrbLAOnVHCP Natural Language Processing series https://www.youtube.com/playlist?list=PLhDAH9aTfnxL4XdCRjUCC0_flR00A6tJR ▼Table of contents (To be added later. Please wait a little while.) ▼References Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." arXiv preprint arXiv:1409.0473 (2014). https://arxiv.org/abs/1409.0473 The original paper! It covers the history of the time and is not written in an overly difficult way, so it may be worth reading! [2019 edition] A chronological summary of representative NLP models and algorithms - Qiita https://qiita.com/LeftLetter/items/14b8f10b0ee98aa181b7 I use this as a reference for many of these videos. ▼Related videos RNN video → https://www.youtube.com/watch?v=NJdrYvYgaPM&list=PLhDAH9aTfnxKXf__soUoAEOrbLAOnVHCP&index=8 GRU video → https://www.youtube.com/watch?v=K8ktkhAEuLM&list=PLhDAH9aTfnxKXf__soUoAEOrbLAOnVHCP&index=10 Three ways to use an RNN (for anyone who didn't quite follow the BiGRU part) → https://www.youtube.com/watch?v=IcCIu5Gx6uA&list=PLhDAH9aTfnxKXf__soUoAEOrbLAOnVHCP&index=9 Bi-LSTM video (a relative of Bi-GRU) → https://www.youtube.com/watch?v=O1PCh_aaprE&list=PLhDAH9aTfnxKXf__soUoAEOrbLAOnVHCP&index=12 ▼Closing Thanks for watching! If you enjoyed it, please like and subscribe. Questions and comments about the video are welcome in the comment section or on Twitter! For work or collaboration requests, please use Twitter DM. Video production: AIcia Solid (Twitter: https://twitter.com/AIcia_Solid/ ) Video editing: AIris Solid (little sister) (Twitter: https://twitter.com/AIris_Solid/ ) ======= Logo: TEICA ( https://twitter.com/T_E_I_C_A ) Model: http://3d.nicovideo.jp/works/td44519 Model by: W01fa ( https://twitter.com/W01fa )
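
For reference, the additive attention that RNNsearch introduces in the cited Bahdanau et al. paper computes a context vector for decoder step i from encoder states h_j as:

```latex
e_{ij} = a(s_{i-1}, h_j), \qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}, \qquad
c_i = \sum_{j=1}^{T_x} \alpha_{ij}\, h_j
```
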
2021-03-26
00:00:00 - 00:36:37
Stanford CS25: V3 I Beyond LLMs: Agents, Emergent Abilities, Intermediate-Guided Reasoning, BabyLM

Stanford CS25: V3 I Beyond LLMs: Agents, Emergent Abilities, Intermediate-Guided Reasoning, BabyLM

November 28, 2023 Steven Feng, Stanford University Div Garg, Stanford University Karan Singh, Stanford University In this talk, we will explore cutting-edge topics in the realm of AI, particularly focusing on going beyond a single monolithic Large Language Model (LLM) to Autonomous Agentic AI Systems, as well as discussing the emergent abilities of LLMs as they scale up. Further, there is discussion about different approaches for LLM intermediate-guided reasoning: methods of breaking down the reasoning process for text generation to arrive at a final answer (e.g. into a series of steps, such as chain-of-thought). Additionally, the talk will delve into a concept known as BabyLM, aimed at creating small yet highly efficient language models that can learn on similar amounts of training data as human children. This talk will not only highlight the technical aspects of these developments but also discuss the ethical implications and future prospects of AI in our increasingly digital world. More about the course can be found here: https://web.stanford.edu/class/cs25/ View the entire CS25 Transformers United playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNiJRchCzutFw5ItR_Z27CM #Stanford #Stanford Online
2023-12-16
00:00:00 - 01:00:14
Stanford CS25: V1 I Transformers in Language: The development of GPT Models, GPT3

Stanford CS25: V1 I Transformers in Language: The development of GPT Models, GPT3

While the Transformer architecture is used in a variety of applications across a number of domains, it first found success in natural language. Today, Transformers remain the de facto model in language - they achieve state-of-the-art results on most natural language benchmarks, and can generate text coherent enough to deceive human readers. In this talk, we will review recent progress in neural language modeling, discuss the link between generating text and solving downstream tasks, and explore how this led to the development of GPT models at OpenAI. Next, we’ll see how the same approach can be used to produce generative models and strong representations in other domains like images, text-to-image, and code. Finally, we will dive into the recently released code generating model, Codex, and examine this particularly interesting domain of study. Mark Chen is a research scientist at OpenAI, where he manages the Algorithms Team. His research interests include generative modeling and representation learning, especially in the image and multimodal domains. Prior to OpenAI, Mark worked in high frequency trading and graduated from MIT. Mark is also a coach for the USA Computing Olympiad team. View the entire CS25 Transformers United playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNiJRchCzutFw5ItR_Z27CM 0:00 Introduction 0:08 3-Gram Model (Shannon 1951) 0:27 Recurrent Neural Nets (Sutskever et al 2011) 1:12 Big LSTM (Jozefowicz et al 2016) 1:52 Transformer (Llu and Saleh et al 2018) 2:33 GPT-2: Big Transformer (Radford et al 2019) 3:38 GPT-3: Very Big Transformer (Brown et al 2019) 5:12 GPT-3: Can Humans Detect Generated News Articles? 9:09 Why Unsupervised Learning? 10:38 Is there a Big Trove of Unlabeled Data? 11:11 Why Use Autoregressive Generative Models for Unsupervised Learnin 13:00 Unsupervised Sentiment Neuron (Radford et al 2017) 14:11 Radford et al 2018) 15:21 Zero-Shot Reading Comprehension 16:44 GPT-2: Zero-Shot Translation 18:15 Language Model Metalearning 19:23 GPT-3: Few Shot Arithmetic 20:14 GPT-3: Few Shot Word Unscrambling 20:36 GPT-3: General Few Shot Learning 23:42 IGPT (Chen et al 2020): Can we apply GPT to images? 25:31 IGPT: Completions 26:24 IGPT: Feature Learning 32:20 Isn't Code Just Another Modality? 33:33 The HumanEval Dataset 36:00 The Pass @ K Metric 36:59 Codex: Training Details 38:03 An Easy Human Eval Problem (pass@1 -0.9) 38:36 A Medium HumanEval Problem (pass@1 -0.17) 39:00 A Hard HumanEval Problem (pass@1 -0.005) 41:26 Calibrating Sampling Temperature for Pass@k 42:19 The Unreasonable Effectiveness of Sampling 43:17 Can We Approximate Sampling Against an Oracle? 45:52 Main Figure 46:53 Limitations 47:38 Conclusion 48:19 Acknowledgements #gpt3 #Stanford #GPT Models #GPT3 #OpenAI #Stanford Online #Transformers #AI #Artificial intelligence
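
The pass@k metric discussed around 36:00 is commonly computed with the unbiased estimator from the Codex paper: sample n completions per problem, count the c that pass the tests, and estimate the chance that at least one of k samples passes. A small sketch:

```python
# Unbiased pass@k estimator: 1 - C(n - c, k) / C(n, k).
from math import comb

def pass_at_k(n, c, k):
    """Probability that at least one of k sampled completions passes."""
    if n - c < k:            # every size-k subset must contain a passing sample
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(n=100, c=5, k=1))   # 0.05
print(pass_at_k(n=100, c=5, k=10))  # ~0.42
```
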
2022-07-12
00:00:00 - 00:48:39
Stanford CS25: V3 I Low-level Embodied Intelligence w/ Foundation Models

Stanford CS25: V3 I Low-level Embodied Intelligence w/ Foundation Models

October 10, 2023 Low-level Embodied Intelligence with Foundation Models Fei Xia, Google DeepMind This talk introduces two novel approaches to low-level embodied intelligence through integrating large language models (LLMs) with robotics, focusing on "Language to Reward" and "Robotics Transformer-2". The former employs LLMs to generate reward code, creating a bridge between high-level language instructions and low-level robotic actions. This method allows for real-time user interaction, efficiently controlling robotic arms for various tasks and outperforming baseline methodologies. "Robotics Transformer-2" integrates advanced vision-language models with robotic control by co-fine-tuning on robotic trajectory data and extensive web-based vision-language tasks, resulting in the robust RT-2 model which exhibits strong generalization capabilities. This approach allows robots to execute untrained commands and efficiently perform multi-stage semantic reasoning tasks, exemplifying significant advancements in contextual understanding and response to user commands. These projects demonstrate that language models can extend beyond their conventional domain of high-level reasoning tasks, playing a crucial role not only in interpreting and generating instructions but also in the nuanced generation of low-level robotic actions. More about the course can be found here: https://web.stanford.edu/class/cs25/ View the entire CS25 Transformers United playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNiJRchCzutFw5ItR_Z27CM #Stanford #Stanford Online
2023-12-09
00:00:00 - 01:18:14
Stanford CS25: V1 I Transformers in Vision: Tackling problems in Computer Vision

Stanford CS25: V1 I Transformers in Vision: Tackling problems in Computer Vision

In this talk, Lucas discusses some of the ways transformers have been applied to problems in Computer Vision. Lucas Beyer grew up in Belgium wanting to make video games and their AI, went on to study mechanical engineering at RWTH Aachen in Germany, did a PhD in robotic perception/computer vision there too, and is now researching representation learning at Google Brain in Zürich. View the entire CS25 Transformers United playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNiJRchCzutFw5ItR_Z27CM #computervision #Stanford #Stanford Online #Google Brain #Computer Vision #AI #Artificial Intelligence
2022-07-13
00:00:00 - 01:08:37
Introduction to Deep Learning: Attention

Introduction to Deep Learning: Attention

This video explains Attention, a basic neural network structure that is becoming very popular in Deep Learning alongside Convolutional Neural Networks. Previous video: "Compacting Neural Networks through Quantization" https://www.youtube.com/watch?v=qpd9I8m1bOA Introduction to Deep Learning: Basics of Neural Network Design https://www.youtube.com/watch?v=O3qm6qZooP0 Introduction to Deep Learning: What are Recurrent Neural Networks? https://www.youtube.com/watch?v=yvqgQZIUAKg Introduction to Deep Learning: Understanding LSTM (Long short-term memory) without Equations https://www.youtube.com/watch?v=unE_hofrYrk Playlist "Introduction to Deep Learning" https://www.youtube.com/playlist?list=PLg1wtJlhfh23pjdFv4p8kOBYyTRvzseZ3 Playlist "Practical Deep Learning" https://www.youtube.com/playlist?list=PLg1wtJlhfh20zNXqPYhQXU6-m5SoN-4Eu Playlist "Deep Learning Accuracy Improvement Techniques" https://www.youtube.com/playlist?list=PLg1wtJlhfh216rnmSv_oEDuchRjgUqxBi Neural Network Console https://dl.sony.com/ja/ Neural Network Libraries https://nnabla.org/ja/ Squeeze-and-Excitation Networks Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Enhua Wu https://arxiv.org/abs/1709.01507 Attention Is All You Need Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin https://arxiv.org/abs/1706.03762 BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova https://arxiv.org/abs/1810.04805 #Deep Learning #Neural Network #Neural Network Console #Neural Network Libraries #Sony #AI #深層学習 #ディープラーニング #ニュールネットワーク #ソニー #人工知能
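
As a concrete example of the channel-attention idea the video covers, here is a minimal Squeeze-and-Excitation block in PyTorch, following the Hu et al. paper cited above; the reduction ratio is an assumption:

```python
# Squeeze-and-Excitation block: global-average-pool each channel ("squeeze"),
# pass through a small bottleneck MLP with a sigmoid ("excitation"), and rescale.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                      # x: (N, C, H, W)
        squeezed = x.mean(dim=(2, 3))          # squeeze -> (N, C)
        weights = self.fc(squeezed)            # per-channel attention weights
        return x * weights[:, :, None, None]   # rescale each channel

se = SEBlock(channels=64)
print(se(torch.randn(2, 64, 8, 8)).shape)      # torch.Size([2, 64, 8, 8])
```
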
2020-01-23
00:00:00 - 00:15:39
Stanford CS25: V1 I Decision Transformer: Reinforcement Learning via Sequence Modeling

Stanford CS25: V1 I Decision Transformer: Reinforcement Learning via Sequence Modeling

We introduce a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem. This allows us to draw upon the simplicity and scalability of the Transformer architecture, and associated advances in language modeling such as GPT-x and BERT. In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling. Unlike prior approaches to RL that fit value functions or compute policy gradients, Decision Transformer simply outputs the optimal actions by leveraging a causally masked Transformer. By conditioning an autoregressive model on the desired return (reward), past states, and actions, our Decision Transformer model can generate future actions that achieve the desired return. Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art model-free offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks. Aditya Grover is a research scientist in the Core ML team at Facebook AI Research, a visiting postdoctoral researcher at UC Berkeley, and an incoming assistant professor of computer science at UCLA. His research centers around foundations of probabilistic machine learning for unsupervised representation learning and sequential decision making, and is grounded in applications at the intersection of physical sciences and climate change. His research has been recognized with a best paper award, several research fellowships (Microsoft Research, Lieberman, Google-Simons-Berkeley, Adobe), a best undergraduate thesis award, and the ACM SIGKDD doctoral dissertation award. He is also a recipient of the Gores Award -- Stanford's highest university-level distinction in teaching for faculty and students. Aditya received his PhD from Stanford and bachelors from IIT Delhi, both in computer science. View the entire CS25 Transformers United playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNiJRchCzutFw5ItR_Z27CM #reinforcementlearning #Stanford #Stanford Online #Artificial Intelligence #AI #Reinforcement Learning #Sequence Modeling
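
A small sketch of the return-to-go conditioning described in the abstract: each timestep is paired with the sum of future rewards, so at generation time the model can be prompted with the return you want it to achieve. The reward list below is a placeholder:

```python
# Compute returns-to-go for a trajectory: rtg[t] = sum of rewards from t to the end.
def returns_to_go(rewards):
    rtg, running = [0.0] * len(rewards), 0.0
    for t in reversed(range(len(rewards))):
        running += rewards[t]
        rtg[t] = running
    return rtg

rewards = [0.0, 1.0, 0.0, 2.0]
print(returns_to_go(rewards))   # [3.0, 3.0, 2.0, 2.0]
```
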
2022-07-14
00:00:00 - 01:20:43
Stanford CS25: V2 I Robotics and Imitation Learning

Stanford CS25: V2 I Robotics and Imitation Learning

February 7, 2023 Robotics and Imitation Learning Ted Xiao In this speaker series, we examine the details of how transformers work, and dive deep into the different kinds of transformers and how they're applied in different fields. We do this by inviting people at the forefront of transformers research across different domains for guest lectures. More about the course can be found here: https://web.stanford.edu/class/cs25/ View the entire CS25 Transformers United playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNiJRchCzutFw5ItR_Z27CM #Stanford #Stanford Online
2023-05-24
00:00:00 - 01:16:08
[Deep Learning] GPT - The Start of a Legend: The Paradigm Shift of Pre-training and Fine-tuning [The World of Deep Learning vol.31] #109 #VRアカデミア #DeepLearning

[Deep Learning] GPT - The Start of a Legend: The Paradigm Shift of Pre-training and Fine-tuning [The World of Deep Learning vol.31] #109 #VRアカデミア #DeepLearning

This is the first of the GPT series that went on to GPT-2, GPT-3, and beyond. I think it is one of the studies that cemented the pre-training and fine-tuning paradigm! ☆Announcement☆ The official AIcia Solid Project website is now live!!! https://sites.google.com/view/aicia-official/top The site introduces us and our video content and publishes the whiteboard data, so please make use of it!! ▼Related videos The Transformer video is here! https://www.youtube.com/watch?v=50XvMaWhiTY Short version for busy people → https://www.youtube.com/watch?v=FFoLqib6u-0 The World of Deep Learning https://www.youtube.com/playlist?list=PLhDAH9aTfnxKXf__soUoAEOrbLAOnVHCP Natural Language Processing series https://www.youtube.com/playlist?list=PLhDAH9aTfnxL4XdCRjUCC0_flR00A6tJR ▼Table of contents To be added after release! ▼References Radford, Alec, et al. "Improving language understanding by generative pre-training." (2018). https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf The original paper! From the days when researchers still reacted to it as ordinary work. A precious era, in a way! [2019 edition] A chronological summary of representative NLP models and algorithms - Qiita https://qiita.com/LeftLetter/items/14b8f10b0ee98aa181b7 I use this as a reference for many of these videos. ▼Closing Thanks for watching! If you enjoyed it, please like and subscribe. Questions and comments about the video are welcome in the comment section or on Twitter! For work or collaboration requests, please use Twitter DM. Video production: AIcia Solid (Twitter: https://twitter.com/AIcia_Solid/ ) Video editing: AIris Solid (little sister) (Twitter: https://twitter.com/AIris_Solid/ ) ======= Logo: TEICA ( https://twitter.com/T_E_I_C_A ) Model: http://3d.nicovideo.jp/works/td44519 Model by: W01fa ( https://twitter.com/W01fa )
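
For reference, the cited GPT paper formulates the paradigm as a language-modeling objective for pre-training, with the LM loss kept as an auxiliary term during fine-tuning:

```latex
L_1(\mathcal{U}) = \sum_i \log P\!\left(u_i \mid u_{i-k}, \ldots, u_{i-1};\ \Theta\right)
\quad \text{(pre-training)}, \qquad
L_3(\mathcal{C}) = L_2(\mathcal{C}) + \lambda\, L_1(\mathcal{C})
\quad \text{(fine-tuning)}
```
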
2021-07-24
00:00:00 - 00:25:30
[Deep Learning] Transformer and Multi-Head Attention for Busy People [The World of Deep Learning vol.29] #107 #VRアカデミア #DeepLearning

[Deep Learning] Transformer and Multi-Head Attention for Busy People [The World of Deep Learning vol.29] #107 #VRアカデミア #DeepLearning

The full version, for people who aren't busy, is here! → https://youtube.com/watch?v=50XvMaWhiTY This video explains the Transformer and its core component, Multi-Head Attention, at a fairly brisk pace for busy people. In the under-ten-minutes category (?), I think it is the deepest (clearest?) walkthrough of the math! (This video may be just right for reviewing the full version.) Note: in this video, the matrix transpose is written with a small "t" at the upper left. ☆Announcement☆ The official AIcia Solid Project website is now live!!! https://sites.google.com/view/aicia-official/top The site introduces us and our video content and publishes the whiteboard data, so please make use of it!! ▼Related videos The World of Deep Learning https://www.youtube.com/playlist?list=PLhDAH9aTfnxKXf__soUoAEOrbLAOnVHCP Natural Language Processing series https://www.youtube.com/playlist?list=PLhDAH9aTfnxL4XdCRjUCC0_flR00A6tJR ▼Table of contents To be added after release! ▼References Vaswani, Ashish, et al. "Attention is all you need." arXiv preprint arXiv:1706.03762 (2017). https://arxiv.org/abs/1706.03762 The original paper! The math is somewhat dense, but you should be able to read it after watching this video. It describes the concerns of the time and the Transformer's selling points. (Personally, this AI thinks a paper title should be a summary of the content, not an opinion or impression.) [2019 edition] A chronological summary of representative NLP models and algorithms - Qiita https://qiita.com/LeftLetter/items/14b8f10b0ee98aa181b7 I use this as a reference for many of these videos. ▼Closing Thanks for watching! If you enjoyed it, please like and subscribe. Questions and comments about the video are welcome in the comment section or on Twitter! For work or collaboration requests, please use Twitter DM. Video production: AIcia Solid (Twitter: https://twitter.com/AIcia_Solid/ ) Video editing: AIris Solid (little sister) (Twitter: https://twitter.com/AIris_Solid/ ) ======= Logo: TEICA ( https://twitter.com/T_E_I_C_A ) Model: http://3d.nicovideo.jp/works/td44519 Model by: W01fa ( https://twitter.com/W01fa )
2021-07-09
00:00:00 - 00:06:58
Stanford CS224N NLP with Deep Learning | Winter 2021 | Lecture 9 - Self- Attention and Transformers

Stanford CS224N NLP with Deep Learning | Winter 2021 | Lecture 9 - Self- Attention and Transformers

For more information about Stanford's Artificial Intelligence professional and graduate programs visit: https://stanford.io/3CvTOGY This lecture covers: 1. Impact of Transformers on NLP (and ML more broadly) 2. From Recurrence (RNNs) to Attention-Based NLP Models 3. Understanding the Transformer Model 4. Drawbacks and Variants of Transformers To learn more about this course visit: https://online.stanford.edu/courses/cs224n-natural-language-processing-deep-learning To follow along with the course schedule and syllabus visit: http://web.stanford.edu/class/cs224n/ John Hewitt PhD student in Computer Science at Stanford University Professor Christopher Manning Thomas M. Siebel Professor in Machine Learning, Professor of Linguistics and of Computer Science Director, Stanford Artificial Intelligence Laboratory (SAIL) #deeplearning #naturallanguageprocessing #Stanford #NLP #AI #Deep Learning #CS224N
2021-10-29
00:00:00 - 01:16:57
Generative Python Transformer p.2 - Raw Data Cleaning

Generative Python Transformer p.2 - Raw Data Cleaning

Removing non python files for our generative python transformer model's training data. Neural Networks from Scratch book: https://nnfs.io Channel membership: https://www.youtube.com/channel/UCfzlCWGWYyIQ0aLC5w48gBQ/join Discord: https://discord.gg/sentdex Reddit: https://www.reddit.com/r/sentdex/ Support the content: https://pythonprogramming.net/support-donate/ Twitter: https://twitter.com/sentdex Instagram: https://instagram.com/sentdex Facebook: https://www.facebook.com/pythonprogramming.net/ Twitch: https://www.twitch.tv/sentdex #python #programming
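
A minimal sketch of this cleaning step: walk the raw corpus directory and keep only the .py files. The "repos" directory name is a placeholder for wherever the cloned repositories live:

```python
# Split a raw corpus directory into Python files to keep and everything else to drop.
from pathlib import Path

def split_corpus(raw_dir):
    """Return (python_files, other_files) found under raw_dir."""
    files = [p for p in Path(raw_dir).rglob("*") if p.is_file()]
    py_files = [p for p in files if p.suffix == ".py"]
    other_files = [p for p in files if p.suffix != ".py"]
    return py_files, other_files

kept, dropped = split_corpus("repos")   # hypothetical clone directory
print(f"keeping {len(kept)} python files, dropping {len(dropped)} others")
```
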
2021-05-08
00:00:00 - 00:24:41
Stanford CS25: V3 I How I Learned to Stop Worrying and Love the Transformer

Stanford CS25: V3 I How I Learned to Stop Worrying and Love the Transformer

November 7, 2023 Ashish Vaswani Ashish will present the motivations behind the Transformer and how it's evolved over the years. He will conclude with a few useful research directions. Ashish Vaswani is a computer scientist working in deep learning, who is known for his significant contributions to the field of artificial intelligence (AI) and natural language processing (NLP). He is one of the co-authors of the seminal paper "Attention is All You Need" which introduced the Transformer model. He was also a co-founder of Adept AI Labs and a former staff research scientist at Google Brain. More about the course can be found here: https://web.stanford.edu/class/cs25/ View the entire CS25 Transformers United playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNiJRchCzutFw5ItR_Z27CM #Stanford #Stanford Online
2024-01-18
00:00:00 - 01:20:38
How do I encode categorical features using scikit-learn?

How do I encode categorical features using scikit-learn?

In order to include categorical features in your Machine Learning model, you have to encode them numerically using "dummy" or "one-hot" encoding. But how do you do this correctly using scikit-learn? In this video, you'll learn how to use OneHotEncoder and ColumnTransformer to encode your categorical features and prepare your feature matrix in a single step. You'll also learn how to include this step within a Pipeline so that you can cross-validate your model and preprocessing steps simultaneously. Finally, you'll learn why you should use scikit-learn (rather than pandas) for preprocessing your dataset. AGENDA: 0:00 Introduction 0:22 Why should you use a Pipeline? 2:30 Preview of the lesson 3:35 Loading and preparing a dataset 6:11 Cross-validating a simple model 10:00 Encoding categorical features with OneHotEncoder 15:01 Selecting columns for preprocessing with ColumnTransformer 19:00 Creating a two-step Pipeline 19:54 Cross-validating a Pipeline 21:44 Making predictions on new data 23:43 Recap of the lesson 24:50 Why should you use scikit-learn (rather than pandas) for preprocessing? CODE FROM THIS VIDEO: https://github.com/justmarkham/scikit-learn-videos/blob/master/10_categorical_features.ipynb WANT TO JOIN MY NEXT LIVE WEBCAST? Become a member ($5/month): https://www.patreon.com/dataschool === RELATED RESOURCES === OneHotEncoder documentation: https://scikit-learn.org/stable/modules/preprocessing.html#preprocessing-categorical-features ColumnTransformer documentation: https://scikit-learn.org/stable/modules/compose.html#columntransformer-for-heterogeneous-data Pipeline documentation: https://scikit-learn.org/stable/modules/compose.html#pipeline My video on cross-validation: https://www.youtube.com/watch?v=6dbrR-WymjI&list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A&index=7 My video on grid search: https://www.youtube.com/watch?v=Gol_qOgRqfA&list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A&index=8 My lesson notebook on StandardScaler: https://nbviewer.jupyter.org/github/justmarkham/DAT8/blob/master/notebooks/19_advanced_sklearn.ipynb === WANT TO GET BETTER AT MACHINE LEARNING? === 1) WATCH my scikit-learn video series: https://www.youtube.com/playlist?list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A 2) SUBSCRIBE for more videos: https://www.youtube.com/dataschool?sub_confirmation=1 3) ENROLL in my Machine Learning course: https://www.dataschool.io/learn/ 4) LET'S CONNECT! - Newsletter: https://www.dataschool.io/subscribe/ - Twitter: https://twitter.com/justmarkham - Facebook: https://www.facebook.com/DataScienceSchool/ - LinkedIn: https://www.linkedin.com/in/justmarkham/ #python #data science #machine learning #scikit-learn
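
A condensed version of the workflow the video walks through: encode the categorical columns with OneHotEncoder inside a ColumnTransformer, chain it with a model in a Pipeline, and cross-validate the preprocessing and model together. The toy DataFrame below is a placeholder standing in for the video's dataset:

```python
# One-hot encode categorical columns and cross-validate preprocessing + model as one unit.
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({
    "Embarked": ["S", "C", "S", "Q", "C", "S"],
    "Sex": ["male", "female", "female", "male", "male", "female"],
    "Fare": [7.25, 71.28, 8.05, 8.46, 15.50, 26.55],
    "Survived": [0, 1, 1, 0, 0, 1],
})
X, y = df.drop(columns="Survived"), df["Survived"]

preprocess = make_column_transformer(
    (OneHotEncoder(handle_unknown="ignore"), ["Embarked", "Sex"]),  # encode categoricals
    remainder="passthrough",                                        # keep numeric columns
)
pipe = make_pipeline(preprocess, LogisticRegression())
print(cross_val_score(pipe, X, y, cv=2).mean())
```
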
2019-11-13
00:00:00 - 00:27:59
Stanford CS25: V1 I Transformer Circuits, Induction Heads, In-Context Learning

Stanford CS25: V1 I Transformer Circuits, Induction Heads, In-Context Learning

"Neural network parameters can be thought of as compiled computer programs. Somehow, they encode sophisticated algorithms, capable of things no human knows how to write a computer program to do. Mechanistic interpretability seeks to reverse engineer neural networks into human understandable algorithms. Previous work has tended to focus on vision models; this talk will explore how we might reverse engineer transformer language models.  In particular, we'll focus on what we call ""induction head circuits"", a mechanism that appears to be significantly responsible for in-context learning. Using a pair of attention heads, these circuits allow models to repeat text from earlier in the context, translate text seen earlier, mimic functions from examples earlier in the context, and much more. The discovery of induction heads in the learning process appears to drive a sharp phase change, creating a bump in the loss curve, pivoting models learning trajectories, and greatly increasing their capacity for in-context learning, in the span of just a few hundred training steps." Chris Olah is a co-founder of Anthropic, an AI company focused on the safety of large models, where he leads Anthropic's interpretability efforts. Previously, Chris led OpenAI's interpretability team, and was a researcher at Google Brain. Chris' work includes the Circuits project, his blog (especially his tutorial on LSTMs), the Distill journal, and DeepDream. View the entire CS25 Transformers United playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNiJRchCzutFw5ItR_Z27CM #Stanford #Stanford Online #transformer #AI #Artificial Intelligence #induction heads
2022-07-18
00:00:00 - 00:59:34