Deep Q Learning for Video Games - The Math of Intelligence #9

We're going to replicate DeepMind's Deep Q Learning algorithm for Super Mario Bros! This bot will be able to play a bunch of different video games by using reinforcement learning. This is the first video in this series that uses libraries (Keras & Gym) because if it didn't, the code would be way too long for a short video. I'll make a longer, in-depth version without libraries soon.

Code for this video:
https://github.com/llSourcell/deep_q_learning

Please Subscribe! And like. And comment. That's what keeps me going.

More learning resources:
https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-0-q-learning-with-tables-and-neural-networks-d195264329d0
http://pytorch.org/tutorials/intermediate/reinforcement_q_learning.html
http://neuro.cs.ut.ee/demystifying-deep-reinforcement-learning/
http://karpathy.github.io/2016/05/31/rl/
https://yanpanlau.github.io/2016/07/10/FlappyBird-Keras.html
https://keon.io/deep-q-learning/
http://www0.cs.ucl.ac.uk/staff/d.silver/web/Resources_files/deep_rl.pdf
http://mnemstudio.org/path-finding-q-learning-tutorial.htm

Join us in the Wizards Slack channel:
http://wizards.herokuapp.com/

And please support me on Patreon:
https://www.patreon.com/user?u=3191693
Follow me:
Twitter: https://twitter.com/sirajraval
Facebook: https://www.facebook.com/sirajology
Instagram: https://www.instagram.com/sirajraval/
Sign up for my newsletter for exciting updates in the field of AI:
https://goo.gl/FZzJ5w
Hit the Join button above to become a member of my channel for access to exclusive content!
Join my AI community: http://chatgptschool.io/
Sign up for my AI sports betting bot, WagerGPT! (500 spots available):
https://www.wagergpt.co

#deep q learning atari #deep q-learning #deep q network #deep q learning #deep q #deep q-learning algorithm #deep q-learning tutorial #deep learning game #q learning #deep q learning python #deep q learning tutorial #q-learning #deep reinforcement learning #deep q-learning with recurrent neural networks

it's damn hilarious. Keep going!

August 12, 2017

00:00:53 - 00:09:47

That picture of Siraj petting a llama at 1:19 should be the cover of his mixtape, "Unsupervised Learner". Great video, Siraj!

August 12, 2017

00:01:19 - 00:09:47

At 5:15 you say that the further in the future a reward is, the more uncertain we are of it? I didn't get it. Can you explain with an example?

August 12, 2017

00:05:15 - 00:09:47
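
A minimal sketch of the idea behind the question above, assuming the standard discounted-return formulation from reinforcement learning: a reward that arrives k steps in the future is scaled by gamma**k, so later (less certain) rewards count for less. The function name and the gamma value of 0.9 are illustrative assumptions, not taken from the video's code.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum the rewards, scaling the reward k steps ahead by gamma**k.

    gamma (hypothetical default 0.9) is the discount factor, between 0 and 1.
    """
    total = 0.0
    for k, reward in enumerate(rewards):
        total += (gamma ** k) * reward
    return total


# The same reward of 1.0 is worth less the later it arrives.
reward_now = discounted_return([1.0])              # weight gamma**0
reward_in_two_steps = discounted_return([0.0, 0.0, 1.0])  # weight gamma**2
print(reward_now, reward_in_two_steps)
```

With gamma close to 1 the agent values distant rewards almost as much as immediate ones; with gamma close to 0 it mostly cares about the next step.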

Well, I don't think the pooling layer is what makes the network insensitive to the locations of objects in an image. The convolutional layer can already do that, since the convolution operation is a pixel window sliding from location to location, at the set stride, until all locations are covered. The pooling layer is used to semantically merge similar features into one. In the max pooling example used in this video, the image is partitioned into 4 parts and the max number in each part is preserved. That max number can semantically represent a feature in that region. It's more like image compression that preserves the key features of the object in the image. Feeding this pooled image into the neural net could be more efficient.

August 12, 2017

00:07:46 - 00:09:47
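
The max pooling step described in the comment above can be sketched in plain Python. This is an illustration only, assuming a 4x4 input split into four non-overlapping 2x2 regions; the function name and the sample values are made up for the example and do not come from the video.

```python
def max_pool_2x2(image):
    """Keep the largest value from each non-overlapping 2x2 region."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - 1, 2):
        pooled_row = []
        for c in range(0, cols - 1, 2):
            window = [
                image[r][c], image[r][c + 1],
                image[r + 1][c], image[r + 1][c + 1],
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4x4 "image"; each quadrant collapses to its maximum.
image = [
    [1, 3, 2, 1],
    [4, 6, 5, 0],
    [7, 2, 9, 8],
    [1, 0, 3, 4],
]
pooled = max_pool_2x2(image)
print(pooled)  # [[6, 5], [7, 9]]
```

The output has a quarter as many values as the input, which is the compression effect the comment describes.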

At 8:08, what's the input_shape supposed to be? The challenge code and what you show are different.

August 12, 2017

00:08:08 - 00:09:47

I understand that a convolutional neural network can be used to simplify the state from an array of pixels to a smaller collection of values, but how does the algorithm use a deep network to approximate the Q-function?

August 12, 2017

00:08:19 - 00:09:47
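
One common way to answer the question above, sketched under the assumption of the standard Q-learning update: the network takes a state and outputs one Q-value per action, and it is trained so that the value of the action taken moves toward reward + gamma * max(next-state Q-values). The helper names and the sample numbers here are hypothetical; the list of Q-values stands in for a network's output, and this is not the code from the video's repository.

```python
def td_target(reward, next_q_values, done, gamma=0.99):
    """Training target for the Q-value of the action that was taken.

    next_q_values stands in for the network's output on the next state.
    """
    if done:
        # No future reward after the episode ends.
        return reward
    return reward + gamma * max(next_q_values)


def greedy_action(q_values):
    """Pick the index of the action with the highest predicted Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical network outputs for a game with three actions.
q_next = [0.5, 2.0, 1.0]
target = td_target(reward=1.0, next_q_values=q_next, done=False, gamma=0.9)
action = greedy_action([0.1, 0.7, 0.3])
print(target, action)
```

In a full agent, the squared difference between the target and the network's current prediction for the chosen action would serve as the loss.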

and it's only long? Autolike from me :)

August 12, 2017

00:09:46 - 00:09:47

At 13:12 you are showing what appears to be yet another medical dataset, "medalpaca/medical_meadow_mediqa", but it is unclear how that is used.

DoctorGPT: Offline & Passes Medical Exams!

August 13, 2023, Mark Woodworth

00:13:12 - 00:18:13

At 13:21, what exactly are you concatenating? You say "instruction column and input column into a single input" but the code references only the "question" column from the "GBaker/MedQA-USMLE-4-options" dataset. The question is then submitted for inference as-is, without being combined with anything as far as I can tell. Also, are the options (answer choices) and correct_answer_idx (multiple choice answer) used anywhere?

August 13, 2023, Mark Woodworth

00:13:21 - 00:13:12

At 18:13 you mention SFT with the base model, but the code appears to be using the chat model.

August 13, 2023, Mark Woodworth

00:18:13 - 00:38:49
