Backpropagation in 5 Minutes (tutorial)

Let's discuss the math behind backpropagation. We'll go over the 3 terms from calculus you need to understand it (derivatives, partial derivatives, and the chain rule) and implement it programmatically.

Code for this video:
https://github.com/llSourcell/how_to_do_math_for_deep_learning

Please subscribe! And like. And comment. That's what keeps me going.

I've used this code in a previous video. I had to keep the code as simple as possible in order to fit in these mathematical explanations and keep the video at around 5 minutes.
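For reference, here are the three calculus facts the video leans on, stated compactly (standard definitions, not quotes from the slides):

```latex
% Derivative: instantaneous rate of change of f at x.
f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}

% Partial derivative: rate of change in one variable, holding the others fixed.
\frac{\partial f}{\partial x}(x, y) = \lim_{h \to 0} \frac{f(x+h, y) - f(x, y)}{h}

% Chain rule: derivative of a composition -- the engine of backpropagation.
(f \circ g)'(x) = f'(g(x)) \, g'(x)
```

Backpropagation is the chain rule applied repeatedly, layer by layer, from the loss back to each weight.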

More Learning resources:
https://mihaiv.wordpress.com/2010/02/08/backpropagation-algorithm/
http://outlace.com/Computational-Graph/
http://briandolhansky.com/blog/2013/9/27/artificial-neural-networks-backpropagation-part-4
https://jeremykun.com/2012/12/09/neural-networks-and-backpropagation/
https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/

Join us in the Wizards Slack channel:
http://wizards.herokuapp.com/

And please support me on Patreon:
https://www.patreon.com/user?u=3191693

I forgot to add my patron shoutout at the end, so special thanks to Patrons Tim Jiang, HG Oh, Hoang, Advait Shinde, Vijay Daniel & Umesh Rangasamy.
Follow me:
Twitter:
Facebook: https://www.facebook.com/sirajology
Instagram: https://www.instagram.com/sirajraval/
Signup for my newsletter for exciting updates in the field of AI:
https://goo.gl/FZzJ5w
Hit the Join button above to sign up to become a member of my channel for access to exclusive content!
Join my AI community: http://chatgptschool.io/
Sign up for my AI sports betting bot, WagerGPT (500 spots available):
https://www.wagergpt.co

#backpropagation #back propagation #backpropagation example #back propagation neural network #backpropagation in neural networks #backpropagation algorithm #back propagation algorithm in neural network #neural network backpropagation #backpropagation explained #back propagation algorithm
Why is the seed not 42 as it is supposed to be!?

2017-04-03, 00:00:22 - 00:05:29

At 00:00:38, shouldn't there be 3 inputs, one for each of the features including the bias? If I understand correctly, the number of input-layer neurons is equal to the number of features in our data. Am I correct?

2017-04-03, 00:00:38 - 00:05:29

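One common way to reconcile the two views in that question is to treat the bias as an extra input that is always 1. A minimal numpy sketch (illustrative shapes, not necessarily the video's exact code):

```python
import numpy as np

# 4 examples with 2 real features each (hypothetical data).
X = np.array([[0., 0.],
              [0., 1.],
              [1., 0.],
              [1., 1.]])

# Fold the bias in as a third input column of ones, so a network with
# "3 inputs" is really 2 features + 1 bias.
X_with_bias = np.hstack([X, np.ones((X.shape[0], 1))])
print(X_with_bias.shape)  # (4, 3)
```

Whether the bias shows up as an extra input neuron or as a separate bias vector is an implementation choice; the math is the same.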
for o =
for i = 1:n

2017-04-03, 00:01:02 - 00:02:40

I'm a bit lost at the dot product example at 00:01:21. Why are we multiplying a row of inputs on the same input node by a column of different weights? Wouldn't the value of each node in the next layer be based on a column of inputs (the current value of each node) times the weight of each node's connection to the next one? How many output nodes are there in this equation?

2017-04-03, 00:01:21 - 00:05:29

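The row-times-column convention follows from how the matrices are laid out: each row of the input matrix holds one example, and each column of the weight matrix holds the weights feeding one neuron in the next layer, so there is one output column per next-layer neuron. A hedged numpy sketch (shapes are illustrative, not taken from the video):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

np.random.seed(0)
X = np.random.rand(4, 3)   # 4 examples (rows), 3 input features (columns)
W = np.random.randn(3, 2)  # column j = weights into hidden neuron j

# Row i of X (one example's inputs) dotted with column j of W (one
# neuron's incoming weights) gives neuron j's pre-activation for example i.
layer1 = sigmoid(np.dot(X, W))
print(layer1.shape)  # (4, 2): one row per example, one column per neuron
```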
At 00:01:42, shouldn't layer1 be a scalar value, since you are performing the dot product? How do you then use it to calculate layer2?

2017-04-03, 00:01:42 - 00:05:29

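The likely answer: numpy's dot only returns a scalar for two 1-D vectors; for the 2-D arrays used in this style of code it performs a full matrix product, so layer1 stays a matrix that can feed the next layer. A quick check:

```python
import numpy as np

a = np.array([1., 2.])
b = np.array([3., 4.])
print(np.dot(a, b))        # 11.0 -- a scalar, only for two 1-D vectors

A = np.random.rand(4, 3)
B = np.random.rand(3, 2)
print(np.dot(A, B).shape)  # (4, 2) -- a matrix product for 2-D arrays
```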
At 00:02:00? (This looks like the same one that 3Blue1Brown uses, to me.)

2017-04-03, 00:02:00 - 00:05:29

@R. D. Machinery I actually had 00:02:28 in mind, where the arrowheads for the tangent lines he draws should be on the other side of the curve. In fairness, at 2:52 he's plotting the derivative and the function together.

2017-04-03, 00:02:28 - 00:05:29

for l =
    t1 = t1*(.999) - .001*(Delta1/n);
    b1 = b1*(.999) - .001*(Db1/n);
    t2 = t2*(.999) - .001*(Delta2/n);
    b2 = b2*(.999) - .001*(Db2/n);
    t3 = t3*(.999) - .001*(Delta3/n);
    b3 = b3*(.999) - .001*(Db3/n);

2017-04-03, 00:02:40 - 00:05:29

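Reading that snippet, each line has the shape of gradient descent with weight decay: the parameter is first shrunk by a factor of 0.999 (an L2-style decay) and then moved against the averaged gradient with learning rate 0.001. Under that interpretation:

```latex
\theta \leftarrow 0.999\,\theta - 0.001\,\frac{\Delta_\theta}{n}
```

where Delta_theta/n is the gradient accumulated over the n training examples.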
I went back and looked, and yeah, at 00:02:52 there's a straight line drawn really badly through the curve. It actually crosses two points on the curve quite some distance apart. lol

2017-04-03, 00:02:52 - 00:05:29

In the derivative function (at approximately 00:02:53) the slope is indeed 4, but the graph is incorrect (the black line representing the slope is wrongly plotted).

2017-04-03, 00:02:53 - 00:05:29

At 00:03:25 I am a bit stuck. It's obvious that f'(x) = 2, but the chain rule includes f' applied to g(x), and g(x) = x², so why is there not an x² term in the derivative? It looks to me like it should be f'(g(x))g'(x) = 2(x²)·2x. Where have I gone wrong following the formula? The actual calculation just looks like pulling out the constant 2 and using the simple power rule. I am confused about where the chain rule actually comes into this calculation.

2017-04-03, 00:03:25 - 00:05:29

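Assuming the slide's example is f(x) = 2x composed with g(x) = x² (inferred from the comment, not re-checked against the video), the resolution is that f' is the constant function 2, so evaluating it at g(x) still gives 2:

```latex
f(x) = 2x, \quad g(x) = x^2, \quad h(x) = f(g(x)) = 2x^2

h'(x) = f'(g(x))\,g'(x) = 2 \cdot 2x = 4x
```

Writing 2(x²)·2x would be evaluating f, not f', at g(x). The chain rule does apply here; it just looks trivial because f' happens to be constant.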
Hello Siraj, around 00:03:26 you say that df/dx = (df/dx)*(dg/dx), which is wrong, but on the screen it is stated correctly.

2017-04-03, 00:03:26 - 00:05:29

Small mistake at 00:03:30: you said "the derivative of f(g(x)) is equal to the derivative of f(x) times the derivative of g(x)," where you meant to say what is actually written on the slide: (f(g(x)))' = f'(g(x))*g'(x), not f'(x)*g'(x).

2017-04-03, 00:03:30 - 00:05:29

00:04:23 - best sketch ever.

2017-04-03, 00:04:23 - 00:05:29

Hey Siraj, can you provide a mathematical proof for the equations at 00:04:33, especially for layer2_error?

2017-04-03, 00:04:33 - 00:05:29

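A sketch of where those lines come from, assuming the half squared-error loss and sigmoid activations typical of this style of numpy network (not re-derived from the video's exact slide): with loss E = ½(y − a₂)² and output a₂ = σ(z₂), the chain rule gives

```latex
\frac{\partial E}{\partial z_2}
  = \frac{\partial E}{\partial a_2} \cdot \frac{\partial a_2}{\partial z_2}
  = -(y - a_2)\,\sigma'(z_2)
```

so layer2_error = (y − a₂) is, up to sign, the loss gradient with respect to the output, and multiplying by σ'(z₂) turns it into the gradient at the pre-activation. One more chain-rule step pushes it back a layer:

```latex
\frac{\partial E}{\partial z_1}
  = \left(\frac{\partial E}{\partial z_2}\,W_1^{\top}\right) \odot \sigma'(z_1)
```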
@00:04:40 There is a bit of glossing over a detail here, one that I see a number of confused people posting about on Stack Overflow and the Data Science Stack Exchange: you don't backpropagate the *error* value per se, but the gradient of the error with respect to the current parameter. This is made more confusing for many software devs implementing backpropagation because the usual design of neural nets cleverly combines the loss function and the output-layer transform, so that the derivative is numerically equal to the error (specifically, only at the pre-transform stage of the output layer). It really matters to understand the difference, though, because in the general case it is not true, and there are developers "cargo culting" in apparently magic manipulations of the error because they don't understand this small difference.

2017-04-03, 00:04:40 - 00:05:29

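The cancellation that comment refers to is easy to verify for the classic pairing of a sigmoid output with binary cross-entropy (a standard result, not specific to this video): with E = −[y log a + (1 − y) log(1 − a)] and a = σ(z),

```latex
\frac{\partial E}{\partial z}
  = \frac{a - y}{a(1 - a)} \cdot \underbrace{a(1 - a)}_{\sigma'(z)}
  = a - y
```

so the gradient at the output pre-activation equals the raw error only because σ′ cancels; with other loss/activation pairings it does not, and the σ′ term must be kept.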
Guess 00:04:47 should be the partial derivative of the activation function instead of simply the derivative? Correct me if I'm wrong.

2017-04-03, 00:04:47 - 00:05:29

In the layer1_error calculation at 00:04:48, why weights1? layer1 has weights0, so it should be weights0.

2017-04-03, 00:04:48 - 00:05:29

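It is weights1 because layer1's error measures how layer1's *output* affects the loss, and that output only reaches the loss by flowing forward through weights1; weights0 only enters when computing the gradient *for weights0*. A self-contained sketch in the usual two-layer numpy style (variable names mirror the comment, not necessarily the video's exact code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

np.random.seed(1)
X = np.random.rand(4, 3)          # inputs: 4 examples, 3 features
y = np.random.rand(4, 1)          # targets
weights0 = np.random.randn(3, 4)  # input  -> layer1
weights1 = np.random.randn(4, 1)  # layer1 -> layer2

# forward pass
layer1 = sigmoid(X.dot(weights0))
layer2 = sigmoid(layer1.dot(weights1))

# backward pass
layer2_error = y - layer2
layer2_delta = layer2_error * (layer2 * (1 - layer2))

# layer1 influences the loss only through weights1, so its error flows
# back through weights1 transposed.
layer1_error = layer2_delta.dot(weights1.T)
layer1_delta = layer1_error * (layer1 * (1 - layer1))

# weights0 appears here, in its own gradient, paired with the input X.
grad_w0 = X.T.dot(layer1_delta)
grad_w1 = layer1.T.dot(layer2_delta)
```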
