you are updating the state and applying the action. When we choose an action, we first need to apply it, and then update the state and get the reward. Let's say the current price is 100.20. When the agent decides to buy, it has to buy at 100.20 (excluding spread/slippage and commission). In your example, it's buying at the next price. Am I wrong? (00:30:14 - 00:43:34)
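The ordering this comment argues for can be sketched roughly as below. This is only an illustration and not the code shown in the talk: the class `ToyTradingEnv`, the action constants, the one-unit position size and the mark-to-market reward are all assumptions made for the example, and spread, slippage and commission are ignored, as in the comment.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical action encoding, chosen for this sketch only.
BUY, HOLD, SELL = 1, 0, -1


@dataclass
class ToyTradingEnv:
    """Toy environment: fill at the current price, then advance the state."""
    prices: List[float]
    t: int = 0                      # index of the price the agent currently sees
    position: int = 0               # units held (negative means short)
    fills: List[float] = field(default_factory=list)

    def step(self, action: int) -> Tuple[float, float, bool]:
        # 1. Apply the action at the price the agent observed when deciding.
        price_now = self.prices[self.t]
        if action != HOLD:
            self.position += action
            self.fills.append(price_now)

        # 2. Only then move the state forward to the next price.
        self.t += 1
        next_price = self.prices[self.t]

        # 3. Reward: mark-to-market change of the position over this step
        #    (an assumed reward definition, no costs included).
        reward = self.position * (next_price - price_now)
        done = self.t == len(self.prices) - 1
        return next_price, reward, done


if __name__ == "__main__":
    env = ToyTradingEnv(prices=[100.20, 100.50, 100.10])
    obs, reward, done = env.step(BUY)
    # The buy is recorded at 100.20, the price seen at decision time,
    # not at 100.50, which only becomes visible after the step.
    print(env.fills, obs, done)
```

If the time index were advanced before the fill, the same buy would be recorded at 100.50, which is the behaviour the comment is questioning.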
Reinforcement Learning for Trading Practical Examples and Lessons Learned by Dr. Tom Starke
Quantopian
Timetable