For masking, is there a strategy for removing specific words rather than masking at random? For example, if the object of interest (e.g., "curtain") were removed from both the English and the French sentence, wouldn't the prediction task become much more difficult, since many other objects could be substituted in its place? (01:29:19 - 02:00:29)
09L – Differentiable associative memories, attention, and transformers
Alfredo Canziani
Timetable