Videos: 1,254
Building an LLM fine-tuning Dataset
Walking through how to build a QLoRA fine-tuning dataset for a language model.
NVIDIA GTC signup: https://nvda.ws/3XTqlB6
Fine-tuning code: https://github.com/Sentdex/LLM-Finetuning
5000-step Walls1337bot adapter: https://huggingface.co/Sentdex/Walls1337bot-Llama2-7B-003.005.5000
WSB Dataset: https://huggingface.co/datasets/Sentdex/WSB-003.005
"I have every reddit comment" original reddit post and torrent info: https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment/
2007-2015 Reddit Archive.org: https://archive.org/download/2015_reddit_comments_corpus/reddit_data/
Reddit BigQuery 2007-2019 (this has other data besides reddit comments too!): https://reddit.com/r/bigquery/comments/3cej2b/17_billion_reddit_comments_loaded_on_bigquery/
Contents:
0:00 - Introduction to Dataset building for fine-tuning.
02:53 - The Reddit dataset options (Torrent, Archive.org, BigQuery)
06:07 - Exporting BigQuery Reddit (and some other data)
14:44 - Decompressing all of the gzip archives
25:13 - Re-combining the archives for target subreddits
28:29 - How to structure the data
40:40 - Building training samples and saving to database
48:49 - Creating customized training json files
54:11 - QLoRA training and results
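The decompress, structure, and export steps listed in the contents can be sketched roughly as follows. This is a minimal illustration, not the code from the video or the linked repository: the field names (`id`, `parent_id`, `body`), the `t1_` prefix convention, the helper names, and the `### Comment` / `### Reply` prompt layout are all assumptions, and the database step is left out for brevity.

```python
import gzip
import json


def read_comments(path):
    """Yield one comment dict per line from a gzip-compressed JSON-lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def build_samples(comments):
    """Pair each reply with its parent comment to form prompt/response samples."""
    by_id = {c["id"]: c for c in comments}
    samples = []
    for comment in by_id.values():
        # Assumption: a parent id of the form "t1_<id>" points at another comment.
        parent_id = comment.get("parent_id", "")
        if not parent_id.startswith("t1_"):
            continue
        parent = by_id.get(parent_id[3:])
        if parent is None:
            continue  # the parent is not in this batch, so skip the reply
        samples.append({"prompt": parent["body"], "response": comment["body"]})
    return samples


def write_training_json(samples, path):
    """Write one JSON object per line, each with a single "text" field."""
    with open(path, "w", encoding="utf-8") as handle:
        for sample in samples:
            # Hypothetical prompt layout; any consistent template would do.
            text = (
                f"### Comment:\n{sample['prompt']}\n\n"
                f"### Reply:\n{sample['response']}"
            )
            handle.write(json.dumps({"text": text}) + "\n")
```

Keeping reading, pairing, and writing as separate functions makes it easy to swap in a different prompt template or filter (for example, by subreddit or score) without touching the decompression code.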
Neural Networks from Scratch book: https://nnfs.io
Channel membership: https://www.youtube.com/channel/UCfzlCWGWYyIQ0aLC5w48gBQ/join
Discord: https://discord.gg/sentdex
Reddit: https://www.reddit.com/r/sentdex/
Support the content: https://pythonprogramming.net/support-donate/
Twitter: https://twitter.com/sentdex
Instagram: https://instagram.com/sentdex
Facebook: https://www.facebook.com/pythonprogramming.net/
Twitch: https://www.twitch.tv/sentdex
#python #programming
2024-03-07
00:00:00 - 01:01:55
Visualizing Neural Network Internals
Visualizing some of the internals of a neural network during training and inference.
Starting and full code: https://github.com/Sentdex/neural-net-internals-visualized
2024-02-15
00:00:00 - 00:53:41
Getting Back on Grid
Establishing an internet connection in an internet desert, then figuring out (well, starting to figure out) networking.
Combined with Starlink as my internet provider, I ended up going with a wifi bridge implementation, using a couple of Ubiquiti NanoStation AC Locos to network between buildings at 100+ meters of distance. The Ubiquiti units can also do point to point (ptp), but so far the wifi bridge setup is working great for me.
Ubiquiti NanoStation 5AC Locos (buy in pairs for ptp/wifi bridge): https://amzn.to/3UqnLnQ
Mounting hardware I used, but you can use just about anything, including zip tying to a tree or something: https://amzn.to/42ycS5d
PoE Injectors (can use any PoE switch too): https://amzn.to/482oNJO
Silicone sealant: https://amzn.to/42vu5w9
For shorter distances, you can also use:
TPLink Access Points (AP): https://amzn.to/3OCe6qp
I have also enjoyed the 2016 model of Google Wifi: https://amzn.to/495Ydkm. These are half the price of the newer version, the Nest variant: https://amzn.to/3HSDdBM
If I forgot something, feel free to ask!
2024-02-08
00:00:00 - 00:21:09
Open Source AI Inference API w/ Together
Exploring the Together Inference API (https://www.together.ai/)
Together API basics jupyter notebook examples: https://github.com/Sentdex/Together-API-Basics
2023-12-25
00:00:00 - 00:25:25
INFINITE Inference Power for AI
Testing and enjoying the Comino Grando Server machine with 6x RTX 4090s from Comino (https://www.comino.com/)
2023-12-17
00:00:00 - 00:18:02
Pandas Dataframes on your GPU w/ CuDF
An overview and some quick examples of using CuDF's Pandas accelerator and how much faster it can be than vanilla Pandas for data analysis.
Colab demo of Rapids: https://nvda.ws/3LWggQj
AI and Data Science Virtual Summit: https://nvda.ws/3ZR3wjL
Notebook in this video: https://gist.github.com/Sentdex/469c30385d06719519af13125db85edc
Install CuDF: pip install cudf-cu11 --extra-index-url=https://pypi.nvidia.com (or cu12)
2023-11-11
00:00:00 - 00:12:04
QLoRA is all you need (Fast and lightweight model fine-tuning)
Learning and sharing my process with QLoRA (quantized low rank adapters) fine-tuning. In this case, I use a custom-made reddit dataset, but you can use anything you want.
I referenced a LOT of stuff in this video. I will do my best to link everything, but let me know if I forget anything.
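As a rough illustration of why low rank adapters are lightweight: LoRA freezes a weight matrix of shape d x k and trains two much smaller matrices of shape d x r and r x k in its place. The sketch below only counts trainable parameters. The 4096 x 4096 layer size and the rank of 8 are illustrative assumptions, not values taken from the video, and the function names are made up for this example.

```python
def full_update_params(d, k):
    """Trainable parameters when the whole d x k weight matrix is updated."""
    return d * k


def lora_params(d, k, r):
    """Trainable parameters for a rank-r adapter: a d x r and an r x k matrix."""
    return r * (d + k)


# Illustrative sizes only (assumed, not from the video).
d, k, r = 4096, 4096, 8
full = full_update_params(d, k)
adapter = lora_params(d, k, r)

print(f"full update:    {full:,} parameters")
print(f"rank-{r} adapter: {adapter:,} parameters")
print(f"adapter share:  {adapter / full:.2%} of the full update")
```

QLoRA additionally stores the frozen base weights in a quantized form, which this parameter count does not capture.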
Resources:
WSB-GPT-7B Model: https://huggingface.co/Sentdex/WSB-GPT-7B
WSB-GPT-13B Model: https://huggingface.co/Sentdex/WSB-GPT-13B
WSB Training data: https://huggingface.co/datasets/Sentdex/wsb_reddit_v002
Code:
QLoRA Repo: https://github.com/artidoro/qlora
qlora.py: https://github.com/artidoro/qlora/blob/main/qlora.py
Simple qlora training notebook: https://colab.research.google.com/drive/1VoYNfYDKcKRQRor98Zbf2-9VQTtGJ24k?usp=sharing
qlora merging/dequantizing code: https://gist.github.com/ChrisHayduk/1a53463331f52dca205e55982baf9930
Referenced Research Papers:
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning: https://arxiv.org/abs/2012.13255
LoRA: Low-Rank Adaptation of Large Language Models: https://arxiv.org/abs/2106.09685
QLoRA: Efficient Finetuning of Quantized LLMs: https://arxiv.org/abs/2305.14314
Yannic's GPT-4chan model: https://huggingface.co/ykilcher/gpt-4chan
Condemnation letter: https://docs.google.com/forms/d/e/1FAIpQLSdh3Pgh0sGrYtRihBu-GPN7FSQoODBLvF7dVAFLZk2iuMgoLw/viewform
https://www.youtube.com/watch?v=efPrtcLdcdM
Contents:
0:00 - Why QLoRA?
0:55 - LoRA/QLoRA Research
4:13 - Fine-tuning dataset
11:10 - QLoRA Training Process
15:02 - QLoRA Adapters
17:10 - Merging, Dequantizing, and Sharing
19:34 - WSB QLoRA fine-tuned model examples
2023-09-16
00:00:00 - 00:23:56
Chat Interface for your Local Llama LLMs
A tutorial of sorts covering how to create streaming chat interfaces using Gradio for the various chat/instruct large language models from HuggingFace.
Sample code: https://huggingface.co/spaces/Sentdex/StableBeluga-7B-Chat/blob/main/app.py
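The streaming pattern can be sketched as a generator that yields the reply so far each time a new chunk arrives, which is the kind of function a Gradio chat interface can consume to update the message as it grows. This is a minimal sketch, not the linked app.py: `fake_token_stream` is a hypothetical stand-in for a model's token streamer, and the Gradio wiring is left as a comment so the block runs without extra dependencies.

```python
def fake_token_stream(message):
    """Hypothetical stand-in for a model that emits its reply piece by piece."""
    for word in f"You said: {message}".split(" "):
        yield word + " "


def stream_reply(message, history):
    """Yield the accumulated reply after every new chunk."""
    partial = ""
    for chunk in fake_token_stream(message):
        partial += chunk
        yield partial


if __name__ == "__main__":
    for text in stream_reply("hello there", history=[]):
        print(text)

# With Gradio installed, a generator like this could be passed to a chat UI:
#   import gradio as gr
#   gr.ChatInterface(stream_reply).launch()
```

Yielding the full text so far, rather than only the newest chunk, lets the interface simply replace the displayed message on every update.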
2023-08-23
00:00:00 - 00:15:56
Gzip is all You Need! (This SHOULD NOT work)
Github code: https://github.com/Sentdex/Simple-kNN-Gzip
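The repository name suggests k-nearest-neighbours classification over gzip compression distances. Below is a minimal sketch of that general idea, assuming the normalized compression distance (NCD) as the metric. The function names and the shape of the training data are made up for illustration and are not taken from the linked code.

```python
import gzip
from collections import Counter


def clen(text):
    """Length in bytes of the gzip-compressed text."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(a, b):
    """Normalized compression distance: smaller means the texts share more structure."""
    ca, cb = clen(a), clen(b)
    cab = clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def knn_predict(query, train, k=3):
    """Majority label among the k training texts closest to the query.

    `train` is a list of (text, label) pairs.
    """
    nearest = sorted(train, key=lambda item: ncd(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

The intuition is that two texts on the same topic compress better together than two unrelated texts, so the compressor acts as a crude similarity measure with no training step.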
2023-07-29
00:00:00 - 00:19:47
Better Attention is All You Need
Addressing the current state of attention for artificial intelligence and why it's currently holding back maximum context lengths.
2023-07-12
00:00:00 - 00:14:29