Train, Fine-tune and Deploy LLMs - Bootcamp
Welcome!
Curriculum
Tools
Join Discord
Tell me more about you!
Schedule
The Transformer Architecture
Thursday Aug 15th Meeting (202:17)
Friday Aug 16th Meeting (206:48)
Intro (1:33)
The Overall Architecture (11:19)
The Self-Attention layer (12:20)
The Multihead Attention Layer (3:56)
The Position Embedding (13:07)
The Encoder (5:23)
The Decoder (6:36)
Implementing the Self-Attention Layer (15:24)
Implementing the Multihead Attention Layer (13:03)
Implementing the Position Embedding (4:16)
Implementing the Feed-Forward Network (1:54)
Implementing The Encoder Block (2:44)
Implementing the Encoder (2:58)
Implementing the Decoder Block (3:56)
Implementing the Decoder (3:12)
Implementing the Transformer (2:35)
Testing the Code (8:04)
Outro (0:47)
Homework 1
Homework 1 Feedback
Training LLMs to Follow Instructions
Thursday Aug 22nd Meeting (233:58)
Friday Aug 23rd Meeting (244:17)
Intro (1:00)
The Overview (5:55)
Causal Language Modeling Pretraining (10:58)
Supervised Learning Fine-Tuning (7:46)
Reinforcement Learning with Human Feedback (15:09)
Implementing the Pretraining Step (21:54)
Implementing the Supervised Learning Fine-Tuning Step (9:53)
Implementing the Reinforcement Learning Fine-Tuning Step (28:25)
Outro (0:26)
Homework 2
Homework 2 Feedback
How to Scale Model Training
Thursday Aug 29th Meeting (259:18)
Friday Aug 30th Meeting (213:33)
Intro (1:22)
CPU vs GPU vs TPU (5:49)
The GPU Architecture (8:03)
Distributed Training (2:38)
Data parallelism (3:59)
Model parallelism (7:38)
Zero Redundancy Optimizer Strategy (10:49)
Distributing Training with the Accelerate Package on AWS Sagemaker (38:08)
Outro (0:27)
Homework 3
Homework 3 Feedback
How to Fine-Tune LLMs
Thursday Sep 5th Meeting (250:04)
Friday Sep 6th Meeting (228:21)
Intro (1:05)
The Different Fine-tuning tasks (4:23)
Language Modeling (8:16)
Sequence Prediction (5:03)
Text Classification (4:16)
Text Encoding (5:17)
Multimodal Fine-tuning (2:39)
Catastrophic forgetting (1:47)
LoRA Adapters (11:35)
QLoRA (19:24)
LoRA and QLoRA with the PEFT Package (22:25)
Outro (0:36)
Homework 4
Homework 4 Feedback
How to Deploy LLMs
Thursday Sep 12th Meeting (226:06)
Friday Sep 13th Meeting (199:38)
Intro (0:59)
Before Deploying (9:13)
The Deployment Strategies (8:59)
Multi-LoRA (2:51)
The Text Generation Layer (13:13)
Streaming Applications (5:25)
Continuous Batching (6:13)
KV-Caching (11:18)
Deploying with vLLM (9:58)
Outro (0:22)
Homework 5
Building the Application Layer
Thursday Sep 19th Meeting (215:38)
Friday Sep 20th Meeting (200:11)
Intro (1:39)
What is the Application Layer (5:21)
The RAG Application (4:19)
Optimizing the Indexing Pipeline (6:10)
Optimizing the Query (4:52)
Optimizing the Retrieval (5:00)
Optimizing the Document Selection (8:56)
Optimizing the Context Creation (8:39)
Building a simple RAG Application (1:28)
Implementing the Indexing Pipeline (35:04)
Implementing the Retrieval API (24:19)
Homework 6
Homework 6 Feedback
Outro (0:32)
Outro
Friday Sep 27th Meeting - Free form final Session
CPU vs GPU vs TPU