- Intermediate knowledge of Python programming
- Knowledge of PyTorch can be helpful
- Some knowledge of Machine Learning can be helpful
Welcome to the Introduction to Transformers for Large Language Models. Very recently, the advent of Large Language Models sparked a revolution. It is rare that something changes the world of Machine Learning that much, and the hype around LLMs is real! Very few experts predicted it, and it's essential to be prepared for the future.
This course is for Machine Learning enthusiasts who want to understand the inner workings of the Transformer architecture. We are going to explore the models that led to its discovery back in 2017: from the RNN Encoder-Decoder architecture, through the Bahdanau and Luong attention mechanisms, up to the self-attention mechanism. We are also going to dive into the strategies used to parse text into tokens before feeding them to LLMs, and how LLMs can be tuned to generate text.
Each section will be divided into a conceptual part and a coding part. I recommend digging into both aspects, but feel free to focus on the concepts or the coding if one matters more to you; I made sure to separate the two for learning flexibility. In the coding part, we are going to see how the different models are implemented in PyTorch, and we are going to explore some of the capabilities of the Transformers Python package by Hugging Face. However, this is not a PyTorch course, and I will not dive into the details of the framework.
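To give a taste of the coding part, here is a minimal sketch of scaled dot-product attention, the operation at the heart of the Transformer. It is written with NumPy for brevity (the course implementations use PyTorch), and the shapes and random inputs are purely illustrative:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Scores: similarity of each query with each key,
    # scaled by the square root of the key dimension
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over the keys so each row of weights sums to 1
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Output: a weighted average of the value vectors
    return weights @ V

# Illustrative example: 3 tokens, embedding dimension 4
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4)
```

Each output row is a convex combination of the value vectors, with the mixing weights determined by query-key similarity; this is exactly the mechanism the self-attention sections unpack in detail.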
Topics covered in this course:
- The RNN Encoder-Decoder Architecture
- The Attention Mechanism Before Transformers
- The Self-Attention Mechanism
- Understanding the Transformer Architecture
- How We Create Tokens from Words
- How LLMs Generate Text
- Transformers' applications beyond LLMs
Who this course is for:
- Machine Learning enthusiasts who want to improve their knowledge of Large Language Models
- Intermediate Python developers curious to learn the ins and outs of the Transformer architecture