A work-in-progress project implementing GPT and related Transformer models from scratch, inspired by Andrej Karpathy's "Let's Build GPT" tutorial. This repository is actively evolving.
- Decoder-only implementation of the Transformer architecture
- Multi-head self-attention (multiple attention heads computed in parallel); see the first sketch after this list
- The Transformer block: communication (attention) followed by computation (feed-forward)
- Text generation conditioned on a given context (see the generation sketch below)
- Byte Pair Encoding (BPE) tokenization, popularized by the GPT-2 paper (see the BPE sketch below)
- Training on custom datasets
- Pretraining and fine-tuning experiments
- Implementation of an encoder block
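
Below is a minimal sketch, assuming PyTorch, of multi-head causal self-attention and a Transformer block in the "communication followed by computation" style. The class names (`Head`, `MultiHeadAttention`, `Block`) and hyperparameters are illustrative, not this repository's actual API.

```python
import torch
import torch.nn as nn
from torch.nn import functional as F

class Head(nn.Module):
    """One head of causal self-attention."""
    def __init__(self, n_embd, head_size, block_size):
        super().__init__()
        self.key = nn.Linear(n_embd, head_size, bias=False)
        self.query = nn.Linear(n_embd, head_size, bias=False)
        self.value = nn.Linear(n_embd, head_size, bias=False)
        self.register_buffer("tril", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        k, q, v = self.key(x), self.query(x), self.value(x)
        # scaled dot-product attention with a causal mask
        wei = q @ k.transpose(-2, -1) * k.shape[-1] ** -0.5
        wei = wei.masked_fill(self.tril[:T, :T] == 0, float("-inf"))
        wei = F.softmax(wei, dim=-1)
        return wei @ v

class MultiHeadAttention(nn.Module):
    """Several attention heads run in parallel, outputs concatenated."""
    def __init__(self, n_embd, n_head, block_size):
        super().__init__()
        head_size = n_embd // n_head
        self.heads = nn.ModuleList([Head(n_embd, head_size, block_size) for _ in range(n_head)])
        self.proj = nn.Linear(n_embd, n_embd)

    def forward(self, x):
        return self.proj(torch.cat([h(x) for h in self.heads], dim=-1))

class Block(nn.Module):
    """Transformer block: communication (attention) followed by computation (MLP)."""
    def __init__(self, n_embd, n_head, block_size):
        super().__init__()
        self.sa = MultiHeadAttention(n_embd, n_head, block_size)
        self.ffwd = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.ReLU(), nn.Linear(4 * n_embd, n_embd)
        )
        self.ln1 = nn.LayerNorm(n_embd)
        self.ln2 = nn.LayerNorm(n_embd)

    def forward(self, x):
        x = x + self.sa(self.ln1(x))    # residual connection around attention
        x = x + self.ffwd(self.ln2(x))  # residual connection around the MLP
        return x
```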
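
For context-conditioned generation, a minimal sketch of the autoregressive sampling loop follows; it assumes `model` maps a `(B, T)` tensor of token ids to `(B, T, vocab_size)` logits, as a GPT-style decoder would.

```python
import torch
from torch.nn import functional as F

@torch.no_grad()
def generate(model, idx, max_new_tokens, block_size):
    """Append max_new_tokens sampled tokens to the context `idx` of shape (B, T)."""
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -block_size:]          # crop to the context window
        logits = model(idx_cond)[:, -1, :]       # logits for the last position
        probs = F.softmax(logits, dim=-1)
        idx_next = torch.multinomial(probs, num_samples=1)
        idx = torch.cat((idx, idx_next), dim=1)  # append the sampled token
    return idx
```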
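
And a minimal sketch of the BPE merge-learning step over raw UTF-8 bytes, purely to illustrate the algorithm; the function names are not this repository's API.

```python
from collections import Counter

def get_pair_counts(ids):
    """Count adjacent pairs of token ids."""
    return Counter(zip(ids, ids[1:]))

def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

def train_bpe(text, num_merges):
    """Learn `num_merges` merge rules over the UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))
    merges = {}
    for step in range(num_merges):
        counts = get_pair_counts(ids)
        if not counts:
            break
        pair = max(counts, key=counts.get)
        new_id = 256 + step  # byte values occupy ids 0..255
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return merges

if __name__ == "__main__":
    print(train_bpe("low lower lowest", 5))
```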
Contributions and feedback are welcome!