
# transformer-lab 🤖

A work-in-progress project implementing GPT and related transformer models from scratch, inspired by Andrej Karpathy's "Let's Build GPT" tutorial. This repository is actively evolving.

## Features ✨

- Decoder-only implementation of the Transformer (see the sketch after this list)
  - Multi-head self-attention (heads computed in parallel)
  - The Transformer block: communication followed by computation
  - Autoregressive text generation from a given context
- Byte Pair Encoding (BPE) tokenization, as popularized by the GPT-2 paper (also sketched below)
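
To give a flavor of the decoder components listed above, here is a minimal sketch of multi-head self-attention and a Transformer block in PyTorch, in the spirit of Karpathy's tutorial. The class names and hyperparameters (`n_embd`, `n_head`, `block_size`) are illustrative assumptions, not this repository's actual API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, n_embd, n_head, block_size):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        self.qkv = nn.Linear(n_embd, 3 * n_embd)   # joint projection to queries, keys, values
        self.proj = nn.Linear(n_embd, n_embd)      # output projection
        # causal mask: each token may only attend to itself and earlier positions
        self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        # reshape to (B, n_head, T, head_dim) so all heads run in parallel
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        att = (q @ k.transpose(-2, -1)) * (k.size(-1) ** -0.5)  # scaled dot-product scores
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)
        y = att @ v                                 # weighted sum of values
        y = y.transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(y)

class Block(nn.Module):
    """Communication (attention) followed by computation (MLP), with residual connections."""
    def __init__(self, n_embd, n_head, block_size):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = MultiHeadSelfAttention(n_embd, n_head, block_size)
        self.ln2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.GELU(), nn.Linear(4 * n_embd, n_embd)
        )

    def forward(self, x):
        x = x + self.attn(self.ln1(x))  # communication: tokens exchange information
        x = x + self.mlp(self.ln2(x))   # computation: per-token feed-forward
        return x
```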
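Likewise, the core of the BPE algorithm can be sketched in a few lines: repeatedly find the most frequent adjacent token pair and merge it into a new token. This is a minimal illustration of the technique, not the repository's tokenizer code.

```python
from collections import Counter

def get_pair_counts(ids):
    """Count occurrences of each adjacent token pair."""
    return Counter(zip(ids, ids[1:]))

def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

def train_bpe(text, num_merges):
    """Learn `num_merges` merge rules, starting from raw UTF-8 bytes."""
    ids = list(text.encode("utf-8"))
    merges = {}
    for step in range(num_merges):
        counts = get_pair_counts(ids)
        if not counts:
            break
        pair = max(counts, key=counts.get)  # most frequent adjacent pair
        new_id = 256 + step                 # new token id beyond the byte range
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return merges

# Example: learn up to 10 merges from a short string
print(train_bpe("low lower lowest low low", 10))
```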

## Roadmap 🎯

- Training on custom datasets
- Pretraining and fine-tuning experiments
- Implementation of the encoder block

Contributions and feedback are welcome!