Transformer Implementation from Scratch

Build and understand Transformer architecture through hands-on implementation.

🎯 Project Overview

This repository contains a step-by-step implementation of the Transformer architecture described in the paper "Attention Is All You Need" (Vaswani et al., 2017). The goal is to understand each component deeply by implementing it from scratch.

📂 Repository Structure

transformer/
│
├── src/
│   ├── models/
│   │   ├── __init__.py
│   │   ├── attention.py        # Attention mechanisms
│   │   ├── encoder.py          # Transformer encoder
│   │   ├── decoder.py          # Transformer decoder
│   │   ├── embeddings.py       # Token and positional embeddings
│   │   ├── feed_forward.py     # Position-wise feed-forward networks
│   │   └── transformer.py      # Full transformer model
│   │
│   ├── utils/
│   │   ├── __init__.py
│   │   ├── config.py           # Model configuration
│   │   ├── data_loader.py      # Data loading utilities
│   │   ├── metrics.py          # Evaluation metrics
│   │   └── visualization.py    # Attention visualization tools
│   │
│   └── training/
│       ├── __init__.py
│       ├── trainer.py          # Training loop
│       ├── optimizer.py        # Custom optimizers
│       └── scheduler.py        # Learning rate scheduling
│
├── tests/
│   ├── __init__.py
│   ├── test_attention.py
│   ├── test_encoder.py
│   └── test_decoder.py
│
├── notebooks/
│   ├── 1_attention_mechanism.ipynb
│   ├── 2_encoder_implementation.ipynb
│   └── 3_decoder_implementation.ipynb
│
├── configs/
│   └── default_config.yaml
│
├── requirements.txt
├── setup.py
├── .gitignore
├── LICENSE
└── README.md

📚 Implementation Phases

Phase 1: Core Components

  • Scaled dot-product attention (see the sketch after this list)
  • Multi-head attention
  • Position-wise feed-forward networks
  • Positional encoding
  • Layer normalization
  • Residual connections
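
As a concrete starting point for the attention items above, here is a minimal sketch of scaled dot-product attention in PyTorch. It only illustrates the formula from the paper; the actual implementation lives in src/models/attention.py, and the tensor shapes and mask convention below are assumptions.

import math
import torch

def scaled_dot_product_attention(query, key, value, mask=None):
    # query, key, value: (batch, heads, seq_len, d_k) -- assumed layout
    d_k = query.size(-1)
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    scores = torch.matmul(query, key.transpose(-2, -1)) / math.sqrt(d_k)
    if mask is not None:
        # Positions where mask == 0 are excluded from the softmax
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return torch.matmul(weights, value), weights

Multi-head attention then applies this function to several learned projections of the inputs in parallel and concatenates the results.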

[Full implementation phases listed in progress tracking...]

🚀 Getting Started

Prerequisites

Python 3.8+
PyTorch 2.0+

Installation

git clone https://github.com/james92kj/Attention-Is-All-You-Need.git
cd Attention-Is-All-You-Need
pip install -r requirements.txt
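
To verify the setup, the unit tests under tests/ can be run with pytest (assuming pytest is available; it may need to be installed separately if it is not listed in requirements.txt):

pip install pytest   # only if not already provided by requirements.txt
pytest tests/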

📖 Learning Resources


configs/default_config.yaml

model:
  d_model: 512
  n_heads: 8
  n_layers: 6
  d_ff: 2048
  dropout: 0.1
  max_seq_length: 512

training:
  batch_size: 32
  learning_rate: 0.0001
  warmup_steps: 4000
  max_epochs: 100
  gradient_clip_val: 1.0
  label_smoothing: 0.1

data:
  vocab_size: 32000
  train_file: "data/train.txt"
  val_file: "data/val.txt"
  test_file: "data/test.txt"
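
A minimal sketch of how this configuration might be consumed, assuming PyYAML is available; the repository's own loading logic lives in src/utils/config.py and may differ. The learning-rate function reproduces the warmup schedule from "Attention Is All You Need" that the warmup_steps value corresponds to, not necessarily what src/training/scheduler.py implements.

import yaml

# Load the default configuration (path assumed relative to the repository root)
with open("configs/default_config.yaml") as f:
    config = yaml.safe_load(f)

d_model = config["model"]["d_model"]               # 512
warmup_steps = config["training"]["warmup_steps"]  # 4000

def transformer_lr(step: int) -> float:
    # lr = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5)
    step = max(step, 1)
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)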
