- Fast and memory efficient, with benchmarks included: decoding GPT2 takes 0.25ms.
- Keep the tensor ordering as specified in the safetensors file.
- Simple API.

This package is inspired by the work of the NLP Odyssey Authors, which was itself inspired by Hugging Face's original Rust implementation.