The ByteIR Project is ByteDance's model compilation solution. ByteIR comprises a compiler, a runtime, and frontends, which together provide an end-to-end model compilation solution.
Although all ByteIR components (compiler/runtime/frontends) live under the umbrella of this repository and together provide an end-to-end solution, each component can technically operate independently.
The name ByteIR comes from a legacy internal purpose.
The ByteIR project is NOT an IR spec definition project.
Instead, in most scenarios, ByteIR directly uses several upstream MLIR dialects and Google Mhlo.
Most ByteIR compiler passes are compatible with the selected upstream MLIR dialects and Google Mhlo.
- Enjoy SOTA models: ByteIR maintains popular frontends to handle lowering many SOTA models into Stablehlo, and also provides a model zoo (release soon) for research or benchmarking purposes.
- Just work: ByteIR adopts upstream MLIR dialects and Google Mhlo, and provides compatible passes, utilities, and infrastructure for all compiler builders using upstream MLIR. You can mix ByteIR passes with upstream MLIR or Mhlo passes, or even your own passes, to build your pipeline.
- Bring your own architecture: ByteIR provides rich, generic graph-, loop-, and tensor-level optimizations in Mhlo and Linalg, which DL ASIC compilers can reuse, letting them focus only on the last mile for their backends.
ByteIR is still in its early phase. In this phase, we aim to provide well-defined, necessary building blocks and infrastructure support for model compilation on a wide range of deep learning accelerators as well as general-purpose CPUs and GPUs. Therefore, highly-tuned kernels for specific architectures might not have been prioritized. Of course, any feedback on prioritizing a specific architecture, or a corresponding contribution, is welcome.
ByteIR Compiler is an MLIR-based compiler for CPU/GPU/ASIC.
ByteIR Runtime is a common, lightweight runtime, capable of serving both existing kernels and kernels generated by the ByteIR compiler.
ByteIR Frontends include TensorFlow, PyTorch, and ONNX.
Each ByteIR component can technically operate independently. There are pre-defined communication interfaces between components.
ByteIR frontends and the ByteIR compiler communicate through the Stablehlo dialect, whose version might be updated during development.
This also implies that any frontend generating Stablehlo of a compatible version can work with the ByteIR compiler, and any compiler consuming Stablehlo of a compatible version can work with ByteIR frontends.
The ByteIR compiler and ByteIR runtime communicate through the ByRE format, whose version might be updated during development. The ByRE dialect is defined as one kind of ByRE format in the ByteIR compiler, currently supporting emission of a textual form or versioned bytecode for the ByteIR compiler and runtime.
Other ByRE formats are under development.
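The versioned interfaces above imply a simple handshake: a producer (frontend or compiler) stamps its IR artifact with a version, and a consumer (compiler or runtime) accepts it only if that version falls inside its supported window. This is a minimal, hypothetical sketch of that idea, not ByteIR's actual API; the `IRVersion` type and `is_compatible` helper are illustrative names only.

```python
# Hypothetical sketch (NOT ByteIR's actual API): how a consumer such as the
# compiler or runtime might validate the version stamped on a producer's IR.
from dataclasses import dataclass


@dataclass(frozen=True)
class IRVersion:
    major: int
    minor: int


def is_compatible(produced: IRVersion,
                  min_supported: IRVersion,
                  max_supported: IRVersion) -> bool:
    """Accept a produced artifact iff its version lies in the consumer's window."""
    lo = (min_supported.major, min_supported.minor)
    hi = (max_supported.major, max_supported.minor)
    return lo <= (produced.major, produced.minor) <= hi


# Example: a frontend emits Stablehlo v0.9; the consumer supports v0.8–v0.10.
print(is_compatible(IRVersion(0, 9), IRVersion(0, 8), IRVersion(0, 10)))  # True
print(is_compatible(IRVersion(1, 0), IRVersion(0, 8), IRVersion(0, 10)))  # False
```

The version numbers here are made up for illustration; the real compatibility window is determined by the Stablehlo and ByRE versions shipped in a given ByteIR release.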
ByteIR is the product of many great researchers and interns at ByteDance. Below is a list of our public talks:
- Linalg is All You Need to Optimize Attention -- C4ML'23
- ByteIR: Towards End-to-End AI Compilation -- China SoftCon'23
If you find ByteIR useful, please consider citing.
@misc{byteir2023,
  title = {{ByteIR}},
  author = {Cao, Honghua and Chang, Li-Wen and Chen, Chongsong and Jiang, Chengquan and Jiang, Ziheng and Liu, Liyang and Liu, Yuan and Liu, Yuanqiang and Shen, Chao and Wang, Haoran and Xiao, Jianzhe and Yao, Chengji and Yuan, Hangjian and Zhang, Fucheng and Zhang, Ru and Zhang, Xuanrun and Zhang, Zhekun and Zhang, Zhiwei and Zhu, Hongyu and Liu, Xin},
  url = {https://github.com/bytedance/byteir},
  year = {2023}
}
The ByteIR Project is licensed under the Apache License v2.0.