Transformers Neuron for Trn1 and Inf2 is a software package that enables PyTorch users to perform large language model (LLM) inference on second-generation Neuron hardware (see NeuronCore-v2).
Please refer to the Transformers Neuron documentation for setup and developer guides.
To install the most rigorously tested stable release, use the PyPI pip wheel:
pip install transformers-neuronx --extra-index-url=https://pip.repos.neuron.amazonaws.com
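After installing, it can be useful to confirm which release pip actually resolved. The following is a minimal sketch using only the Python standard library. The helper name `installed_version` is hypothetical and not part of transformers-neuronx. The distribution name is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="transformers-neuronx"):
    """Return the installed version string of dist_name, or None if absent.

    Hypothetical helper for illustration; it only queries pip metadata
    and does not import the library itself.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the installed version, or None when the package is not installed.
print(installed_version())
```

Including this version string in a support ticket makes it easier to match a failure to the release notes.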
The AWS Neuron team is currently restructuring the contribution model of this GitHub repository. The repository content does not reflect the latest features and improvements of the transformers-neuronx library. Please install the stable release from https://pip.repos.neuron.amazonaws.com to get the latest features and improvements.
Please refer to the transformers-neuronx release notes to see the latest supported features and models.
Please refer to our Contact Us page for additional information and support resources. If you intend to file a ticket and can share your model artifacts, please re-run your failing script with NEURONX_DUMP_TO=./some_dir. This will dump compiler artifacts and logs to ./some_dir. You can then include this directory in your correspondence with us. The artifacts and logs are useful for debugging the specific failure.
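The re-run described above can also be scripted. This is a minimal sketch, assuming the failing script is an ordinary Python file run with the current interpreter. The helper name `rerun_with_dump` is hypothetical. Only the NEURONX_DUMP_TO variable comes from this README.

```python
import os
import subprocess
import sys


def rerun_with_dump(script, dump_dir="./some_dir"):
    """Re-run script in a child process with NEURONX_DUMP_TO set to dump_dir.

    Hypothetical helper for illustration. It copies the current environment,
    adds NEURONX_DUMP_TO, and captures stdout/stderr so they can be attached
    to a ticket together with the dump directory.
    """
    env = dict(os.environ)
    env["NEURONX_DUMP_TO"] = str(dump_dir)
    return subprocess.run(
        [sys.executable, str(script)],
        env=env,
        capture_output=True,
        text=True,
    )
```

A call such as `rerun_with_dump("my_failing_script.py")` (the script name is a placeholder) returns a `CompletedProcess` whose `returncode`, `stdout`, and `stderr` can be saved next to the dumped artifacts.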
See CONTRIBUTING for more information.
This library is licensed under the Apache License 2.0.