v1.4
Quick start solutions:
Ray
- Released v1.2.0, supporting autoscaling RayClusters (#740) and adding reliability improvements (#723)
- Added a helm-chart (#745)
- Bump Ray TPU webhook image (#763)
RAG
- Update RAG frontend Docker image (#762)
TPU
- Add HuggingFace support for automated inference checkpoint conversion (#712)
- JetStream MaxText deployment module: all scale rules now in a single HPA (#730)
- Update pip in the JetStream PyTorch and checkpoint Dockerfiles (#739)
- Fix faulty HPA in JetStream MaxText module (#741)
- Correct tokenizer for JetStream module (#742)
- Make image names optional in JetStream MaxText module (#744)
- Terraform modules cleanup (#758)
- TPU metrics improvements (#727, #761, #770)
Benchmark
- Update main README.md quickstart guide (#734)
- Add quantization support for TGI (#757)
- Update README with the latest input variables (#759)
Tutorials and Examples
- Update image URL for Gemma finetune YAML (#729)
- NIM on GKE tutorial (#737)
- Add example Kueue setup for reservation and DWS (#746)
Full Changelog: v1.3...v1.4