Releases: huggingface/text-embeddings-inference

v1.5.0

10 Jul 15:34
661a77f

Notable Changes

  • ONNX Runtime for CPU deployments: greatly improves CPU deployment throughput
  • Add /similarity route
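
As a rough illustration of the new route, the sketch below builds a /similarity request using only the Python standard library. The payload shape (`inputs.source_sentence` and `inputs.sentences`), the base URL `http://localhost:8080`, and the helper names are assumptions that do not come from these release notes; check the server's API documentation before relying on them.

```python
import json
import urllib.request

# Assumption: a TEI server is reachable at this address.
BASE_URL = "http://localhost:8080"


def build_similarity_request(source, sentences, base_url=BASE_URL):
    """Build (but do not send) a POST request for the /similarity route.

    The body layout used here is an assumption about the route's schema.
    """
    body = {"inputs": {"source_sentence": source, "sentences": list(sentences)}}
    return urllib.request.Request(
        url=f"{base_url}/similarity",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def similarity(source, sentences, base_url=BASE_URL):
    """Send the request and return the decoded JSON response."""
    request = build_similarity_request(source, sentences, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without a running server.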

Full Changelog: v1.4.0...v1.5.0

v1.4.0

02 Jul 15:17
a0549e6

Notable Changes

  • CUDA support for the Qwen2 model architecture

What's Changed

  • feat(candle): support Qwen2 on Cuda by @OlivierDehaene in #316
  • fix(candle): fix last token pooling

Full Changelog: v1.3.0...v1.4.0

v1.3.0

28 Jun 11:37
6c6cd93

Notable Changes

  • New truncation direction parameter
  • CUDA support for JinaCode model architecture
  • CUDA support for Mistral model architecture
  • CUDA support for Alibaba GTE model architecture
  • New prompt name parameter: you can now add a prompt name to the body of your request to prepend a pre-prompt to your input, based on the Sentence Transformers configuration. You can also set a default prompt or default prompt name so that a pre-prompt is always added to your requests.
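
To make the two new request parameters concrete, here is a minimal sketch of a helper that assembles an embedding request body. The field names `prompt_name`, `truncate`, and `truncation_direction`, the values `"Left"` and `"Right"`, and the helper itself are assumptions for illustration; these release notes do not spell out the exact schema.

```python
# Assumption: these are the accepted spellings for the direction value.
ALLOWED_DIRECTIONS = ("Left", "Right")


def build_embed_payload(inputs, prompt_name=None, truncation_direction=None):
    """Return a request body (as a dict) for an embedding call.

    Optional fields are only included when the caller sets them, so the
    server-side defaults apply otherwise.
    """
    payload = {"inputs": inputs}
    if prompt_name is not None:
        # Assumed field: name of a prompt from the model's
        # Sentence Transformers configuration.
        payload["prompt_name"] = prompt_name
    if truncation_direction is not None:
        if truncation_direction not in ALLOWED_DIRECTIONS:
            raise ValueError(
                f"truncation_direction must be one of {ALLOWED_DIRECTIONS}"
            )
        # A direction only matters when truncation is enabled.
        payload["truncate"] = True
        payload["truncation_direction"] = truncation_direction
    return payload
```

For example, `build_embed_payload("What is Deep Learning?", prompt_name="query")` yields a body that asks the server to apply the prompt registered under the (hypothetical) name `query`.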

Full Changelog: v1.2.3...v1.3.0

v1.2.3

25 Apr 08:48

Full Changelog: v1.2.2...v1.2.3

v1.2.2

16 Apr 14:48

What's Changed

  • fix(gke): accept null values for vertex env vars by @OlivierDehaene in #243
  • fix: fix cpu image to not default on the sagemaker entrypoint

Full Changelog: v1.2.1...v1.2.2

v1.2.1

15 Apr 16:58

TEI is now Apache 2.0!

Full Changelog: v1.2.0...v1.2.1

v1.2.0

22 Mar 16:36
3edace2

Full Changelog: v1.1.0...v1.2.0

v1.1.0

01 Mar 17:06

Highlights

  • Splade pooling

Full Changelog: v1.0.0...v1.1.0

v1.0.0

23 Feb 16:43
41b692d

Highlights

  • Support for Nomic models
  • Support for Flash Attention for Jina models
  • Metal backend for Apple M-series (M*) users
  • /tokenize route to directly access the internal TEI tokenizer
  • /embed_all route to allow client level pooling
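
Client-level pooling means the caller, not the server, reduces per-token vectors to a single sentence vector. The sketch below shows one common choice, mean pooling, in plain Python. It assumes /embed_all returns one embedding vector per token for each input; that response shape and the `mean_pool` helper are assumptions for illustration, not details taken from these release notes.

```python
def mean_pool(token_embeddings):
    """Average a list of per-token vectors into one sentence vector.

    `token_embeddings` is a list of equal-length lists of floats, i.e. the
    assumed per-input shape of an /embed_all response.
    """
    if not token_embeddings:
        raise ValueError("need at least one token embedding")
    dim = len(token_embeddings[0])
    if any(len(vector) != dim for vector in token_embeddings):
        raise ValueError("all token embeddings must have the same length")
    count = len(token_embeddings)
    # Average each dimension across all tokens.
    return [
        sum(vector[i] for vector in token_embeddings) / count
        for i in range(dim)
    ]
```

Other strategies (for example max pooling or weighting tokens before averaging) fit the same pattern: fetch per-token vectors once, then reduce them however the application needs.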

Full Changelog: v0.6.0...v1.0.0

v0.6.0

30 Nov 14:28

Full Changelog: v0.5.0...v0.6.0