This repository has been archived by the owner on Jul 21, 2024. It is now read-only.

SwiftRank v0.1.1

synacktraa released this 17 Dec 15:42


3e25044

Initial Release

Features

🌟 Light Weight:

No Torch or Transformers required: runs on CPU alone.
Boasts the tiniest reranking model in the world, ~4 MB.

⚡ Ultra Fast:

Reranking speed depends on the total number of tokens in the contexts and queries, and on the depth of the model (number of layers).
For illustration, the following test shows the time taken with the default model:

🎯 Based on SoTA Cross-encoders and other models:

How good are Zero-shot rerankers? => Reference.
Supported models:
- ms-marco-TinyBERT-L-2-v2 (default)
- ms-marco-MiniLM-L-12-v2
- ms-marco-MultiBERT-L-12 (multilingual, supports 100+ languages)
- rank-T5-flan (best non-cross-encoder reranker)
Why only sleeker models? Reranking is the final leg of larger retrieval pipelines, so the idea is to avoid any extra overhead, especially in user-facing scenarios. To that end, the chosen models have a really small footprint, need no specialised hardware, and still offer competitive performance. Feel free to raise an issue to request support for new models as you see fit.

🔧 Versatile Configuration:

Implements a structured pipeline for the reranking process. The pipeline is created by passing it Ranker and Tokenizer instances.
Supports handling of complex dictionary objects.
Includes a customizable threshold parameter to filter contexts, so that only those with a score equal to or exceeding the threshold are selected.
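
The threshold filtering and dictionary handling described above can be sketched in plain Python. This is a minimal illustration, not SwiftRank's actual API: the names `rerank`, `overlap_score`, and the `key` parameter are hypothetical, and the toy word-overlap scorer only stands in for a real cross-encoder model.

```python
from typing import Any, Callable, Iterable, List, Optional


def overlap_score(query: str, text: str) -> float:
    """Toy stand-in for a reranking model: fraction of query words found in text."""
    q_words = set(query.lower().split())
    t_words = set(text.lower().split())
    return len(q_words & t_words) / len(q_words) if q_words else 0.0


def rerank(
    query: str,
    contexts: Iterable[Any],
    score_fn: Callable[[str, str], float],
    key: Callable[[Any], str] = str,
    threshold: Optional[float] = None,
) -> List[Any]:
    """Score contexts against the query, sort best-first, then apply the threshold.

    `key` (hypothetical) extracts the text to score from each context, so that
    dictionary objects can be passed in and returned unchanged.
    """
    scored = [(score_fn(query, key(ctx)), ctx) for ctx in contexts]
    # Highest score first; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    if threshold is not None:
        # Keep only contexts whose score is equal to or above the threshold.
        scored = [pair for pair in scored if pair[0] >= threshold]
    return [ctx for _, ctx in scored]


docs = [
    {"id": 1, "text": "Jujutsu Kaisen season 2 release date"},
    {"id": 2, "text": "Cooking pasta at home"},
    {"id": 3, "text": "Jujutsu Kaisen manga chapters"},
]

all_ranked = rerank("jujutsu kaisen season 2", docs, overlap_score,
                    key=lambda d: d["text"])
filtered = rerank("jujutsu kaisen season 2", docs, overlap_score,
                  key=lambda d: d["text"], threshold=0.5)

print([d["id"] for d in all_ranked])  # [1, 3, 2]
print([d["id"] for d in filtered])    # [1, 3]
```

With a threshold of 0.5, the unrelated context (id 2, score 0.0) is dropped, while the context scoring exactly 0.5 (id 3) is kept, because the comparison is inclusive.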

⌨️ Terminal Integration:

Pipe your output into the swiftrank CLI tool to get reranked output.
