Release v7.4.0 · neuml/txtai

This release adds the SQLite ANN, new text extraction features and a programming-language-neutral embeddings index format.

See below for full details on the new features, improvements and bug fixes.

New Features

Add SQLite ANN (#780)
Enhance markdown support for Textractor (#758)
Update txtai index format to remove Python-specific serialization (#769)
Add new functionality to RAG application (#753)
Add bm25s library to benchmarks (#757) Thank you @a0346f102085fe9f!
Add serialization package for handling supported data serialization methods (#770)
Add MessagePack serialization as a top level dependency (#771)
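
As a rough illustration of what MessagePack serialization looks like, the sketch below round-trips a small list of ids with the `msgpack` package. The `save_ids` and `load_ids` helpers and the sample ids are hypothetical and are not part of txtai's serialization package, whose interface is not described in these notes.

```python
import msgpack

# Hypothetical index ids; how txtai lays out its stored ids is an assumption here
ids = ["doc-1", "doc-2", 42]

def save_ids(values):
    """Serialize a list of ids to MessagePack bytes."""
    return msgpack.packb(values, use_bin_type=True)

def load_ids(data):
    """Deserialize MessagePack bytes back into a Python list."""
    return msgpack.unpackb(data, raw=False)

packed = save_ids(ids)
restored = load_ids(packed)
print(restored)
```

Because MessagePack is a language-independent binary format, bytes written this way can in principle be read by MessagePack libraries in other languages.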

Improvements

Support <pre> blocks with Textractor (#749)
Update HF LLM to reduce noisy warnings (#752)
Update NLTK model downloads (#760)
Refactor benchmarks script (#761)
Update documentation to use base imports (#765)
Update examples to use RAG pipeline instead of Extractor when paired with LLMs (#766)
Modify NumPy and Torch ANN components to use np.load/np.save (#772)
Persist Embeddings index ids (only used when content storage is disabled) with MessagePack (#773)
Persist Reducer component with skops library (#774)
Persist NetworkX graph component with MessagePack (#775)
Persist Scoring component metadata with MessagePack (#776)
Modify vector transforms to load/save data using np.load/np.save (#777)
Refactor embeddings configuration into separate component (#778)
Document txtai index format (#779)
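
Several of the items above move array storage to `np.save`/`np.load`. A minimal sketch of that pattern follows, with pickling explicitly disabled so the data is written in NumPy's plain `.npy` format. The array contents and the in-memory buffer are illustrative assumptions; the actual txtai components may call these functions differently.

```python
import io

import numpy as np

# Hypothetical embeddings matrix: 4 vectors with 8 dimensions each
vectors = np.random.default_rng(0).random((4, 8), dtype=np.float32)

# Write the array in .npy format; allow_pickle=False refuses object arrays
buffer = io.BytesIO()
np.save(buffer, vectors, allow_pickle=False)

# Rewind and read the array back, again without permitting pickled data
buffer.seek(0)
loaded = np.load(buffer, allow_pickle=False)

print(loaded.shape, loaded.dtype)
```

The same calls accept a file path in place of the buffer.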

Bug Fixes

Translation: AttributeError: 'ModelInfo' object has no attribute 'modelId' (#750)
Change RAGTask to RagTask (#763)
Notebook 42 error (#768)