Releases: ml6team/fondant
0.8.dev6
What's Changed
- Add load_with_llamahub component by @RobbeSneyders in #719
- Update hub links in documentation after adding new components by @RobbeSneyders in #720
- Fix data io for generic PandasTransformComponent by @RobbeSneyders in #721
- Fix environment variable setting in embed_text component by @Hakimovich99 in #722
Full Changelog: 0.8.dev5...0.8.dev6
0.8.dev5
What's Changed
- Add compile and run interface to kfp based runners by @GeorgesLorre in #704
- Fix Laion prompt component indexing by @GeorgesLorre in #711
- Add support for resources with Sagemaker by @GeorgesLorre in #710
- Update sagemaker docs by @GeorgesLorre in #701
- Move to modules instead of classes for home page and common interface by @RobbeSneyders in #717
- Reactivate readme generation pre-commit by @RobbeSneyders in #718
- Add load_from_csv component by @Hakimovich99 in #713
- Add retrieve_from_weaviate component by @Hakimovich99 in #714
- Fix/sagemaker script generator by @GeorgesLorre in #712
- Add evaluate_ragas component by @Hakimovich99 in #715
Full Changelog: 0.8.dev4...0.8.dev5
0.8.dev4
What's Changed
- Add functionality for pullthrough cache rule creation and URI patching by @GeorgesLorre in #697
- Support pipeline factory functions as CLI reference by @RobbeSneyders in #699
- Add logic to handle custom components by @GeorgesLorre in #700
- Move to datasets & apply interface by @RobbeSneyders in #685
Full Changelog: 0.8.dev3...0.8.dev4
0.8.dev3
What's Changed
- Explorer search by @PhilippeMoussalli in #691
- Feature/build 2 ecr by @GeorgesLorre in #686
- bugfix old getting started link that had 404 error by @NSFF in #694
- Explorer improve filtering of available runs by @PhilippeMoussalli in #693
- Set default explorer version in python sdk by @RobbeSneyders in #692
- Compile absolute path for custom components by @RobbeSneyders in #696
- Build to AWS ECR on release by @RobbeSneyders in #698
Full Changelog: 0.8.dev2...0.8.dev3
0.8.dev2
What's Changed
- Fix column names in chunk_text component by @RobbeSneyders in #676
- Set default explorer version to current Fondant version by @RobbeSneyders in #681
- Augment SagemakerRunner to support running from pipeline objects by @GeorgesLorre in #678
- Hide partitions from users by @PhilippeMoussalli in #677
- Explorer new dataset format by @PhilippeMoussalli in #682
- Use cleaner field names in reusable components by @RobbeSneyders in #679
- Add cli commands for sagemaker by @GeorgesLorre in #680
Full Changelog: 0.8.dev1...0.8.dev2
0.8.dev1
What's Changed
- Fix output dataframe path by @RobbeSneyders in #675
Full Changelog: 0.8.dev0...0.8.dev1
0.8.dev0
What's Changed
- Update fondant_component.yaml by @Hakimovich99 in #647
- feat: Qdrant support by @Anush008 in #646
- Feature/sagemaker compiler by @GeorgesLorre in #662
- Restructure data explorer by @PhilippeMoussalli in #657
- Feature/sagemaker runner by @GeorgesLorre in #664
- Add document viewer to dataset explorer by @PhilippeMoussalli in #666
- Fix cli creds by @PhilippeMoussalli in #669
- Redesign dataset format by @RobbeSneyders in #672
- Explorer front page by @PhilippeMoussalli in #671
- Regenerate qdrant readme by @RobbeSneyders in #673
- Augment DockerRunner to support running from a fondant Pipeline by @GeorgesLorre in #651
- Update tag pattern in prep-release pipeline to match dev versions by @RobbeSneyders in #674
Full Changelog: 0.7.0...0.8.dev0
0.7.0
Highlights
- We restructured and updated our documentation, which should make it easier to get started and to learn more advanced concepts as you go.
  👉 Check it out at fondant.ai!
- We moved our example pipelines into separate repositories, which will make it easier to get started with them:
  - 📖 RAG ingestion pipeline
    An end-to-end Fondant pipeline that prepares documents for a RAG (Retrieval Augmented Generation) system by chunking and embedding them, and writing them to a vector store.
  - 🛋️ ControlNet Interior Design Pipeline
    An end-to-end Fondant pipeline to collect and process data for the fine-tuning of a ControlNet model, focusing on images related to interior design.
  - 🖼️ Filter Creative Commons license images
    An end-to-end Fondant pipeline that starts from our Fondant-CC-25M Creative Commons image dataset and filters and downloads the desired images.
  - 🔢 Datacomp pipeline
    An end-to-end Fondant pipeline filtering image-text data to train a CLIP model for the DataComp competition.
- We split our component and pipeline SDK, so only the dependencies that are actually required are installed.
  - In components, install the `component` extra so you can use the `fondant.component` SDK: `pip install fondant[component]`
  - Locally, just install Fondant without extras to use the `fondant.pipeline` SDK and CLI. For the local runner: `pip install fondant`. Or, with the appropriate extra for your specific runner: `pip install fondant[vertex]`
  See the installation documentation for more info on the available installation options.
All changes
- bugfix typo in text_normalization by @andres-vv in #542
- Fondant build - add test case and documentation by @mrchtr in #546
- Add search bar to the documentation by @mrchtr in #545
- Fix cli error propagation by @PhilippeMoussalli in #544
- Remove Starcoder pipeline by @PhilippeMoussalli in #552
- Simplify cloud credentials mounting by @PhilippeMoussalli in #548
- Bump gcsfs by @PhilippeMoussalli in #553
- Remove starcoder pipeline reference from README.md by @mrchtr in #556
- Set mkdocs site_url to fix 404 page by @RobbeSneyders in #559
- Add screenshots data explorer by @mrchtr in #555
- Split component and pipeline SDKs by @RobbeSneyders in #587
- Datacomp submission improvement by @PhilippeMoussalli in #586
- Update readme generator reference to new sdk by @PhilippeMoussalli in #591
- Add componentOp warning for unused configuration by @PhilippeMoussalli in #551
- Add component install extra and update others by @RobbeSneyders in #592
- Fixing pre-commit building component READMEs by @mrchtr in #596
- Granular compiler tests by @PhilippeMoussalli in #589
- Fix pipeline label by @PhilippeMoussalli in #606
- Fix naming error by @PhilippeMoussalli in #607
- Remove example pipelines by @mrchtr in #588
- Add simple pipeline and integration test for the LocalRunner by @mrchtr in #594
- Delete datacomp pipeline by @mrchtr in #612
- Restructure documentation by @PhilippeMoussalli in #597
- Add more info about caching by @PhilippeMoussalli in #615
- Update langchain version by @mrchtr in #617
- Update hub by @PhilippeMoussalli in #609
- Add Content Tab to document python and Console SDK by @PhilippeMoussalli in #613
- Docs site improvements by @RobbeSneyders in #620
- Expand docker installation guide by @PhilippeMoussalli in #619
- publishing Components by @PhilippeMoussalli in #621
- Update README by @RobbeSneyders in #622
- Use names instead of directory names in hub by @RobbeSneyders in #626
- Add label argument to fondant build by @mrchtr in #623
- Update name by @PhilippeMoussalli in #630
- Remove faq by @PhilippeMoussalli in #629
- Small formatting fixes by @PhilippeMoussalli in #614
- Add architecture description and plot by @PhilippeMoussalli in #628
- Skip followed imports for referencing.jsonschema by @RobbeSneyders in #644
- Add authentification arg to run command by @PhilippeMoussalli in #645
- Small doc fixes by @PhilippeMoussalli in #648
- Remove explicit fsspec requirements from data explorer by @RobbeSneyders in #650
New Contributors
- @andres-vv made their first contribution in #542
Full Changelog: 0.6.2...0.7.0
0.6.2
Make the Docker connection during the `fondant build` command more robust on Mac.
Full Changelog: 0.6.1...0.6.2
0.6.1
This is a re-packaged release of version 0.6.0.
Version 0.6.0 packaged an older commit due to a bug in our release system.
Highlights
- Vertex AI is now supported as a backend for pipeline execution.
  Simply run `fondant run vertex <pipeline.py>` to submit your pipeline.
  Run `fondant run vertex --help` to see the possible configuration options.
- The reusable components are now available on DockerHub under the `fndnt` organization.
  DockerHub is supported more broadly than the GitHub container registry, which we were using before.
- Previously executed components are now cached when re-executed with the same arguments.
  - This makes it easier to iterate on the development of downstream components
  - This allows you to resume failed pipelines from their failed step
- Added the `fondant build` command, which lets you build Fondant components easily.
  Run `fondant build <component_dir>`. Check `fondant build -h` for options.
  The command will also update the image reference in the `fondant_component.yaml` to the newly built one.
- We migrated from KfP v1 to KfP v2. This means:
  - We now benefit from the latest KfP developments
  - We compile Fondant pipelines to the IR YAML format, which is supported by other execution engines such as Vertex
  - You need a KfP v2 cluster to run Fondant pipelines
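To illustrate the caching highlight above, here is a minimal sketch of argument-keyed caching. It is not Fondant's actual implementation: `cache_key`, `ComponentCache`, and the `chunk_size` argument are hypothetical names chosen for this example, and the cache lives in memory only. The idea is that a component's name and arguments are hashed into a deterministic key, and the component is executed only when no output is stored under that key.

```python
import hashlib
import json


def cache_key(component_name: str, arguments: dict) -> str:
    """Derive a deterministic key from a component's name and arguments."""
    # sort_keys makes the key independent of argument ordering.
    payload = json.dumps(
        {"component": component_name, "arguments": arguments},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ComponentCache:
    """Hypothetical in-memory stand-in for a cache of component outputs."""

    def __init__(self) -> None:
        self._outputs = {}

    def run(self, component_name: str, arguments: dict, execute):
        """Execute the component only if this exact call has no cached output."""
        key = cache_key(component_name, arguments)
        if key not in self._outputs:
            self._outputs[key] = execute(**arguments)
        return self._outputs[key]


def chunk(chunk_size: int) -> str:
    # Placeholder for real component work.
    return f"chunks of {chunk_size} characters"


cache = ComponentCache()
cache.run("chunk_text", {"chunk_size": 512}, chunk)  # executes
cache.run("chunk_text", {"chunk_size": 512}, chunk)  # same arguments: cached
cache.run("chunk_text", {"chunk_size": 256}, chunk)  # new arguments: executes
```

Under this scheme, re-running a pipeline after a failure would skip every step whose key already has a stored output, which is what makes resuming from the failed step possible.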
Fixes
- Fix data explorer for usage on Windows
- Fix propagation of `client_kwargs` argument to configure Dask Client
Components
- Every reusable component now has a clear README describing its usage
- Add `load_from_parquet` component to load parquet files as input data
- Add `embed_text` component to embed documents and other text
- Add `chunk_text` component to chunk documents into passages
- Add `index_weaviate` component to index data in a weaviate vector store
- Fix issue with mixed type ids in LAION retrieval components
- Improve success rate of `download_images` component
- Fix OOM issues for inference components using GPU
- Limit data read by `load_from_hub` component to used columns
Detailed changes
- Add contribution segment by @GeorgesLorre in #463
- Update sample pipeline by @mrchtr in #464
- Update project description by @RobbeSneyders in #465
- Disable caching in the image retrieval sample pipeline by @mrchtr in #467
- Improve download images logs by @PhilippeMoussalli in #466
- Add CC-25M announcement to docs by @RobbeSneyders in #468
- Update release announcements by @mrchtr in #471
- Add dataset link to press release by @mrchtr in #472
- Create load from parquet by @PhilippeMoussalli in #474
- Fix caching writes by @PhilippeMoussalli in #469
- Add caching dependency by @PhilippeMoussalli in #479
- Add memory request and limit to components by @PhilippeMoussalli in #482
- Improve hit rate of download images component by @RobbeSneyders in #470
- Cast id to string laion by @PhilippeMoussalli in #485
- Bugfix partitioning by @PhilippeMoussalli in #478
- Generate READMEs for all components using a script by @RobbeSneyders in #484
- Add component hub doc page by @RobbeSneyders in #487
- explorer small fix by @Hakimovich99 in #481
- Optimize GPU components by @PhilippeMoussalli in #489
- Update Pillow to 10.0.1 to fix security issues by @RobbeSneyders in #493
- Update documentation regarding feedback by @mrchtr in #473
- Restructure-cli by @PhilippeMoussalli in #488
- Add empty requirements.txt to load_from_parquet component by @RobbeSneyders in #504
- Use s3 client instead of http to access common crawl by @mrchtr in #501
- Fix run CLI by @RobbeSneyders in #507
- Migrate to KfpV2 by @GeorgesLorre in #477
- Remove abstract component test by @mrchtr in #510
- Only keep columns in produces by @PhilippeMoussalli in #490
- Run black on components in pre-commit by @RobbeSneyders in #511
- Run bandit on components by @RobbeSneyders in #513
- Move container registry to DockerHub by @RobbeSneyders in #514
- Update component docs by @PhilippeMoussalli in #516
- Vertex cli by @PhilippeMoussalli in #519
- Refactor compile method for kfp and vertex by @PhilippeMoussalli in #522
- Modify arg default by @PhilippeMoussalli in #524
- Propagate `client_kwargs` argument and lower extract_images python version by @RobbeSneyders in #525
- Revert fsspec changes by @mrchtr in #523
- Add resource limits for Vertex by @RobbeSneyders in #529
- Update vertex and general docs by @PhilippeMoussalli in #526
- Component/generate embeddings by @tillwenke in #520
- Add fondant build command by @RobbeSneyders in #527
- Fix explorer build script for DockerHub by @RobbeSneyders in #531
- Chunker component by @PhilippeMoussalli in #528
- Update text embedding component by @PhilippeMoussalli in #532
- Add IndexWeaviate component by @tillwenke in #521
- Build command: raise errors when pushing and make tag optional by @RobbeSneyders in #533
- Update component readmes by @RobbeSneyders in #538
- Add network argument to vertex runner by @RobbeSneyders in #537
New Contributors
- @Hakimovich99 made their first contribution in #481
Full Changelog: 0.5.0...0.6.1