Add release blog for 0.8 (#723)
Co-authored-by: Robbe Sneyders <[email protected]>
GeorgesLorre and RobbeSneyders authored Dec 13, 2023
1 parent f7d3c03 commit f2b1b01
Showing 7 changed files with 244 additions and 4 deletions.
Binary file added docs/art/announcements/RAG.png
Binary file added docs/art/data_explorer/explorer_document.png
Binary file added docs/art/runners/sagemaker_run.png
4 changes: 4 additions & 0 deletions docs/blog/.authors.yml
@@ -7,3 +7,7 @@ authors:
name: Matthias Richter
description: ML Engineer
avatar: https://avatars.githubusercontent.com/u/15777729
GeorgesLorre:
name: Georges Lorré
description: Data Engineer
avatar: https://avatars.githubusercontent.com/u/35808396
235 changes: 235 additions & 0 deletions docs/blog/posts/2023-12-13|Fondant_0.8_Interface.md
@@ -0,0 +1,235 @@
---
date:
created: 2023-12-13
authors:
- GeorgesLorre
- RobbeSneyders
---

# Fondant 0.8: Simplification, Sagemaker, RAG, and more!

Hi all, we released Fondant 0.8, which brings some major new features and improvements:

* 📝 We simplified and improved the way datasets are stored and accessed
* 🚀 The interface to compose a Fondant pipeline is now simpler and more powerful
* 🌐 AWS SageMaker is now supported as an execution framework for Fondant pipelines
* 🔍 The Fondant explorer was improved, especially for text and document data
* 📚 We released a RAG tuning repository powered by Fondant

Read on for more details!

<!-- more -->

## 📝 We simplified and improved the way datasets are stored and accessed

We listened to all your feedback and drastically simplified Fondant datasets, while solving some
longstanding issues as part of the design.

Most importantly, we flattened the datasets and removed the concept of `subsets` from
Fondant, which means you can now access the data fields directly!

<table>
<tr>
<th>Previous</th>
<th>New ✨</th>
</tr>
<tr>
<td>

```yaml title="fondant_component.yaml"
consumes:
  images:
    fields:
      height:
        type: int32
      width:
        type: int32
```
</td>
<td>

```yaml title="fondant_component.yaml"
consumes:
  height:
    type: int32
  width:
    type: int32
```
</td>
</tr>
<tr>
<td>

```python title="src/main.py"
import pandas as pd
from fondant.component import PandasTransformComponent


class ExampleComponent(PandasTransformComponent):

    def transform(self, dataframe: pd.DataFrame):
        height = dataframe["images"]["height"]
        width = dataframe["images"]["width"]
        ...
```
</td>
<td>

```python title="src/main.py"
import pandas as pd
from fondant.component import PandasTransformComponent


class ExampleComponent(PandasTransformComponent):

    def transform(self, dataframe: pd.DataFrame):
        height = dataframe["height"]
        width = dataframe["width"]
        ...
```

</td>
</tr>
</table>
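
To make the flattening concrete, here is a minimal sketch of how an old, subset-based `consumes` section maps onto the new flat layout. The `flatten_spec` helper is hypothetical and not part of Fondant; it only assumes the spec has been loaded into a plain dictionary shaped like the YAML above.

```python
def flatten_spec(section: dict) -> dict:
    """Flatten {subset: {"fields": {field: spec}}} into {field: spec}.

    Hypothetical migration helper, not a Fondant API.
    """
    flat = {}
    for subset_name, subset in section.items():
        for field_name, field_spec in subset["fields"].items():
            if field_name in flat:
                # Two subsets used the same field name; renaming is left to the user.
                raise ValueError(
                    f"Field '{field_name}' from subset '{subset_name}' "
                    "clashes with a field from another subset"
                )
            flat[field_name] = field_spec
    return flat


# The "Previous" spec from the table above, as a dictionary.
old_consumes = {
    "images": {
        "fields": {
            "height": {"type": "int32"},
            "width": {"type": "int32"},
        }
    }
}

new_consumes = flatten_spec(old_consumes)
print(new_consumes)
```

The helper raises on name clashes rather than guessing a new name, since fields from different subsets may need deliberate renaming once they share a single namespace.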

## 🚀 The interface to compose a Fondant pipeline is now simpler and more powerful

You can now chain components together using the `read()`, `apply()` and `write()` methods. This removes
the need to specify dependencies separately, making composing pipelines a breeze.

<table>
<tr>
<th>Previous</th>
<th>New ✨</th>
</tr>
<tr>
<td>

```python title="pipeline.py"
from fondant.pipeline import ComponentOp, Pipeline

pipeline = Pipeline(
    pipeline_name="my-pipeline",
    base_path="./data",
)

load_from_hf_hub = ComponentOp.from_registry(
    name="load_from_hf_hub",
    arguments={
        "dataset_name": "fondant-ai/fondant-cc-25m",
    },
)

download_images = ComponentOp.from_registry(
    name="download_images",
    arguments={"resize_mode": "no"},
)

pipeline.add_op(load_from_hf_hub)
pipeline.add_op(
    download_images,
    dependencies=[load_from_hf_hub],
)
```

</td>
<td width="50%">

```python title="pipeline.py"
import pyarrow as pa
from fondant.pipeline import Pipeline

pipeline = Pipeline(
    name="my-pipeline",
    base_path="./data",
)

raw_data = pipeline.read(
    "load_from_hf_hub",
    arguments={
        "dataset_name": "fondant-ai/fondant-cc-25m",
    },
    produces={
        "alt_text": pa.string(),
        "image_url": pa.string(),
        "license_type": pa.string(),
    },
)

images = raw_data.apply(
    "download_images",
    arguments={"resize_mode": "no"},
)
```

</td>
</tr>
</table>
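
To illustrate why chaining makes explicit dependencies unnecessary, here is a toy model of the pattern in plain Python. It is a sketch and not Fondant's implementation: `ToyPipeline` and `ToyDataset` are made-up classes, and the `write_to_hf_hub` name is used only as an illustrative final step. Each call records which operation produced the dataset it was called on, so the dependency graph falls out of the call chain.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPipeline:
    name: str
    # Maps each operation name to the names of the operations it depends on.
    graph: dict = field(default_factory=dict)

    def read(self, component: str) -> "ToyDataset":
        # A read step starts the chain, so it has no dependencies.
        self.graph[component] = []
        return ToyDataset(self, component)


@dataclass
class ToyDataset:
    pipeline: ToyPipeline
    produced_by: str

    def apply(self, component: str) -> "ToyDataset":
        # The new step depends on whichever step produced this dataset.
        self.pipeline.graph[component] = [self.produced_by]
        return ToyDataset(self.pipeline, component)

    def write(self, component: str) -> None:
        # A write step ends the chain and returns nothing to chain on.
        self.pipeline.graph[component] = [self.produced_by]


pipeline = ToyPipeline(name="my-pipeline")
raw_data = pipeline.read("load_from_hf_hub")
images = raw_data.apply("download_images")
images.write("write_to_hf_hub")
print(pipeline.graph)
```

Because every dataset remembers its producer, the `dependencies=[...]` argument from the previous interface has nothing left to express.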

Some of the benefits of this new interface are:

- Support for overriding the produces and consumes of a component, allowing you to easily change the output of a component without having to create a custom `fondant_component.yaml` file.
- We unlock the future ability to enable eager execution of components and interactive
development of pipelines. Keep an eye on our next releases!

If you want to know more or get started, check out the [documentation](https://fondant.ai/en/latest/pipeline/).

## 🌐 AWS SageMaker is now supported as an execution framework for Fondant pipelines

You can now easily run your Fondant pipelines on AWS SageMaker using the `fondant run sagemaker <pipeline.py>` command. Run `fondant run sagemaker --help` to see the possible configuration options or check out the [documentation](https://fondant.ai/en/latest/runners/sagemaker/).

![Sagemaker pipeline](../../art/runners/sagemaker_run.png)


## 🔍 Fondant explorer improvements

We added a lot of improvements to the Fondant explorer, including:

- A pipeline overview showing the data flow through the pipeline
- A document viewer to inspect data (handy for RAG use cases)
- Better filtering, sorting and searching of data while exploring

![General overview data explorer](../../art/data_explorer/general_overview.png)
![Document view data explorer](../../art/data_explorer/explorer_document.png)

To get started with the Fondant explorer, check out the [documentation](https://fondant.ai/en/latest/data_explorer/).


## 📚 We released a RAG tuning repository powered by Fondant

This repository helps you tune your RAG system faster and achieve better performance using
Fondant. Find the repository including a full explanation [here](https://github.com/ml6team/fondant-usecase-RAG).

![RAG tuning](../../art/announcements/RAG.png)

It includes:

- A Fondant pipeline to ingest the data
- A Fondant pipeline to evaluate the data
- Multiple notebooks to go from a basic RAG pipeline to fully auto-tuned RAG pipelines

## 🔧 New reusable RAG components

A lot of new reusable components were added to the Fondant registry, letting you build new RAG
pipelines quickly!

- Weaviate [indexing](https://github.com/ml6team/fondant/tree/main/components/index_weaviate) and [retrieval](https://github.com/ml6team/fondant/tree/main/components/retrieve_from_weaviate) components
- Qdrant [indexing](https://github.com/ml6team/fondant/blob/main/components/index_qdrant/README.md)
- Ragas [evaluation](https://github.com/ml6team/fondant/blob/main/components/evaluate_ragas/README.md)
- LlamaHub [loading](https://github.com/ml6team/fondant/tree/main/components/load_with_llamahub)
- LangChain [chunking](https://github.com/ml6team/fondant/tree/main/components/chunk_text) and
[embedding](https://github.com/ml6team/fondant/tree/main/components/embed_text)

You can see some of these components in action in the
[RAG tuning repository](https://github.com/ml6team/fondant-usecase-RAG).

## 🛠 Install it now!

```bash
pip install fondant==0.8.0
```
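
After installing, you may want to confirm which version ended up in your environment. The snippet below is a small sketch using only the Python standard library; `installed_version` is a hypothetical helper name, not something Fondant provides.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed Fondant version, or None if Fondant is not installed.
print(installed_version("fondant"))
```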

And let us know what you think!
5 changes: 3 additions & 2 deletions docs/overrides/main.html
@@ -2,8 +2,9 @@

{% block announce %}
<p style="text-align: center">
- 🌀 You can now run your Fondant pipelines on Vertex AI!
- <a href="https://fondant.ai/en/latest/blog/2023/10/20/fondant-06-brings-vertex-ai-support-and-more/"
+ Fondant 0.8 is out: Simplification, Sagemaker, RAG, and more!
+
+ <a href="https://fondant.ai/en/latest/blog/2023/12/13/fondant-08-simplification-sagemaker-rag-and-more/"
style="color: white; text-decoration: underline">Read more</a>
</p>
{% endblock %}
4 changes: 2 additions & 2 deletions docs/runners/vertex.md
@@ -29,7 +29,7 @@ info [here](https://codelabs.developers.google.com/vertex-pipelines-intro#2)
```bash
fondant run vertex <pipeline_ref> \
--project-id $PROJECT_ID \
-    --project-region $PROJECT_REGION \
+    --region $PROJECT_REGION \
--service-account $SERVICE_ACCOUNT
```

@@ -52,7 +52,7 @@ info [here](https://codelabs.developers.google.com/vertex-pipelines-intro#2)

runner = VertexRunner(
project_id=project_id,
-    project_region=project_region,
+    region=project_region,
service_account=service_account)
)
runner.run(input_spec=<path_to_compiled_spec>)