Add dask and docker as default dependencies (#893)
Fixes:
- #891

 Fixes (maybe needs more work):
- #877
GeorgesLorre authored Mar 6, 2024
1 parent 1ff9b1a commit f2da61b
Showing 32 changed files with 43 additions and 76 deletions.
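
Since this commit makes `dask` and `docker` regular dependencies instead of extras, a plain `pip install fondant` should be enough for both to import. A minimal sketch for checking that in a given environment, using only the standard library. The helper name `missing_modules` is hypothetical and not part of Fondant, and the sketch assumes the packages import under the names `dask` and `docker`:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: these are the import names of the two new default dependencies.
    missing = missing_modules(["dask", "docker"])
    print("missing:", missing or "none")
```

An empty result means both modules are importable. Any name it returns points at an environment that was installed from a version predating this change, or one where the install did not complete.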
7 changes: 1 addition & 6 deletions README.md
@@ -114,12 +114,7 @@ pip install fondant
```

Fondant also includes extra dependencies for specific runners, storage integrations and publishing
-components to registries.
-We can install the local runner to enable local pipeline execution:
-
-```
-pip install fondant[docker]
-```
+components to registries. The dependencies for the local runner (docker) is included by default.

For more detailed installation options, check the [**installation page**](https://fondant.ai/en/latest/guides/installation/)on our documentation.

8 changes: 1 addition & 7 deletions docs/guides/installation.md
@@ -51,13 +51,7 @@ Fondant also includes extra dependencies for specific runners, storage integrati

### Publishing components dependencies

-For publishing components to registries:
-
-```bash
-pip install fondant[docker]
-```
-
-Check out the [guide](../components/publishing_components.md) on publishing components to registries.
+For publishing components to registries check out the [guide](../components/publishing_components.md) on publishing components to registries.

### Runner specific dependencies

@@ -11,7 +11,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt

# Install fondant
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component/src
2 changes: 1 addition & 1 deletion images/Dockerfile
@@ -8,5 +8,5 @@ RUN apt-get update && \

# Install Fondant
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

13 changes: 6 additions & 7 deletions pyproject.toml
@@ -49,33 +49,32 @@ jsonschema = ">= 4.18"
pyarrow = ">= 11.0.0"
pyyaml = ">= 5.3.1"

-dask = { version = ">= 2023.4.1", extras = ["dataframe", "distributed", "diagnostics"], optional = true }
+dask = { version = ">= 2023.4.1", extras = ["dataframe", "distributed", "diagnostics"]}
+docker = ">= 6.1.3"

dask-cuda = { version = ">=23.4.1", optional = true }

gcsfs = { version = ">= 2023.10.0", optional = true }
s3fs = { version = ">= 2023.4.0", optional = true }
adlfs = { version = ">= 2023.4.0", optional = true }

-docker = {version = ">= 6.1.3", optional = true }
kfp = { version = "2.6.0", optional = true, extras =["kubernetes"] }
google-cloud-aiplatform = { version = "1.34.0", optional = true}
sagemaker = {version = ">= 2.197.0", optional = true}
boto3 = {version = "1.28.64", optional = true}

[tool.poetry.extras]
-component = ["dask"]
gpu = ["dask-cuda"]

aws = ["s3fs"]
azure = ["adlfs"]
gcp = ["gcsfs"]

-kfp = ["docker", "kfp"]
-vertex = ["docker", "kfp", "google-cloud-aiplatform"]
+kfp = ["kfp"]
+vertex = ["kfp", "google-cloud-aiplatform"]
sagemaker = ["sagemaker", "boto3"]
-docker = ["docker"]

-all = ["dask", "dask-cuda", "s3fs", "adlfs", "gcsfs", "docker", "kfp", "google-cloud-aiplatform",
+all = ["dask-cuda", "s3fs", "adlfs", "gcsfs", "kfp", "google-cloud-aiplatform",
"sagemaker", "boto3"]

[tool.poetry.group.test]
14 changes: 3 additions & 11 deletions src/fondant/build.py
@@ -1,11 +1,14 @@
"""Module holding implementation to build Fondant components, used by the `fondant build`
command.
"""

import logging
import re
import typing as t
from pathlib import Path

+import docker

from fondant.pipeline import ComponentOp

logger = logging.getLogger(__name__)
@@ -22,17 +25,6 @@ def build_component( # ruff: noqa: PLR0912, PLR0915
    pull: bool = False,
    target: t.Optional[str] = None,
) -> None:
-    try:
-        import docker
-    except ImportError:
-        msg = (
-            "You need to install `docker` to use the `fondant build` command, you can install "
-            "it with `pip install fondant[docker]`"
-        )
-        raise SystemExit(
-            msg,
-        )
-
    component_op = ComponentOp.from_component_yaml(component_dir)
    component_spec = component_op.component_spec

23 changes: 5 additions & 18 deletions src/fondant/component/__init__.py
@@ -1,19 +1,6 @@
-try:
-    pass
-except ImportError:
-    msg = (
-        "You need to install fondant using the `component` extra to develop or run a component."
-        "You can install it with `pip install fondant[component]`"
-    )
-    raise SystemExit(
-        msg,
-    )
+# fmt: off
+from .component import (BaseComponent, Component, DaskLoadComponent, # noqa
+                        DaskTransformComponent, DaskWriteComponent,
+                        PandasTransformComponent)

-from .component import ( # noqa
-    BaseComponent,
-    Component,
-    DaskLoadComponent,
-    DaskTransformComponent,
-    DaskWriteComponent,
-    PandasTransformComponent,
-)
+# fmt: on
2 changes: 1 addition & 1 deletion src/fondant/components/caption_images/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component
2 changes: 1 addition & 1 deletion src/fondant/components/chunk_text/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component
2 changes: 1 addition & 1 deletion src/fondant/components/crop_images/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component/src
2 changes: 1 addition & 1 deletion src/fondant/components/download_images/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component
2 changes: 1 addition & 1 deletion src/fondant/components/embed_images/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,gpu,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[gpu,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component/src
2 changes: 1 addition & 1 deletion src/fondant/components/embed_text/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component
2 changes: 1 addition & 1 deletion src/fondant/components/extract_image_resolution/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component/src
2 changes: 1 addition & 1 deletion src/fondant/components/filter_image_resolution/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component/src
2 changes: 1 addition & 1 deletion src/fondant/components/filter_language/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component
2 changes: 1 addition & 1 deletion src/fondant/components/filter_text_length/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component
2 changes: 1 addition & 1 deletion src/fondant/components/generate_minhash/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component
2 changes: 1 addition & 1 deletion src/fondant/components/index_aws_opensearch/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component
2 changes: 1 addition & 1 deletion src/fondant/components/index_qdrant/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component
2 changes: 1 addition & 1 deletion src/fondant/components/index_weaviate/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component
2 changes: 1 addition & 1 deletion src/fondant/components/load_from_csv/Dockerfile
@@ -8,7 +8,7 @@ RUN apt-get update && \
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component
2 changes: 1 addition & 1 deletion src/fondant/components/load_from_files/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component
2 changes: 1 addition & 1 deletion src/fondant/components/load_from_hf_hub/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component/src
2 changes: 1 addition & 1 deletion src/fondant/components/load_from_parquet/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component/src
2 changes: 1 addition & 1 deletion src/fondant/components/load_from_pdf/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component
2 changes: 1 addition & 1 deletion src/fondant/components/resize_images/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component/src
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component
2 changes: 1 addition & 1 deletion src/fondant/components/retrieve_laion_by_prompt/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component
2 changes: 1 addition & 1 deletion src/fondant/components/segment_images/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component/src
2 changes: 1 addition & 1 deletion src/fondant/components/write_to_file/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component
2 changes: 1 addition & 1 deletion src/fondant/components/write_to_hf_hub/Dockerfile
@@ -12,7 +12,7 @@ RUN pip3 install --no-cache-dir -r requirements.txt
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
-RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}
+RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Set the working directory to the component folder
WORKDIR /component/src