Commit
Update text embedding component (#532)
PR that modifies the text embedding component:

* Changed the name of the component to `embed_text` for consistency (we already have an `embed_image` component)
* Added VertexAI as a possible model to use
* Changed the base image to PyTorch, since some of the dependencies in the requirements have CUDA dependencies
* Fixed some tests (mainly path related)

Tested the component with the CC RAG pipeline and it works fine (added the [`op`](fcf8664) here). The component runs fine (exit code 0); however, an error pops up at the end, apparently because the connection with the API does not shut down gracefully.

```
rag-cc-pipeline-embed_text-1 | [2023-10-18 13:58:45,227 | root | INFO] Writing data... [########################################] | 100% Completed | 52.61 ss
rag-cc-pipeline-embed_text-1 | [2023-10-18 13:59:37,967 | fondant.executor | INFO] Saving output manifest to gs://soy-audio-379412_kfp-artifacts/custom_artifact/rag-cc-pipeline/rag-cc-pipeline-20231018155832/embed_text/manifest.json
rag-cc-pipeline-embed_text-1 | [2023-10-18 13:59:37,967 | fondant.executor | INFO] Writing cache key to gs://soy-audio-379412_kfp-artifacts/custom_artifact/rag-cc-pipeline/cache/dd3a24cb288cd0eaba12063d2885bc9d.txt
rag-cc-pipeline-embed_text-1 | Traceback (most recent call last):
rag-cc-pipeline-embed_text-1 | File "src/python/grpcio/grpc/_cython/_cygrpc/aio/grpc_aio.pyx.pxi", line 110, in grpc._cython.cygrpc.shutdown_grpc_aio
rag-cc-pipeline-embed_text-1 | File "src/python/grpcio/grpc/_cython/_cygrpc/aio/grpc_aio.pyx.pxi", line 114, in grpc._cython.cygrpc.shutdown_grpc_aio
rag-cc-pipeline-embed_text-1 | File "src/python/grpcio/grpc/_cython/_cygrpc/aio/grpc_aio.pyx.pxi", line 78, in grpc._cython.cygrpc._actual_aio_shutdown
rag-cc-pipeline-embed_text-1 | AttributeError: 'NoneType' object has no attribute 'POLLER'
rag-cc-pipeline-embed_text-1 | Exception ignored in: 'grpc._cython.cygrpc.AioChannel.__dealloc__'
rag-cc-pipeline-embed_text-1 | Traceback (most recent call last):
rag-cc-pipeline-embed_text-1 | File "src/python/grpcio/grpc/_cython/_cygrpc/aio/grpc_aio.pyx.pxi", line 110, in grpc._cython.cygrpc.shutdown_grpc_aio
rag-cc-pipeline-embed_text-1 | File "src/python/grpcio/grpc/_cython/_cygrpc/aio/grpc_aio.pyx.pxi", line 114, in grpc._cython.cygrpc.shutdown_grpc_aio
rag-cc-pipeline-embed_text-1 | File "src/python/grpcio/grpc/_cython/_cygrpc/aio/grpc_aio.pyx.pxi", line 78, in grpc._cython.cygrpc._actual_aio_shutdown
rag-cc-pipeline-embed_text-1 | AttributeError: 'NoneType' object has no attribute 'POLLER'
```

Normally this would be handled by a client, but in this case we're initializing the model directly.
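The traceback in the commit message is reported from `AioChannel.__dealloc__`, that is, during object teardown rather than in the component logic. As a rough illustration of the "done with a client" idea, the sketch below closes a connection-holding object explicitly instead of leaving cleanup to deallocation. `HypotheticalEmbeddingClient` and `embed_all` are made-up stand-ins, not Fondant, VertexAI, or gRPC APIs, and nothing here is confirmed to resolve the error in the log.

```python
import contextlib


class HypotheticalEmbeddingClient:
    """Stand-in for an object holding an open API connection (assumption, not a real API)."""

    def __init__(self):
        self.closed = False

    def embed(self, texts):
        if self.closed:
            raise RuntimeError("client is closed")
        # Placeholder "embedding": one float per text, equal to its length.
        return [[float(len(text))] for text in texts]

    def close(self):
        # A real client would release its network channel here.
        self.closed = True


def embed_all(texts):
    client = HypotheticalEmbeddingClient()
    # contextlib.closing calls client.close() when the block exits,
    # even if embed() raises, so cleanup does not wait for deallocation.
    with contextlib.closing(client):
        vectors = client.embed(texts)
    return vectors, client.closed
```

Whether the model object used in this component exposes a comparable `close()` hook is not stated in the commit, so treat this purely as a pattern.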
1 parent 1dbfec2, commit 877f2c5
Showing 11 changed files with 75 additions and 37 deletions.
18 changes: 13 additions & 5 deletions
components/generate_embeddings/Dockerfile → components/embed_text/Dockerfile
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -1,22 +1,30 @@` |
| 1 | | `- FROM --platform=linux/amd64 python:3.8-slim as base` |
| | 1 | `+ FROM --platform=linux/amd64 pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime as base` |
| 2 | 2 | |
| 3 | 3 | `# System dependencies` |
| 4 | 4 | `RUN apt-get update && \` |
| 5 | 5 | `apt-get upgrade -y && \` |
| 6 | 6 | `apt-get install git -y` |
| 7 | 7 | |
| 8 | 8 | `# Install requirements` |
| 9 | | `- COPY requirements.txt /` |
| | 9 | `+ COPY requirements.txt ./` |
| 10 | 10 | `RUN pip3 install --no-cache-dir -r requirements.txt` |
| 11 | 11 | |
| 12 | 12 | `# Install Fondant` |
| 13 | 13 | `# This is split from other requirements to leverage caching` |
| 14 | 14 | `ARG FONDANT_VERSION=main` |
| 15 | 15 | `RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}` |
| 16 | 16 | |
| 17 | 17 | `# Set the working directory to the component folder` |
| 18 | | `- WORKDIR /component/src` |
| | 18 | `+ WORKDIR /component` |
| | 19 | `+ COPY src/ src/` |
| | 20 | `+ ENV PYTHONPATH "${PYTHONPATH}:./src"` |
| 19 | 21 | |
| 20 | | `- # Copy over src-files` |
| 21 | | `- COPY src/ .` |
| | 22 | `+ FROM base as test` |
| | 23 | `+ COPY test_requirements.txt .` |
| | 24 | `+ RUN pip3 install --no-cache-dir -r test_requirements.txt` |
| | 25 | `+ COPY tests/ tests/` |
| | 26 | `+ RUN python -m pytest tests` |
| | 27 | `+` |
| | 28 | `+ FROM base` |
| | 29 | `+ WORKDIR /component/src` |
| 22 | 30 | `ENTRYPOINT ["fondant", "execute", "main"]` |
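The updated Dockerfile sets `WORKDIR /component`, copies the sources to `src/`, and appends `./src` to `PYTHONPATH`, so that `python -m pytest tests` can import the component modules from the component root. The sketch below recreates that layout in a temporary directory to show the effect. The module contents and the helper name `import_from_src_layout` are assumptions made for illustration and are not taken from the repository.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def import_from_src_layout() -> str:
    """Recreate a component/{src,tests} layout and import `main` from the component root."""
    with tempfile.TemporaryDirectory() as tmp:
        component = Path(tmp)
        (component / "src").mkdir()
        (component / "tests").mkdir()
        # Hypothetical module contents; the real src/main.py is not shown in the diff.
        (component / "src" / "main.py").write_text("NAME = 'embed_text'\n")

        env = dict(os.environ)
        # Mirrors: ENV PYTHONPATH "${PYTHONPATH}:./src"
        existing = env.get("PYTHONPATH", "")
        env["PYTHONPATH"] = os.pathsep.join(p for p in (existing, "./src") if p)

        # Run from the component root (like WORKDIR /component); `main` lives
        # in src/, so the import only resolves because of the PYTHONPATH entry.
        result = subprocess.run(
            [sys.executable, "-c", "import main; print(main.NAME)"],
            cwd=component,
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()
```

Because the `./src` entry is relative, it only resolves when the process starts in the component root, which matches the `test` stage. The final stage switches to `WORKDIR /component/src`, where `main` is importable from the working directory itself.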
1 change: 1 addition & 0 deletions
...ents/generate_embeddings/requirements.txt → components/embed_text/requirements.txt
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -0,0 +1 @@` |
| | 1 | `+ pytest==7.4.2` |
File renamed without changes.
File renamed without changes.
File renamed without changes.
This file was deleted.