Implement the dagster-openai integration library (#19697)
## Summary & Motivation

This PR adds a new `dagster-openai` library to our set of libraries.

The main goal of this library is to log OpenAI API usage in the
metadata. To do so, we need to wrap the methods called through the
client, get the results and update the metadata.

The initial code snippets hardcoded three methods, but we want to give the
user some flexibility.

Constraints:
- Results must be captured at the method level - the data we seek is
included in the OpenAI API response. The results can't be captured at
the client level, at teardown for instance.
- Not all the methods existing in the OpenAI library should be wrapped
(private methods, etc.)
- Methods are overloaded in the API Resource classes, so wrapping the
methods should be done on the instance.
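
The constraints above can be illustrated with a minimal, standard-library-only sketch. It is not the code added by this PR: `FakeCompletions`, `wrap_public_methods` and `record_calls` are hypothetical names chosen for illustration, and the response is a plain dict rather than an OpenAI response object. It only shows the general idea of wrapping public methods on a single instance while leaving private methods and the class itself untouched.

```python
import functools
import inspect


class FakeCompletions:
    """Hypothetical stand-in for an API resource class; not the OpenAI SDK."""

    def create(self, prompt):
        return {"text": prompt.upper(), "usage": {"total_tokens": len(prompt.split())}}

    def _request(self, prompt):
        # Private helper that should stay unwrapped.
        return prompt


def wrap_public_methods(instance, decorator):
    """Replace each public bound method on this instance with a decorated version."""
    for name, method in inspect.getmembers(instance, predicate=inspect.ismethod):
        if name.startswith("_"):
            continue  # leave private and dunder methods alone
        # Setting the attribute on the instance shadows the class method
        # for this object only; the class is not modified.
        setattr(instance, name, decorator(method))
    return instance


def record_calls(log):
    """Build a decorator that appends each response's usage to `log`."""

    def decorator(method):
        @functools.wraps(method)
        def wrapper(*args, **kwargs):
            response = method(*args, **kwargs)
            log.append(response["usage"])  # captured at the method level
            return response

        return wrapper

    return decorator


usage_log = []
wrapped = wrap_public_methods(FakeCompletions(), record_calls(usage_log))
untouched = FakeCompletions()

wrapped.create("say this is a test")
untouched.create("not recorded")
```

Because the wrapper is stored on the instance, a second `FakeCompletions` object created afterwards is unaffected, which is the property the last constraint asks for.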

**Solution**

Implement `OpenAIResource.get_client`,
`OpenAIResource.get_client_for_asset` and the function wrapper
`with_usage_metadata`.

By default, for assets, the methods of the three main API endpoint classes,
`Completions`, `Chat` and `Embeddings`, are wrapped when instantiating
the client. Wrapping the methods makes it possible to log the usage metadata
provided in an OpenAI Completion response. If another endpoint should be
wrapped, a user can apply `with_usage_metadata` to it to log the
metadata.

`OpenAIResource.get_client` can be used for assets and ops, but the
metadata will not be logged for ops.
`OpenAIResource.get_client_for_asset` can only be used with assets and
the metadata will be logged.
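
As a rough illustration of what wrapping an additional endpoint involves, here is a standard-library-only sketch. It does not reproduce `with_usage_metadata` or its signature: `with_usage_logging`, `fake_endpoint` and the `metadata` dict are hypothetical, and the fake response only mimics the `prompt_tokens`, `completion_tokens` and `total_tokens` usage fields that OpenAI responses carry.

```python
import functools
from types import SimpleNamespace


def with_usage_logging(metadata, func):
    """Hypothetical helper: wrap `func` and add its response's token usage to `metadata`."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        response = func(*args, **kwargs)
        usage = response.usage
        # Accumulate counts so several calls in one run add up.
        for key in ("prompt_tokens", "completion_tokens", "total_tokens"):
            metadata[key] = metadata.get(key, 0) + getattr(usage, key, 0)
        metadata["calls"] = metadata.get("calls", 0) + 1
        return response

    return wrapper


def fake_endpoint(prompt):
    """Stand-in for an endpoint method that is not wrapped by default."""
    words = len(prompt.split())
    usage = SimpleNamespace(
        prompt_tokens=words, completion_tokens=2, total_tokens=words + 2
    )
    return SimpleNamespace(text="ok", usage=usage)


metadata = {}
logged_endpoint = with_usage_logging(metadata, fake_endpoint)
logged_endpoint("one two three")
logged_endpoint("four five")
```

In the library itself the accumulated values are attached to asset metadata rather than a plain dict; the dict here only stands in for that destination.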

## TO-DOs
- [x] implement the resource
- [x] add docstrings
- [x] implement tests

## How I Tested These Changes
- Local implementation
- BK
- Dogfood in Purina with a toy example
maximearmstrong authored and PedramNavid committed Feb 28, 2024
1 parent 1c96516 commit 8cdc68f
Showing 15 changed files with 1,175 additions and 44 deletions.
20 changes: 9 additions & 11 deletions pyright/alt-1/requirements-pinned.txt
@@ -7,7 +7,8 @@ aiosignal==1.3.1
alembic==1.13.1
aniso8601==9.0.1
annotated-types==0.6.0
anyio==4.2.0
anyio==4.3.0
appnope==0.1.4
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
@@ -36,7 +37,7 @@ colored==1.4.4
coloredlogs==14.0
comm==0.2.1
contourpy==1.2.0
coverage==7.4.1
coverage==7.4.2
croniter==2.0.1
cryptography==41.0.7
cycler==0.12.1
@@ -102,7 +103,6 @@ gql==3.5.0
graphene==3.3
graphql-core==3.2.3
graphql-relay==3.2.0
greenlet==3.0.3
grpcio==1.60.1
grpcio-health-checking==1.60.1
grpcio-tools==1.60.1
@@ -112,7 +112,7 @@ httplib2==0.22.0
httptools==0.6.1
httpx==0.26.0
humanfriendly==10.0
hypothesis==6.98.8
hypothesis==6.98.9
idna==3.6
importlib-metadata==6.11.0
iniconfig==2.0.0
@@ -123,11 +123,10 @@ isoduration==20.11.0
isort==5.13.2
jaraco.classes==3.3.1
jedi==0.19.1
jeepney==0.8.0
Jinja2==3.1.3
jmespath==1.0.1
joblib==1.3.2
json5==0.9.14
json5==0.9.17
jsonpointer==2.4
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
@@ -137,7 +136,7 @@ jupyter_client==8.6.0
jupyter_core==5.7.1
jupyter_server==2.12.5
jupyter_server_terminals==0.5.2
jupyterlab==4.1.1
jupyterlab==4.1.2
jupyterlab_pygments==0.3.0
jupyterlab_server==2.25.3
keyring==24.3.0
@@ -160,12 +159,12 @@ more-itertools==10.2.0
morefs==0.2.0
msgpack==1.0.7
multidict==6.0.5
multimethod==1.11
multimethod==1.11.1
mypy==1.8.0
mypy-extensions==1.0.0
mypy-protobuf==3.5.0
nbclient==0.9.0
nbconvert==7.16.0
nbconvert==7.16.1
nbformat==5.9.2
nest-asyncio==1.6.0
networkx==3.2.1
@@ -191,7 +190,7 @@ pexpect==4.9.0
pillow==10.2.0
platformdirs==3.11.0
pluggy==1.4.0
polars==0.20.9
polars==0.20.10
-e examples/project_fully_featured
prometheus_client==0.20.0
prompt-toolkit==3.0.43
@@ -247,7 +246,6 @@ s3transfer==0.10.0
scikit-learn==1.4.1.post1
scipy==1.12.0
seaborn==0.13.2
SecretStorage==3.3.3
Send2Trash==1.8.2
six==1.16.0
slack_sdk==3.27.0
56 changes: 23 additions & 33 deletions pyright/master/requirements-pinned.txt
@@ -10,10 +10,10 @@ alembic==1.13.1
altair==4.2.2
amqp==5.2.0
aniso8601==9.0.1
anyio==4.2.0
anyio==4.3.0
apache-airflow==2.7.3
apache-airflow-providers-apache-spark==4.7.1
apache-airflow-providers-cncf-kubernetes==7.14.0
apache-airflow-providers-cncf-kubernetes==8.0.0
apache-airflow-providers-common-sql==1.11.0
apache-airflow-providers-docker==3.9.1
apache-airflow-providers-ftp==3.7.0
@@ -24,6 +24,7 @@ apeye==1.4.1
apeye-core==1.1.5
apispec==6.4.0
appdirs==1.4.4
appnope==0.1.4
argcomplete==3.2.2
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
@@ -37,7 +38,7 @@ async-lru==2.0.4
async-timeout==4.0.3
attrs==23.2.0
autodocsumm==0.2.12
autoflake==2.2.1
autoflake==2.3.0
-e python_modules/automation
avro==1.11.3
avro-gen3==0.7.11
@@ -56,8 +57,8 @@ bitmath==1.3.3.1
bleach==6.1.0
blinker==1.7.0
bokeh==3.3.4
boto3==1.34.44
botocore==1.34.44
boto3==1.34.46
botocore==1.34.46
buildkite-test-collector==0.1.7
CacheControl==0.14.0
cached-property==1.5.2
@@ -89,7 +90,7 @@ ConfigUpdater==3.2
confluent-kafka==2.3.0
connexion==2.14.2
contourpy==1.2.0
coverage==7.4.1
coverage==7.4.2
cron-descriptor==1.4.3
croniter==2.0.1
cryptography==41.0.7
@@ -133,6 +134,7 @@ cycler==0.12.1
-e python_modules/libraries/dagster-mlflow
-e python_modules/libraries/dagster-msteams
-e python_modules/libraries/dagster-mysql
-e python_modules/libraries/dagster-openai
-e python_modules/libraries/dagster-pagerduty
-e python_modules/libraries/dagster-pandas
-e python_modules/libraries/dagster-pandera
@@ -182,6 +184,7 @@ diff-match-patch==20200713
dill==0.3.8
distlib==0.3.8
distributed==2024.2.0
distro==1.9.0
dnspython==2.6.1
docker==5.0.3
docker-image-py==0.1.12
@@ -240,7 +243,6 @@ graphql-core==3.2.3
graphql-relay==3.2.0
graphviz==0.20.1
great-expectations==0.17.11
greenlet==3.0.3
grpcio==1.60.1
grpcio-health-checking==1.60.1
grpcio-status==1.60.1
@@ -254,7 +256,7 @@ httplib2==0.22.0
httptools==0.6.1
httpx==0.26.0
humanfriendly==10.0
hypothesis==6.98.8
hypothesis==6.98.9
idna==3.6
ijson==3.2.3
imagesize==1.4.1
@@ -275,10 +277,10 @@ Jinja2==3.1.2
jmespath==1.0.1
joblib==1.3.2
jschema-to-python==1.2.3
json5==0.9.14
json5==0.9.17
jsondiff==2.0.0
jsonpatch==1.33
jsonpickle==3.0.2
jsonpickle==3.0.3
jsonpointer==2.4
jsonref==1.1.0
jsonschema==4.21.1
@@ -291,16 +293,16 @@ jupyter_client==7.4.9
jupyter_core==5.7.1
jupyter_server==2.12.5
jupyter_server_terminals==0.5.2
jupyterlab==4.1.1
jupyterlab==4.1.2
jupyterlab_pygments==0.3.0
jupyterlab_server==2.25.3
jupyterlab_widgets==3.0.10
jwt==1.3.1
kiwisolver==1.4.5
kombu==5.3.5
kopf==1.37.1
kubernetes==23.6.0
kubernetes-asyncio==24.2.3
kubernetes==29.0.0
kubernetes_asyncio==29.0.0
lazy-object-proxy==1.10.0
leather==0.3.4
limits==3.9.0
@@ -335,13 +337,13 @@ msal==1.26.0
msal-extensions==1.1.0
msgpack==1.0.7
multidict==6.0.5
multimethod==1.11
multimethod==1.11.1
mypy-extensions==1.0.0
mypy-protobuf==3.5.0
mysql-connector-python==8.3.0
natsort==8.4.0
nbclient==0.9.0
nbconvert==7.16.0
nbconvert==7.16.1
nbformat==5.9.2
nest-asyncio==1.6.0
networkx==2.8.8
@@ -351,24 +353,13 @@ noteable-origami==0.0.35
notebook==7.1.0
notebook_shim==0.2.4
numpy==1.26.4
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.19.3
nvidia-nvjitlink-cu12==12.3.101
nvidia-nvtx-cu12==12.1.105
oauth2client==4.1.3
oauthlib==3.2.2
objgraph==3.6.0
onnx==1.15.0
onnxconverter-common==1.13.0
onnxruntime==1.17.0
openai==1.12.0
openapi-schema-validator==0.6.2
openapi-spec-validator==0.7.1
opentelemetry-api==1.22.0
@@ -408,7 +399,7 @@ pkginfo==1.9.6
platformdirs==3.11.0
plotly==5.19.0
pluggy==1.4.0
polars==0.20.9
polars==0.20.10
portalocker==2.8.2
prison==0.2.1
progressbar2==4.3.2
@@ -486,17 +477,17 @@ scikit-learn==1.4.1.post1
scipy==1.12.0
scrapbook==0.5.0
seaborn==0.13.2
selenium==4.17.2
selenium==4.18.1
Send2Trash==1.8.2
sending==0.3.0
sentry-sdk==1.40.4
sentry-sdk==1.40.5
setproctitle==1.3.3
six==1.16.0
skein==0.8.2
skl2onnx==1.16.0
slack_sdk==3.27.0
sling==1.1.5.post4
sling-linux-amd64==1.1.5.post4
sling==1.1.6.post1
sling-mac-universal==1.1.6.post1
smmap==5.0.1
sniffio==1.3.0
snowballstemmer==2.2.0
@@ -549,7 +540,6 @@ tqdm==4.66.2
traitlets==5.14.1
trio==0.24.0
trio-websocket==0.11.1
triton==2.2.0
-e examples/tutorial_notebook_assets
twilio==8.13.0
twine==1.15.0
1 change: 1 addition & 0 deletions pyright/master/requirements.txt
@@ -68,6 +68,7 @@
-e python_modules/libraries/dagster-mlflow/
-e python_modules/libraries/dagster-msteams/
-e python_modules/libraries/dagster-mysql/
-e python_modules/libraries/dagster-openai/
-e python_modules/libraries/dagster-pagerduty/
-e python_modules/libraries/dagster-pandas/
-e python_modules/libraries/dagster-pandera/
2 changes: 2 additions & 0 deletions python_modules/libraries/dagster-openai/.coveragerc
@@ -0,0 +1,2 @@
[run]
branch = True