Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(llmobs): submit langchain embedding spans #9850

Merged
merged 78 commits into from
Jul 31, 2024

Conversation

yahya-mouman
Copy link
Contributor

@yahya-mouman yahya-mouman commented Jul 16, 2024

Changes

This PR adds instrumentation to submit langchain embedding spans to LLM Observability. The embedding spans sent to LLM Observability will contain the following I/O data:

input.documents: when part of a workflow captures the embeddings as a json list. when it is not a workflow but a single embedding operation captures each embedding input as a Document dictionary containing a text field
output.value: instead of unnecessarily storing very long vector values, adds a placeholder text [X embeddings returned with size Y]

Ticket : https://datadoghq.atlassian.net/browse/MLOB-937

Checklist

  • PR author has checked that all the criteria below are met
  • The PR description includes an overview of the change
  • The PR description articulates the motivation for the change
  • The change includes tests OR the PR description describes a testing strategy
  • The PR description notes risks associated with the change, if any
  • Newly-added code is easy to change
  • The change follows the library release note guidelines
  • The change includes or references documentation updates if necessary
  • Backport labels are set (if applicable)

Reviewer Checklist

  • Reviewer has checked that all the criteria below are met
  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Newly-added code is easy to change
  • Release note makes sense to a user of the library
  • If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

Copy link
Contributor

github-actions bot commented Jul 16, 2024

CODEOWNERS have been resolved as:

releasenotes/notes/feat-llmobs-submit-langchain-embedding-spans-89c8704ef41cfee3.yaml  @DataDog/apm-python
tests/contrib/langchain/cassettes/langchain_community/openai_embedding_query_integration.yaml  @DataDog/ml-observability
ddtrace/contrib/langchain/patch.py                                      @DataDog/ml-observability
ddtrace/llmobs/_integrations/langchain.py                               @DataDog/ml-observability
tests/contrib/langchain/test_langchain_llmobs.py                        @DataDog/ml-observability

@yahya-mouman yahya-mouman changed the title [MLOB-937] Add LLM Obs support for embedding feature of langchain Chore (LangChain) : Add LLM Obs support for embedding feature of langchain Jul 16, 2024
@yahya-mouman yahya-mouman changed the title Chore (LangChain) : Add LLM Obs support for embedding feature of langchain chore(LangChain): Add LLM Obs support for embedding feature of langchain Jul 16, 2024
@yahya-mouman yahya-mouman changed the title chore(LangChain): Add LLM Obs support for embedding feature of langchain chore(langchain): Add LLM Obs support for embedding feature of langchain Jul 16, 2024
@Yun-Kim Yun-Kim changed the title chore(langchain): Add LLM Obs support for embedding feature of langchain feat(llmobs): submit langchain embedding spans Jul 17, 2024
Copy link
Contributor

@Yun-Kim Yun-Kim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation looks great, just a couple questions 😄 can we add tests in test_langchain_llmobs.py to class TestLLMObsLangchainCommunity and class TestLangchainTraceStructureWithLlmIntegrations?

ddtrace/contrib/langchain/patch.py Outdated Show resolved Hide resolved
ddtrace/llmobs/_integrations/langchain.py Outdated Show resolved Hide resolved
ddtrace/llmobs/_integrations/langchain.py Outdated Show resolved Hide resolved
ddtrace/llmobs/_integrations/langchain.py Outdated Show resolved Hide resolved
ddtrace/llmobs/_integrations/langchain.py Outdated Show resolved Hide resolved
@datadog-datadog-prod-us1
Copy link
Contributor

Library Vulnerabilities

Critical badge  High badge  Medium badge

@yahya-mouman
Copy link
Contributor Author

yahya-mouman commented Jul 18, 2024

Implementation looks great, just a couple questions 😄 can we add tests in test_langchain_llmobs.py to class TestLLMObsLangchainCommunity and class TestLangchainTraceStructureWithLlmIntegrations?

I updated and addressed all the previous comments. The only thing left now is writing the test case but before doing that, I wanted to confirm the implementation details especially the parsing difference between the workflow vs non-workflow cases 🙏

@pr-commenter
Copy link

pr-commenter bot commented Jul 18, 2024

Benchmarks

Benchmark execution time: 2024-07-31 16:49:43

Comparing candidate commit 1601844 in PR branch yahya/add-langchain-embedding-support with baseline commit 1710fe3 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 214 metrics, 2 unstable metrics.

Copy link
Contributor

@Yun-Kim Yun-Kim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly great, I added a comment/suggestion on the input documents formatting.
Just needs tests, and release note similar to this PR.

ddtrace/llmobs/_integrations/langchain.py Show resolved Hide resolved
ddtrace/llmobs/_integrations/langchain.py Outdated Show resolved Hide resolved
@Yun-Kim Yun-Kim force-pushed the yahya/add-langchain-embedding-support branch from 6b6d966 to edffa8b Compare July 26, 2024 17:01
@Yun-Kim Yun-Kim force-pushed the yahya/add-langchain-embedding-support branch 2 times, most recently from fa1c650 to 8b22c4a Compare July 27, 2024 01:44
@Yun-Kim Yun-Kim force-pushed the yahya/add-langchain-embedding-support branch from 8b22c4a to af016f4 Compare July 27, 2024 01:54
@Yun-Kim
Copy link
Contributor

Yun-Kim commented Jul 27, 2024

@yahya-mouman there are significant flakiness/skipping issues in the Langchain test suite. Let's merge this PR once #9768 is merged.

Copy link
Contributor

@Yun-Kim Yun-Kim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be good to go after the testing comments are addressed now that the flakiness should be addressed!

tests/contrib/langchain/test_langchain_llmobs.py Outdated Show resolved Hide resolved
tests/contrib/langchain/test_langchain_llmobs.py Outdated Show resolved Hide resolved
tests/contrib/langchain/test_langchain_llmobs.py Outdated Show resolved Hide resolved
tests/contrib/langchain/test_langchain_llmobs.py Outdated Show resolved Hide resolved
tests/contrib/langchain/test_langchain_llmobs.py Outdated Show resolved Hide resolved
@Yun-Kim Yun-Kim force-pushed the yahya/add-langchain-embedding-support branch from 78f08e0 to c714c41 Compare July 30, 2024 18:56
@yahya-mouman
Copy link
Contributor Author

Should be good to go after the testing comments are addressed now that the flakiness should be addressed!

Great job on fixing the flakiness of the tests 🥇 I believe everything is done and good to go with this PR

@Yun-Kim Yun-Kim enabled auto-merge (squash) July 31, 2024 16:27
@Yun-Kim Yun-Kim merged commit e195512 into main Jul 31, 2024
100 of 107 checks passed
@Yun-Kim Yun-Kim deleted the yahya/add-langchain-embedding-support branch July 31, 2024 16:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants