OLS-39: Add streaming_query endpoint #2014

Merged
3 commits merged into openshift:main from the streaming-response branch on Jan 14, 2025


Contributor

@onmete onmete commented Dec 5, 2024

Description

Add streaming_query endpoint

Type of change

  • New feature

TODO

  • decide if this is how we want to implement it
  • additional minor code org
  • docs/schema
  • fix tests

Preview

Non-streaming

curl -X POST http://localhost:8080/v1/query -H "Content-Type: application/json" -d '{"query": "write a short poem about running pods in openshift?"}'

{"conversation_id":"6adc636b-22e3-45cc-8f01-f73fdc2490bc","response":"In OpenShift's realm, where pods take flight,  \nContainers unite, in harmony bright.  \nWith labels and volumes, they dance and they play,  \nOn nodes they reside, come night or come day.  \n\nImmutable spirits, they rise and they fall,  \nManaged by controllers, they heed the call.  \nFrom dev to production, they scale with great ease,  \nIn this orchestration, they flow like the breeze.  \n\nSo here's to the pods, in their vibrant array,  \nIn OpenShift's embrace, they thrive and they sway.  \nWith each deployment, a story unfolds,  \nIn the world of containers, their magic beholds.","referenced_documents":[{"docs_url":"https://docs.openshift.com/container-platform/4.15/nodes/pods/nodes-pods-using.html","title":"Using pods"},{"docs_url":"https://docs.openshift.com/container-platform/4.15/rest_api/workloads_apis/pod-v1.html","title":"Pod [v1]"},{"docs_url":"https://docs.openshift.com/container-platform/4.15/nodes/index.html","title":"Overview of nodes"}],"truncated":false}

Streaming

curl -X POST http://localhost:8080/v1/streaming_query -H "Content-Type: application/json" -d '{"query": "write a short poem about running pods in openshift?"}'

In OpenShift's realm, where pods take flight,
Containers unite, in the day and the night.
With labels and volumes, they dance in a line,
Scaling up swiftly, like stars that align.

Immutable they stand, in their defined space,
Resilient and strong, they embrace every race.
From nodes they emerge, with IPs to claim,
In the orchestration of Kubernetes, they play the game.

So here's to the pods, in their vibrant array,
In OpenShift's embrace, they thrive every day.
With each deployment, a story unfolds,
In the world of containers, their magic beholds.

---

Using pods: https://docs.openshift.com/container-platform/4.15/nodes/pods/nodes-pods-using.html
Pod [v1]: https://docs.openshift.com/container-platform/4.15/rest_api/workloads_apis/pod-v1.html
Overview of nodes: https://docs.openshift.com/container-platform/4.15/nodes/index.html

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Dec 5, 2024
@onmete onmete marked this pull request as draft December 5, 2024 10:52
openshift-ci-robot commented Dec 5, 2024

@onmete: This pull request references OLS-39 which is a valid jira issue.


@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 5, 2024
Contributor Author

onmete commented Dec 5, 2024

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 5, 2024
@openshift-ci openshift-ci bot requested review from bparees and tisnik December 5, 2024 10:53
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 10, 2024
@@ -184,6 +145,64 @@ def conversation_request(
)


def process_request(auth: Any, llm_request: LLMRequest):
    """Process incoming request."""
    timestamps = {"start": time.time()}


If I'm reading this correctly, the change widens the value type from float to Any. Is there a reason for that?

Contributor Author

It just removes the type hint for this internal variable. The linter is happy (or cannot infer the missing type), so I find the annotation unnecessary for this "debug" kind of variable. It could be timestamps: dict[str, float] = {"start": time.time()}, of course.
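
For illustration, a minimal sketch of the annotated form mentioned above. The function name is hypothetical and only stands in for the request-processing step in the diff; the two dictionary keys are the ones visible in this PR.

```python
import time


def process_request_sketch() -> dict[str, float]:
    """Hypothetical stand-in that only records step timestamps."""
    # Explicit annotation: keys are step names, values are epoch seconds.
    timestamps: dict[str, float] = {"start": time.time()}
    # A later step records its own timestamp under a descriptive key.
    timestamps["validate question"] = time.time()
    return timestamps


timings = process_request_sketch()
elapsed = timings["validate question"] - timings["start"]
```

With the annotation in place, a type checker can flag an accidental non-float value stored in the dictionary.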

Contributor Author

onmete commented Dec 11, 2024

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 11, 2024
@onmete onmete force-pushed the streaming-response branch from f2115b6 to 8396db7 Compare December 11, 2024 07:56
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 11, 2024
@onmete onmete marked this pull request as ready for review December 11, 2024 08:06
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 11, 2024
@openshift-ci openshift-ci bot requested a review from xrajesh December 11, 2024 08:06
@onmete onmete force-pushed the streaming-response branch from 8396db7 to 5f242a7 Compare December 11, 2024 09:19
Contributor

@tisnik tisnik left a comment

I'd change all those if media_type == TEXT ... else ... JSON checks into a switch over all media types. Also, by using StrEnum instead of string constants, the code becomes extensible and fewer direct checks are needed in the model code. Please look into constants.py for examples.


timestamps["validate question"] = time.time()

return (
Contributor

The return type is precise and known in advance; IMHO it should be declared in the function signature.

response = docs_summarizer.create_response(
    llm_request.query, config.rag_index, history
)
logger.debug(f"{conversation_id} Generated response: {response}")
Contributor

Please don't use f-strings in logger calls.
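
As a sketch of the alternative, the standard logging module accepts a format string plus arguments and interpolates them only when the record is actually emitted. The logger name below is hypothetical; the sample values are taken from the preview in the PR description.

```python
import logging

logger = logging.getLogger("ols.sketch")  # hypothetical logger name

conversation_id = "6adc636b-22e3-45cc-8f01-f73fdc2490bc"
response = "In OpenShift's realm, where pods take flight"

# Avoid: logger.debug(f"{conversation_id} Generated response: {response}")
# The f-string is built even when DEBUG logging is disabled.

# Prefer: pass the values as arguments; logging formats them lazily.
logger.debug("%s Generated response: %s", conversation_id, response)
```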

ref_docs_string = "\n".join(
    f"{title}: {url}"
    for title, url in {
        rag_chunk.doc_title: rag_chunk.doc_url for rag_chunk in rag_chunks
Contributor

Doesn't this work differently from the JSON output? I mean there is just one yield here, not one yield per doc.

Contributor Author

There is one yield for the text type and multiple yields for JSON (one per doc).
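
The difference described here can be sketched roughly as follows. RagChunk, stream_referenced_docs, and the per-document JSON shape are assumptions made for this example; only the doc_title and doc_url attribute names and the "title: url" text format come from the snippet under review.

```python
import json
from dataclasses import dataclass
from typing import Iterator


@dataclass
class RagChunk:
    """Hypothetical stand-in for a retrieved RAG chunk."""

    doc_title: str
    doc_url: str


def stream_referenced_docs(rag_chunks: list[RagChunk], as_json: bool) -> Iterator[str]:
    """Yield referenced documents once (text) or once per document (JSON)."""
    # Deduplicate by title; a later chunk with the same title overwrites the URL.
    unique_docs = {chunk.doc_title: chunk.doc_url for chunk in rag_chunks}
    if as_json:
        # JSON: one yield per referenced document.
        for title, url in unique_docs.items():
            yield json.dumps({"docs_url": url, "title": title})
    else:
        # Text: a single yield with all documents, one per line.
        yield "\n".join(f"{title}: {url}" for title, url in unique_docs.items())
```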


codecov-commenter commented Dec 12, 2024

Codecov Report

Attention: Patch coverage is 96.15385% with 6 lines in your changes missing coverage. Please review.

Project coverage is 96.93%. Comparing base (8757bf9) to head (b6413b1).
Report is 81 commits behind head on main.

Files with missing lines                    Patch %   Lines
ols/app/endpoints/streaming_ols.py          93.97%    5 Missing ⚠️
ols/src/query_helpers/docs_summarizer.py    96.96%    1 Missing ⚠️
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2014      +/-   ##
==========================================
+ Coverage   96.89%   96.93%   +0.03%     
==========================================
  Files          72       73       +1     
  Lines        2932     3064     +132     
==========================================
+ Hits         2841     2970     +129     
- Misses         91       94       +3     
Files with missing lines                    Coverage Δ
ols/app/endpoints/ols.py                    99.54% <100.00%> (+0.01%) ⬆️
ols/app/models/models.py                    100.00% <100.00%> (ø)
ols/app/routers.py                          100.00% <100.00%> (ø)
ols/constants.py                            100.00% <100.00%> (ø)
ols/src/query_helpers/docs_summarizer.py    98.50% <96.96%> (-1.50%) ⬇️
ols/app/endpoints/streaming_ols.py          93.97% <93.97%> (ø)

... and 3 files with indirect coverage changes

@onmete onmete force-pushed the streaming-response branch from 756e359 to a8085de Compare December 12, 2024 10:40
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 20, 2024
@onmete onmete force-pushed the streaming-response branch from ec28a50 to a964e45 Compare January 5, 2025 08:37
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 5, 2025
@onmete onmete force-pushed the streaming-response branch from 0d6f479 to c576edd Compare January 6, 2025 07:33
@onmete onmete force-pushed the streaming-response branch 2 times, most recently from e06ba43 to f45c711 Compare January 8, 2025 09:22
Contributor Author

onmete commented Jan 8, 2025

/test 4.17-e2e-ols-cluster

@TamiTakamiya
Contributor

Can we expect this to be ported to road-core/service eventually?

Contributor Author

onmete commented Jan 9, 2025

@TamiTakamiya hopefully yes.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 9, 2025
@onmete onmete force-pushed the streaming-response branch from dce5295 to f62aeb6 Compare January 9, 2025 12:34
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 9, 2025
@onmete onmete force-pushed the streaming-response branch from 400da31 to adba5b3 Compare January 9, 2025 15:05
@onmete onmete force-pushed the streaming-response branch from adba5b3 to 9a7d90d Compare January 9, 2025 16:35
@@ -132,7 +98,7 @@ def conversation_request(
     else:
         summarizer_response = generate_response(
             conversation_id, llm_request, previous_input
-        )
+        )  # type: ignore[assignment]
Contributor

Do we need these "ignore" comments? Are they there because of mypy?

Contributor Author

yes

generated_content = ""

async for item in summary_gen:
    if isinstance(item, str):
Contributor

Just curious: does the async generator yield anything other than strings?

Contributor Author

SummarizerResponse
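
To illustrate why the isinstance check is there, here is a minimal sketch of an async generator that yields text tokens followed by one structured item. SummarizerResponseSketch and its fields are hypothetical stand-ins (loosely modeled on the referenced_documents and truncated keys in the non-streaming preview), not the real SummarizerResponse class.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator, Union


@dataclass
class SummarizerResponseSketch:
    """Hypothetical stand-in for the final, non-text item."""

    referenced_documents: list[str] = field(default_factory=list)
    truncated: bool = False


async def summary_gen_sketch() -> AsyncIterator[Union[str, SummarizerResponseSketch]]:
    # Stream the text tokens first.
    for token in ("In OpenShift's realm, ", "where pods take flight"):
        yield token
    # Then emit one structured item carrying metadata about the answer.
    yield SummarizerResponseSketch(referenced_documents=["Using pods"])


async def consume() -> tuple[str, list[SummarizerResponseSketch]]:
    generated_content = ""
    extras: list[SummarizerResponseSketch] = []
    async for item in summary_gen_sketch():
        if isinstance(item, str):
            generated_content += item  # plain token: append to the answer
        else:
            extras.append(item)  # structured item: keep it separately
    return generated_content, extras


content, extras = asyncio.run(consume())
```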

Contributor

tisnik commented Jan 10, 2025

Can we expect this to be ported to road-core/service eventually?

I'll do it ASAP

Contributor

tisnik commented Jan 10, 2025

/approve


openshift-ci bot commented Jan 10, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tisnik


@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 10, 2025
@JoaoFula
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jan 14, 2025
Contributor Author

onmete commented Jan 14, 2025

/test 4.17-e2e-ols-cluster

Contributor

tisnik commented Jan 14, 2025

/retest

Contributor

tisnik commented Jan 14, 2025

/override "Red Hat Konflux / ols-enterprise-contract / lightspeed-service"
/override "ci/prow/images"


openshift-ci bot commented Jan 14, 2025

@tisnik: Overrode contexts on behalf of tisnik: Red Hat Konflux / ols-enterprise-contract / lightspeed-service, ci/prow/images


Contributor

tisnik commented Jan 14, 2025

/retest


openshift-ci bot commented Jan 14, 2025

@onmete: all tests passed!


@openshift-merge-bot openshift-merge-bot bot merged commit d3d905f into openshift:main Jan 14, 2025
13 of 14 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

9 participants