fix: usage of ReadDataFromJson in array tensors #7624

Merged

Contributor

@v-shobhit v-shobhit commented Sep 18, 2024

What does the PR do?

The generate and generate_stream endpoints failed when querying the TRTLLM backend directly with input tokens. The cause is that HTTPAPIServer::GenerateRequestClass::ExactMappingInput does not pass the correct size of an array input to ReadDataFromJson.

This PR also fixes triton-inference-server/tensorrtllm_backend#369
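
To illustrate the kind of size bookkeeping involved, here is a minimal Python sketch. It is not the http_server.cc code; the helper names (`flatten`, `pack_tensor`) and the small dtype tables are hypothetical. It only shows that for a fixed-size data type the byte size of an array input has to be the element count times the element size, so a size computed for a single element would not match a four-token `input_ids` array.

```python
import json
import struct

# Hypothetical lookup tables for this sketch only (not Triton's own tables).
ELEMENT_SIZE = {"INT32": 4, "BOOL": 1}
STRUCT_CODE = {"INT32": "i", "BOOL": "?"}


def flatten(value):
    """Flatten a scalar or an arbitrarily nested JSON list into a flat list."""
    if isinstance(value, list):
        out = []
        for item in value:
            out.extend(flatten(item))
        return out
    return [value]


def pack_tensor(value, dtype):
    """Return (element_count, raw_bytes) for a JSON value of a fixed-size dtype."""
    elements = flatten(value)
    count = len(elements)
    # The buffer must hold every element, not just the first one.
    byte_size = count * ELEMENT_SIZE[dtype]
    raw = struct.pack("<%d%s" % (count, STRUCT_CODE[dtype]), *elements)
    assert len(raw) == byte_size
    return count, raw


body = json.loads('{"input_ids": [101, 7592, 2088, 102], "STREAM": false}')
count, raw = pack_tensor(body["input_ids"], "INT32")
print(count, len(raw))  # 4 16
```

The token values above are arbitrary placeholders chosen for the sketch.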

Checklist

  • I have read the Contribution guidelines and signed the Contributor License
    Agreement
  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated github labels field
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • I ran pre-commit locally (pre-commit install, pre-commit run --all)
  • Verified copyright is correct on all changed files.
  • Added succinct git squash message before merging ref.
  • All template sections are filled out.
  • Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

Check the conventional commit type
box here and add the label to the github PR.

  • build
  • ci
  • docs
  • feat
  • fix
  • perf
  • refactor
  • revert
  • style
  • test

Related PRs:

Where should the reviewer start?

Test plan:

Added a new test case to L0_http job.
Internal CI pipeline id: 18800660

Caveats:

Background

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

@v-shobhit v-shobhit changed the title fix usage of ReadDataFromJson fix: usage of ReadDataFromJson in array tensors Sep 18, 2024
@@ -41,6 +41,12 @@ input [
name: "STREAM"
data_type: TYPE_BOOL
dims: [ 1, 1 ]
},
{
name: "input_ids"
Collaborator

Do we need a new input to test this? Or can we validate the same behavior (support for an input with a list of size > 1, IIUC) by sending multiple prompts? E.g. "PROMPT": ["hello", "world"] with updated dims: [1, -1]
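
For illustration, the multi-prompt request body suggested here could be built roughly as follows. This is a sketch only: the keys PROMPT and STREAM are taken from the config under test, the helper name `build_generate_body` is hypothetical, and nothing is sent to a server.

```python
import json


def build_generate_body(prompts, stream=False):
    """Build a JSON body whose keys match the model's input tensor names."""
    return json.dumps({"PROMPT": list(prompts), "STREAM": stream})


body = build_generate_body(["hello", "world"])
print(body)  # {"PROMPT": ["hello", "world"], "STREAM": false}
```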

Member

If we modify the PROMPT input as suggested above, the test case passes with both the old code and the new code (with the http_server.cc change).
I had to undo commit 229e5e8.

GuanLuo previously approved these changes Oct 1, 2024
Contributor

@GuanLuo GuanLuo left a comment

LGTM, just to confirm that the added test case will fail without this http_server.cc change?

Member

pskiran1 commented Oct 3, 2024

> LGTM, just to confirm that the added test case will fail without this http_server.cc change?

@GuanLuo, thanks for the note. The test case passed with both the old and the new code (with the http_server.cc change).
I had to undo commit 229e5e8; I think this bug does not occur for the String data type.
Could you please review the latest code (CI) and approve it?

Now the test case fails on 24.09 with the old code and passes with the new code (including the http_server.cc change).
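
The thread does not explain why string inputs are unaffected, and the sketch below does not claim to. It only illustrates one difference between the two kinds of input: Triton's raw representation of a BYTES (string) tensor prefixes each element with a 4-byte little-endian length, so its byte size depends on the string contents rather than on element count times a fixed element size. The helper name `serialize_bytes_tensor` is hypothetical.

```python
import struct


def serialize_bytes_tensor(strings):
    """Serialize strings as length-prefixed elements (4-byte little-endian length)."""
    out = bytearray()
    for s in strings:
        data = s.encode("utf-8")
        out += struct.pack("<I", len(data)) + data
    return bytes(out)


raw = serialize_bytes_tensor(["hello", "world"])
print(len(raw))  # 18
```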

@pskiran1 pskiran1 requested a review from GuanLuo October 4, 2024 04:44
@pskiran1 pskiran1 merged commit 6edd5c6 into triton-inference-server:main Oct 7, 2024
3 checks passed
Development

Successfully merging this pull request may close these issues.

Infer failed: Unable to parse 'data': Shape does not match true shape of 'data' field in generate endpoint
4 participants