fix: usage of ReadDataFromJson in array tensors #7624
@@ -41,6 +41,12 @@ input [
name: "STREAM"
data_type: TYPE_BOOL
dims: [ 1, 1 ]
},
{
name: "input_ids"
Do we need a new input to test this? Or can we validate the same behavior (supporting an input with a list of size > 1, IIUC) by sending multiple prompts? E.g. "PROMPT": ["hello", "world"] with updated dims: [1, -1]
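As a rough illustration of the suggestion above, the sketch below shows how a dims entry with one variable dimension, such as [1, -1], could be resolved against the number of elements in a JSON list. The helper name resolve_dims is hypothetical and this is not Triton's actual shape logic; it only shows the arithmetic.

```python
from math import prod


def resolve_dims(dims, element_count):
    """Resolve dims containing at most one -1 against a flat element count.

    Hypothetical helper for illustration only; not part of Triton.
    """
    wildcards = [i for i, d in enumerate(dims) if d == -1]
    if len(wildcards) > 1:
        raise ValueError("this sketch supports at most one variable dimension")

    # Product of the fixed dimensions (1 if there are none).
    fixed = prod(d for d in dims if d != -1)

    if not wildcards:
        if fixed != element_count:
            raise ValueError("element count does not match fixed dims")
        return list(dims)

    if fixed == 0 or element_count % fixed != 0:
        raise ValueError("element count is not divisible by the fixed dims")

    shape = list(dims)
    shape[wildcards[0]] = element_count // fixed
    return shape


# Two prompts against dims [1, -1] resolve to a concrete shape.
print(resolve_dims([1, -1], len(["hello", "world"])))
```

With a single prompt the same dims would resolve to [1, 1], which is why a list of size > 1 is needed to exercise the array path.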
If we modify the PROMPT like above, the test case passes with both the old and the new code (with the http_server.cc change).
I had to undo the commit 229e5e8
LGTM, just to confirm: will the added test case fail without this http_server.cc change?
@GuanLuo, thanks for the note. Previously, the test case passed with both the old and the new code (with the http_server.cc change). Now the test case fails on 24.09 with the old code and passes with the new code (including the http_server.cc change).
What does the PR do?
The generate and generate_stream endpoints did not seem to work when directly querying the TRTLLM backend with input tokens. This is because HTTPAPIServer::GenerateRequestClass::ExactMappingInput does not send the correct size of an array input to ReadDataFromJson.
This PR also fixes triton-inference-server/tensorrtllm_backend#369
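For context, a request of the kind described above might carry token IDs as a JSON array. The sketch below only builds such a body. The field names input_ids and STREAM come from the model config in this PR, while the helper name build_generate_payload and the token values are made up for illustration.

```python
import json


def build_generate_payload(input_ids, stream=False):
    """Build a JSON body carrying a list of token IDs.

    Hypothetical helper; the field names mirror the inputs shown in the diff.
    """
    tokens = list(input_ids)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not tokens or not all(
        isinstance(t, int) and not isinstance(t, bool) for t in tokens
    ):
        raise ValueError("input_ids must be a non-empty list of integers")
    return json.dumps({"input_ids": tokens, "STREAM": bool(stream)})


# Arbitrary token IDs; the array has more than one element,
# which is the case this PR is concerned with.
body = build_generate_payload([101, 2023, 2003, 102])
print(body)
```

A body like this would be POSTed to the model's generate endpoint (v2/models/<model_name>/generate). The sketch stops short of sending it so that it runs without a server.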
Checklist
Agreement
<commit_type>: <Title>
(pre-commit install, pre-commit run --all)
Commit Type:
Check the conventional commit type box here and add the label to the github PR.
Related PRs:
Where should the reviewer start?
Test plan:
Added a new test case to the L0_http job.
Internal CI pipeline id: 18800660
Caveats:
Background
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)