
Synchronization issue when the model is just launched #170

Open
kpouget opened this issue Oct 24, 2023 · 3 comments

kpouget commented Oct 24, 2023

Describe the bug

There is a synchronization issue at Pod launch with the current images:

  • all of the containers become Ready:
flan-t5-small-gpu-predictor-00001-deployment-6768c548d8-8btqc   4/4     Running   0          41s
  • the model appears as Loaded in the inference service:
  modelStatus:
    copies:
      failedCopies: 0
      totalCopies: 1
    states:
      activeModelState: Loaded
      targetModelState: Loaded
  • but the model takes several extra seconds before it can actually serve requests:
HOST=...
METHOD=caikit.runtime.Nlp.NlpService/TextGenerationTaskPredict
while true; do
  GRPCURL_DATA='{"max_new_tokens": 25, "min_new_tokens": 25, "text": "At what temperature does liquid Nitrogen boil?"}'
  grpcurl -insecure -d "$GRPCURL_DATA" -H "mm-model-id: flan-t5-small-caikit" "$HOST" "$METHOD"
  sleep 1
done

ERROR:
  Code: Internal
  Message: Unhandled exception during prediction
(the same error repeats once per request for several seconds, until the first successful response:)
{
  "generated_text": "74 degrees F.C., a temperature of 74 degrees F.C., a temperature of ",
  "generated_tokens": "25",
  "finish_reason": "MAX_TOKENS",
  "producer_id": {
    "name": "Text Generation",
    "version": "0.1.0"
  },
  "input_token_count": "10"
}
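As a client-side workaround while the race exists, a caller can retry until the first successful prediction instead of trusting the Ready condition. A minimal sketch of such a retry gate (the helper name and the wrapped callable are hypothetical, not part of caikit):

```python
import time


def wait_until_ready(call, timeout=60.0, interval=1.0):
    """Repeatedly invoke `call` until it succeeds or `timeout` elapses.

    `call` is any zero-argument callable that raises on failure,
    e.g. a lambda wrapping the gRPC TextGenerationTaskPredict
    request shown above. Returns the first successful result.
    """
    deadline = time.monotonic() + timeout
    last_exc = None
    while time.monotonic() < deadline:
        try:
            return call()
        except Exception as exc:  # with grpcio, catch grpc.RpcError instead
            last_exc = exc
            time.sleep(interval)
    raise TimeoutError("model did not become ready in time") from last_exc
```

This only papers over the problem on the client side; the Ready condition itself should not be reported before the runtime can serve.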

In the transformer-container logs, we can see this error:

{"channel": "GP-SERVICR-I", "exception": null, "level": "warning", "log_code": "<RUN49049070W>", "message": "<_InactiveRpcError of RPC that terminated with:
\tstatus = StatusCode.UNAVAILABLE
\tdetails = \"failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:8033: Failed to connect to remote host: Connection refused\"
\tdebug_error_string = \"UNKNOWN:failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:8033: Failed to connect to remote host: Connection refused {created_time:\"2023-10-24T11:48:51.016344787+00:00\", grpc_status:14}\"
>", "model_id": "flan-t5-small-caikit", "num_indent": 0, "stack_trace": "Traceback (most recent call last):
  File \"/caikit/lib/python3.9/site-packages/caikit/runtime/servicers/global_predict_servicer.py\", line 283, in _handle_predict_exceptions
    yield
  File \"/caikit/lib/python3.9/site-packages/caikit/runtime/servicers/global_predict_servicer.py\", line 260, in predict_model
    response = work.do()
  File \"/caikit/lib/python3.9/site-packages/caikit/runtime/work_management/abortable_action.py\", line 118, in do
    return self.__work_thread.get_or_throw()
  File \"/caikit/lib/python3.9/site-packages/caikit/core/toolkit/destroyable_thread.py\", line 188, in get_or_throw
    raise self.__runnable_exception
  File \"/caikit/lib/python3.9/site-packages/caikit/core/toolkit/destroyable_thread.py\", line 124, in run
    self.__runnable_result = self.runnable_func(
  File \"/caikit/lib/python3.9/site-packages/caikit_nlp/modules/text_generation/text_generation_tgis.py\", line 237, in run
    return self.tgis_generation_client.unary_generate(
  File \"/caikit/lib/python3.9/site-packages/caikit_nlp/toolkit/text_generation/tgis_utils.py\", line 315, in unary_generate
    batch_response = self.tgis_client.Generate(request)
  File \"/caikit/lib64/python3.9/site-packages/grpc/_channel.py\", line 1161, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File \"/caikit/lib64/python3.9/site-packages/grpc/_channel.py\", line 1004, in _end_unary_response_blocking
    raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
\tstatus = StatusCode.UNAVAILABLE
\tdetails = \"failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:8033: Failed to connect to remote host: Connection refused\"
\tdebug_error_string = \"UNKNOWN:failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:8033: Failed to connect to remote host: Connection refused {created_time:\"2023-10-24T11:48:51.016344787+00:00\", grpc_status:14}\"
>
", "thread_id": 140123215742720, "timestamp": "2023-10-24T11:48:51.017178"}

Platform

  • quay.io/opendatahub/text-generation-inference@sha256:0e3d00961fed95a8f8b12ed7ce50305acbbfe37ee33d37e81ba9e7ed71c73b69
  • quay.io/opendatahub/caikit-tgis-serving@sha256:adb8d1153b900e304fbcc934189c68cffea035d4b82848446c72c3d5554ee0ca

Sample Code

caikit_tgit_config.yaml.log
inference_service.yaml.log
serving_runtime.yaml.log

@dtrifiro

This comment was marked as resolved.

@kpouget

This comment was marked as outdated.

@dtrifiro dtrifiro self-assigned this Nov 28, 2023
@dtrifiro dtrifiro transferred this issue from opendatahub-io/caikit-tgis-backend Nov 28, 2023
dtrifiro (Contributor) commented:

This will be fixed when #156 is closed
