ServingRuntime:
torchserve

Current behavior
We send requests with client-side timeouts to put load on our ModelMesh deployment. After some time, the client starts to receive:
ERROR:
Code: Internal
Message: org.pytorch.serve.grpc.inference.InferenceAPIsService/Predictions: INTERNAL: Model "test-search-vectorizer__isvc-409d074e44" has no worker to serve inference request. Please use scale workers API to add workers.
If we then stop the load and send only a single request without a timeout, we get the same error, or the request hangs.
In my opinion, some resources are not released when a client timeout fires, which eventually exhausts the workers; if we turn client timeouts off, there are no errors.
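For reference, this is roughly how the load was generated. A minimal sketch, assuming Python stubs generated from TorchServe's inference.proto, a ModelMesh gRPC endpoint on localhost:8033, and an mm-vmodel-id header; the host, port, header value, and payload are illustrative only:

```python
import grpc
import inference_pb2, inference_pb2_grpc  # generated from TorchServe's inference.proto

channel = grpc.insecure_channel("localhost:8033")  # assumed ModelMesh endpoint
stub = inference_pb2_grpc.InferenceAPIsServiceStub(channel)

request = inference_pb2.PredictionsRequest(
    model_name="test-search-vectorizer__isvc-409d074e44",
    input={"data": b'{"text": "example query"}'},  # illustrative payload
)

try:
    # Client-side deadline: the call is cancelled on the client after 100 ms,
    # but the TorchServe worker may still be processing the request server-side.
    response = stub.Predictions(
        request,
        timeout=0.1,
        metadata=(("mm-vmodel-id", "test-search-vectorizer"),),  # assumed header value
    )
except grpc.RpcError as e:
    # Under load: DEADLINE_EXCEEDED; after a while: INTERNAL "has no worker to serve..."
    print(e.code(), e.details())
```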
Expected behavior
Requests with client timeouts should not exhaust the workers or cause a single request to ModelMesh to hang.
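The "scale workers API" mentioned in the error is TorchServe's management API. A hedged sketch of checking and scaling workers for the affected model, assuming the management port 8081 can be reached (e.g. via a port-forward into the runtime pod); ModelMesh does not normally expose it:

```python
import requests

base = "http://localhost:8081"  # assumed reachable TorchServe management endpoint
model = "test-search-vectorizer__isvc-409d074e44"  # internal model name from the error

# Inspect the current worker status for the model
print(requests.get(f"{base}/models/{model}").json())

# Ask TorchServe to keep at least one worker alive for the model
requests.put(f"{base}/models/{model}", params={"min_worker": 1, "synchronous": "true"})
```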
Log Info
message
org.pytorch.serve.grpc.inference.InferenceAPIsService/Predictions: INTERNAL: Model "test-search-vectorizer__isvc-409d074e44" has no worker to serve inference request. Please use scale workers API to add workers.
InternalServerException.()
trace
ApplierException(message:org.pytorch.serve.grpc.inference.InferenceAPIsService/Predictions: INTERNAL: Model "test-search-vectorizer__isvc-409d074e44" has no worker to serve inference request. Please use scale workers API to add workers.
InternalServerException.(), causeStacktrace:io.grpc.StatusException: INTERNAL: Model "test-search-vectorizer__isvc-409d074e44" has no worker to serve inference request. Please use scale workers API to add workers.
InternalServerException.()
at com.ibm.watson.modelmesh.SidecarModelMesh$ExternalModel$1.onClose(SidecarModelMesh.java:450)
, grpcStatusCode:INTERNAL)
at com.ibm.watson.modelmesh.thrift.ModelMeshService$applyModelMulti_result$applyModelMulti_resultStandardScheme.read(ModelMeshService.java:1940)
at com.ibm.watson.modelmesh.thrift.ModelMeshService$applyModelMulti_result$applyModelMulti_resultStandardScheme.read(ModelMeshService.java:1905)
at com.ibm.watson.modelmesh.thrift.ModelMeshService$applyModelMulti_result.read(ModelMeshService.java:1815)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:93)
at com.ibm.watson.modelmesh.thrift.ModelMeshService$Client.recv_applyModelMulti(ModelMeshService.java:74)
at com.ibm.watson.modelmesh.thrift.ModelMeshService$Client.applyModelMulti(ModelMeshService.java:59)
at jdk.internal.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at com.ibm.watson.litelinks.client.ClientInvocationHandler.invoke(ClientInvocationHandler.java:496)
at com.ibm.watson.litelinks.client.ClientInvocationHandler.invokeWithRetries(ClientInvocationHandler.java:383)
at com.ibm.watson.litelinks.client.ClientInvocationHandler.doInvoke(ClientInvocationHandler.java:184)
at com.ibm.watson.litelinks.client.ClientInvocationHandler.invoke(ClientInvocationHandler.java:118)
at jdk.proxy2/jdk.proxy2.$Proxy30.applyModelMulti(Unknown Source)
at jdk.internal.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at com.ibm.watson.modelmesh.SidecarModelMesh.invokeRemoteModel(SidecarModelMesh.java:1071)
at com.ibm.watson.modelmesh.ModelMesh.invokeRemote(ModelMesh.java:4399)
at com.ibm.watson.modelmesh.ModelMesh.invokeModel(ModelMesh.java:3644)
at com.ibm.watson.modelmesh.SidecarModelMesh.callModel(SidecarModelMesh.java:1106)
at com.ibm.watson.modelmesh.ModelMeshApi.callModel(ModelMeshApi.java:457)
at com.ibm.watson.modelmesh.ModelMeshApi$4.onHalfClose(ModelMeshApi.java:733)
at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:355)
at io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:867)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:833)
at com.ibm.watson.litelinks.server.ServerRequestThread.run(ServerRequestThread.java:47)
@fsatka -- it's been a long time since you opened this issue and several updates have been made since. Can you still reproduce this bug in the latest version of ModelMesh-Serving v0.11.2?