Hi,
I'm trying to load some models when the SageMaker endpoint server starts, so that they are already available when prediction requests arrive and the first request does not have to wait for the loading step.
I've configured MMS with the following parameters, according to the MMS documentation:
model_store = '/'
default_workers_per_model = 1
preload_model = 'true'
load_models = .. # the container local path where i store the model.
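As a rough illustration, the settings above can be rendered as `key=value` lines in Java properties style. This is only a sketch: `render_mms_properties` is a hypothetical helper that is not part of MMS, and the `load_models` value is left to the caller because the real path is not shown here.

```python
def render_mms_properties(load_models: str) -> str:
    """Render the four settings from this issue as key=value lines."""
    settings = {
        "model_store": "/",
        "default_workers_per_model": 1,
        "preload_model": "true",
        # Assumption: the caller supplies the container-local model path.
        "load_models": load_models,
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


if __name__ == "__main__":
    # The placeholder below is deliberately not a real path.
    print(render_mms_properties("<container-local-model-path>"), end="")
```

Printing the rendered text at container startup makes it easier to compare against the configuration that the server reports in its own logs.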
The model is a decompressed tar.gz archive generated by the SageMaker training process, plus a MAR-INF directory containing a MANIFEST.json file with the model_name information.
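A quick way to rule out a packaging problem is to check that the unpacked model directory really contains `MAR-INF/MANIFEST.json` and that the file parses as JSON. The sketch below makes no assumption about the manifest's fields; `read_manifest` is a hypothetical helper name, and the directory is passed in by the caller.

```python
import json
from pathlib import Path


def read_manifest(model_dir: str) -> dict:
    """Load MAR-INF/MANIFEST.json from an unpacked model directory."""
    manifest_path = Path(model_dir) / "MAR-INF" / "MANIFEST.json"
    if not manifest_path.is_file():
        raise FileNotFoundError(f"missing {manifest_path}")
    # json.load raises json.JSONDecodeError if the file is not valid JSON.
    with manifest_path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

If this raises, the layout is the first thing to fix. If it succeeds, the manifest is at least readable, although that alone does not explain the scale-down.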
From the CloudWatch logs I can see that the model is loaded correctly on a worker thread, which then stops immediately after a scale-down call.
Some screenshots of the logs follow.
The configuration:
The load-scale down:
I don't see any errors in the logs. What's going on? Is this a bug?
Best regards.