GitHub - Architectshwet/Deployed-multiple-Transformers-models-using-Amazon-SageMaker-Multi-Model-Endpoints

Deployed multiple Transformers models using Amazon SageMaker Multi-Model Endpoints

With Amazon SageMaker multi-model endpoints, customers can create an endpoint that seamlessly hosts up to thousands of models. These endpoints are well suited to use cases where any one of many models, which can be served from a common inference container, needs to be callable on-demand and where it is acceptable for infrequently invoked models to incur some additional latency.
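As a minimal sketch of the on-demand invocation described above: a request to a multi-model endpoint names the artifact to serve through the `TargetModel` parameter of the SageMaker runtime `invoke_endpoint` call. The endpoint name, artifact name, helper functions, and the `{"inputs": ...}` payload shape below are assumptions for illustration and are not taken from this repository.

```python
import json


def build_invoke_request(endpoint_name, target_model, text):
    """Build keyword arguments for the SageMaker runtime invoke_endpoint call.

    target_model is the artifact path relative to the endpoint's S3 model prefix.
    """
    return {
        "EndpointName": endpoint_name,
        "TargetModel": target_model,
        "ContentType": "application/json",
        # Assumed payload shape; adjust to whatever the inference script expects.
        "Body": json.dumps({"inputs": text}),
    }


def invoke(endpoint_name, target_model, text):
    """Send the request; requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the request builder works without AWS access

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        **build_invoke_request(endpoint_name, target_model, text)
    )
    return json.loads(response["Body"].read())


# Hypothetical names, for illustration only.
request = build_invoke_request("my-mme-endpoint", "model-a.tar.gz", "I love this!")
print(request["TargetModel"])
```

The first request for a given `TargetModel` has to load that model into the container, which is where the additional latency for infrequently invoked models comes from.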

This project covers the following steps:

Development Environment and Permissions
Retrieve Model Artifacts
Write the Inference Script
Package Models
Upload multiple Hugging Face models to S3
Create Multi-Model Endpoint
Get Predictions
Dynamically Deploy Models and Update a Model on the Endpoint
Delete the Multi-Model Endpoint
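The packaging step above can be sketched with the standard library alone. This is an illustrative sketch, not the notebook's actual code: it assumes each model lives in its own local directory and that the inference script sits in a `code/` subfolder of that directory. The function names and the directory layout are hypothetical.

```python
import tarfile
from pathlib import Path


def package_model(model_dir, output_path):
    """Write every file under model_dir into a gzip-compressed tar archive.

    Files are stored relative to model_dir so they sit at the archive root.
    output_path should be outside model_dir so the archive does not include itself.
    """
    model_dir = Path(model_dir)
    with tarfile.open(output_path, "w:gz") as archive:
        for path in sorted(model_dir.rglob("*")):
            if path.is_file():
                arcname = path.relative_to(model_dir).as_posix()
                archive.add(str(path), arcname=arcname)
    return output_path


def list_archive(archive_path):
    """Return the sorted member names of an archive, for a quick sanity check."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return sorted(archive.getnames())
```

Each resulting archive would then be uploaded under a common S3 prefix, and the file name under that prefix is what a request passes to select the model.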

Please refer to the Medium article for detailed information.

