New serverless pattern - Accessing AWS Sagemaker Endpoint via API Gateway and Lambda #1531

Merged
10 changes: 10 additions & 0 deletions apigw-lambda-sagemaker-jumpstartendpoint-cdk-python/.gitignore
@@ -0,0 +1,10 @@
*.swp
package-lock.json
__pycache__
.pytest_cache
.venv
*.egg-info

# CDK asset staging directory
.cdk.staging
cdk.out
105 changes: 105 additions & 0 deletions apigw-lambda-sagemaker-jumpstartendpoint-cdk-python/README.md
@@ -0,0 +1,105 @@
# Accessing Amazon SageMaker Endpoint via API Gateway and Lambda

This pattern deploys a SageMaker JumpStart model (Flan T5 XL) endpoint. It also adds a Lambda function and an API Gateway integration to query the endpoint.

Learn more about this pattern at Serverless Land Patterns: https://serverlessland.com/patterns/apigw-lambda-sagemaker-jumpstartendpoint-cdk-python

Important: this application uses various AWS services and there are costs associated with these services after the Free Tier usage - please see the [AWS Pricing page](https://aws.amazon.com/pricing/) for details. You are responsible for any AWS costs incurred. No warranty is implied in this example.

## Requirements

* [Create an AWS account](https://portal.aws.amazon.com/gp/aws/developer/registration/index.html) if you do not already have one and log in. The IAM user that you use must have sufficient permissions to make necessary AWS service calls and manage AWS resources.
* [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) installed and configured
* [Git Installed](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
* [Python, pip, virtualenv](https://docs.aws.amazon.com/cdk/latest/guide/work-with-cdk-python.html) installed
* [AWS Cloud Development Kit](https://docs.aws.amazon.com/cdk/v2/guide/getting_started.html) (AWS CDK >= 2.2.0) Installed

## Deployment Instructions

1. Clone the project to your local working directory

```sh
git clone https://github.com/aws-samples/serverless-patterns
```

2. Change to the pattern directory:
```sh
cd serverless-patterns/apigw-lambda-sagemaker-jumpstartendpoint-cdk-python
```
3. Create and activate the project's virtual environment. This allows the project's dependencies to be installed locally in the project folder instead of globally. If you have multiple versions of Python installed and the `python` command references Python 2.x, you can reference Python 3.x by using the `python3` command. You can check which version of Python is being referenced by running `python --version` or `python3 --version`.

```sh
python3 -m venv .venv
source .venv/bin/activate
```

4. Install the project dependencies

```sh
python -m pip install -r requirements.txt
```

5. Deploy the stack to your default AWS account and region.

```sh
cdk deploy
```

6. The default instance count for inference is 1. To change it, pass the `instance_count_param` context value:

```sh
cdk deploy --context instance_count_param=2
```
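
Values passed with `--context` typically reach the CDK app as strings. The following is a minimal sketch of how such a value could be validated before it is used as an instance count. It assumes the app reads the value with `app.node.try_get_context("instance_count_param")`; the helper name `parse_instance_count` is hypothetical and is not part of this pattern.

```python
def parse_instance_count(raw, default=1):
    """Convert a raw CDK context value into a positive instance count.

    `raw` is whatever the app read from context, for example via
    app.node.try_get_context("instance_count_param") (assumed lookup).
    """
    if raw is None:
        # Context key not supplied: fall back to the documented default of 1.
        return default
    count = int(raw)  # raises ValueError for non-numeric input such as "two"
    if count < 1:
        raise ValueError(f"instance_count_param must be >= 1, got {count}")
    return count
```

With this helper, `parse_instance_count(None)` falls back to the default, and `parse_instance_count("2")` yields the integer passed on the command line.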


## How it works

1. This pattern deploys a SageMaker JumpStart model (Flan T5 XL from Hugging Face) as an Amazon SageMaker endpoint. The model can be changed by modifying the `MODEL_ID` constant in `app.py`.

2. The pattern also adds a Lambda function and an API Gateway to query the endpoint.

3. The API Gateway is protected using an API key. To query the API Gateway, the `x-api-key` header must be added to the HTTP request.


## Testing

1. Retrieve the host URL of the API Gateway from the AWS Console.

2. Retrieve the API key from the AWS Console.

3. Send a sample HTTP request to the API Gateway:
```
POST /prod/generateimage HTTP/1.1
Host: <Host URL of the API Gateway>
x-api-key: <Retrieve the key from AWS Console>
Content-Type: application/json
Cache-Control: no-cache

{
    "query": {
        "text_inputs": "A step by step recipe to make butter chicken:",
        "max_length": 5000
    }
}

```

4. The API Gateway should respond with a JSON-formatted response from the SageMaker endpoint.
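
The request above can also be sent from a script. The following is a minimal sketch using only the Python standard library. The function names `build_request` and `query_endpoint`, the default `max_length` of 50, and the 30-second timeout are illustrative assumptions; the path and header names are taken from the sample request.

```python
import json
import urllib.request


def build_request(host, api_key, text_inputs, max_length=50):
    """Build the POST request from the testing steps without sending it."""
    body = json.dumps(
        {"query": {"text_inputs": text_inputs, "max_length": max_length}}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{host}/prod/generateimage",
        data=body,
        method="POST",
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
    )


def query_endpoint(host, api_key, text_inputs, max_length=50, timeout=30):
    """Send the request and return the decoded JSON response."""
    request = build_request(host, api_key, text_inputs, max_length)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Call `query_endpoint` with the host name and API key retrieved in steps 1 and 2. Note that `urlopen` raises `urllib.error.HTTPError` for non-2xx responses, such as a request with a missing or invalid API key.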


## Cleanup

Run the given command to delete the resources that were created. It might take some time for the CloudFormation stack to get deleted.

```sh
cdk destroy
```

## Author bio
Kaustav Dey,
https://www.linkedin.com/in/kaustavbecs/
Solution Architect

----
Copyright 2023 Amazon.com, Inc. or its affiliates. All Rights Reserved.

SPDX-License-Identifier: MIT-0
37 changes: 37 additions & 0 deletions apigw-lambda-sagemaker-jumpstartendpoint-cdk-python/app.py
@@ -0,0 +1,37 @@
#!/usr/bin/env python3
import aws_cdk as cdk
import boto3
from stack.ApigwLambdaSagemakerJumpstartendpointStack import ApigwLambdaSagemakerJumpstartendpointStack
from util.sagemaker_util import get_sagemaker_uris

region_name = boto3.Session().region_name
env = {"region": region_name}


# Obtain the model ID from: https://sagemaker.readthedocs.io/en/v2.173.0/doc_utils/pretrainedmodels.html
# Here we are using Flan T5 XL Model
MODEL_ID = "huggingface-text2text-flan-t5-xl"

# Change the instance type to match your model.
# For GPU Service Quota related issues, please raise a quota request from AWS console
INFERENCE_INSTANCE_TYPE = "ml.g5.2xlarge"

# Name of the stack:
STACK_NAME = "apigw-lambda-sagemaker-jumpstartendpoint-stack"



MODEL_INFO = get_sagemaker_uris(model_id=MODEL_ID,
                                instance_type=INFERENCE_INSTANCE_TYPE,
                                region_name=region_name)

app = cdk.App()

stack = ApigwLambdaSagemakerJumpstartendpointStack(
    app,
    STACK_NAME,
    model_info=MODEL_INFO,
    env=env
)

app.synth()
52 changes: 52 additions & 0 deletions apigw-lambda-sagemaker-jumpstartendpoint-cdk-python/cdk.json
@@ -0,0 +1,52 @@
{
  "app": "python3 app.py",
  "watch": {
    "include": [
      "**"
    ],
    "exclude": [
      "README.md",
      "cdk*.json",
      "requirements*.txt",
      "source.bat",
      "**/__init__.py",
      "python/__pycache__",
      "tests"
    ]
  },
  "context": {
    "@aws-cdk/aws-lambda:recognizeLayerVersion": true,
    "@aws-cdk/core:checkSecretUsage": true,
    "@aws-cdk/core:target-partitions": [
      "aws",
      "aws-cn"
    ],
    "@aws-cdk-containers/ecs-service-extensions:enableDefaultLogDriver": true,
    "@aws-cdk/aws-ec2:uniqueImdsv2TemplateName": true,
    "@aws-cdk/aws-ecs:arnFormatIncludesClusterName": true,
    "@aws-cdk/aws-iam:minimizePolicies": true,
    "@aws-cdk/core:validateSnapshotRemovalPolicy": true,
    "@aws-cdk/aws-codepipeline:crossAccountKeyAliasStackSafeResourceName": true,
    "@aws-cdk/aws-s3:createDefaultLoggingPolicy": true,
    "@aws-cdk/aws-sns-subscriptions:restrictSqsDescryption": true,
    "@aws-cdk/aws-apigateway:disableCloudWatchRole": true,
    "@aws-cdk/core:enablePartitionLiterals": true,
    "@aws-cdk/aws-events:eventsTargetQueueSameAccount": true,
    "@aws-cdk/aws-iam:standardizedServicePrincipals": true,
    "@aws-cdk/aws-ecs:disableExplicitDeploymentControllerForCircuitBreaker": true,
    "@aws-cdk/aws-iam:importedRoleStackSafeDefaultPolicyName": true,
    "@aws-cdk/aws-s3:serverAccessLogsUseBucketPolicy": true,
    "@aws-cdk/aws-route53-patters:useCertificate": true,
    "@aws-cdk/customresources:installLatestAwsSdkDefault": false,
    "@aws-cdk/aws-rds:databaseProxyUniqueResourceName": true,
    "@aws-cdk/aws-codedeploy:removeAlarmsFromDeploymentGroup": true,
    "@aws-cdk/aws-apigateway:authorizerChangeDeploymentLogicalId": true,
    "@aws-cdk/aws-ec2:launchTemplateDefaultUserData": true,
    "@aws-cdk/aws-secretsmanager:useAttachedSecretResourcePolicyForSecretTargetAttachments": true,
    "@aws-cdk/aws-redshift:columnId": true,
    "@aws-cdk/aws-stepfunctions-tasks:enableEmrServicePolicyV2": true,
    "@aws-cdk/aws-ec2:restrictDefaultSecurityGroup": true,
    "@aws-cdk/aws-apigateway:requestValidatorUniqueId": true,
    "@aws-cdk/aws-kms:aliasNameRef": true
  }
}
@@ -0,0 +1,58 @@
{
  "title": "Accessing Amazon SageMaker Endpoint via API Gateway and Lambda",
  "description": "This pattern deploys a SageMaker JumpStart model (Flan T5 XL) endpoint. It also uses a Lambda function and API Gateway to query the endpoint.",
  "language": "Python",
  "level": "300",
  "framework": "CDK",
  "introBox": {
    "headline": "How it works",
    "text": [
      "This pattern deploys a SageMaker JumpStart model (Flan T5 XL from Hugging Face) as an Amazon SageMaker endpoint. The model can be changed by modifying the MODEL_ID constant in app.py.",
      "The pattern also adds a Lambda function with API Gateway integration to query the endpoint.",
      "The API Gateway is protected using an API key. To query the API Gateway endpoint, the x-api-key header must be added to the HTTP request."
    ]
  },
  "gitHub": {
    "template": {
      "repoURL": "https://github.com/aws-samples/serverless-patterns/tree/main/apigw-lambda-sagemaker-jumpstartendpoint-cdk-python",
      "templateURL": "serverless-patterns/apigw-lambda-sagemaker-jumpstartendpoint-cdk-python",
      "projectFolder": "apigw-lambda-sagemaker-jumpstartendpoint-cdk-python",
      "templateFile": "stack/ApigwLambdaSagemakerJumpstartendpointStack.py"
    }
  },
  "resources": {
    "bullets": [
      {
        "text": "Zero-shot prompting for the Flan-T5 foundation model in Amazon SageMaker JumpStart",
        "link": "https://aws.amazon.com/blogs/machine-learning/zero-shot-prompting-for-the-flan-t5-foundation-model-in-amazon-sagemaker-jumpstart/"
      },
      {
        "text": "Instruction fine-tuning for FLAN T5 XL with Amazon SageMaker Jumpstart",
        "link": "https://aws.amazon.com/blogs/machine-learning/instruction-fine-tuning-for-flan-t5-xl-with-amazon-sagemaker-jumpstart/"
      }
    ]
  },
  "deploy": {
    "text": [
      "cdk deploy"
    ]
  },
  "testing": {
    "text": [
      "See the GitHub repo for detailed testing instructions."
    ]
  },
  "cleanup": {
    "text": [
      "Delete the stack: <code>cdk destroy</code>."
    ]
  },
  "authors": [
    {
      "name": "Kaustav Dey",
      "image": "https://avatars.githubusercontent.com/u/13236519",
      "bio": "Solution Architect at AWS",
      "linkedin": "https://www.linkedin.com/in/kaustavbecs/"
    }
  ]
}
@@ -0,0 +1,35 @@
import json
import os
import boto3


def lambda_handler(event, context):

    # Extract the input data from the request body
    input_payload = json.loads(event['body'])['query']

    # Fetch the endpoint name from the environment variable
    endpoint_name = os.environ['SAGEMAKER_ENDPOINT_NAME']

    # Create a SageMaker runtime client
    runtime = boto3.client('sagemaker-runtime')

    try:
        # Query the SageMaker endpoint
        payload = json.dumps(input_payload)
        response = runtime.invoke_endpoint(EndpointName=endpoint_name,
                                           ContentType='application/json',
                                           Body=payload)
        query_response = response['Body'].read().decode('utf-8')

        # Return the result as the Lambda function response
        return {
            'statusCode': 200,
            'body': query_response
        }
    except Exception as e:
        # Handle any errors that occur during the invocation
        return {
            'statusCode': 500,
            'body': str(e)
        }
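
Review note: the handler above parses `event['body']` and reads the `query` key before entering the `try` block, so a missing body, invalid JSON, or a missing `query` key surfaces as an unhandled exception rather than a structured response. The following is a minimal sketch of one way to validate the input first, assuming the API Gateway Lambda proxy event shape. `extract_query` and `bad_request` are hypothetical helpers and are not part of this PR.

```python
import json


def bad_request(message):
    """Build a proxy-style 400 response carrying a JSON error message."""
    return {"statusCode": 400, "body": json.dumps({"error": message})}


def extract_query(event):
    """Return (query, None) on success or (None, error_response) on failure."""
    raw_body = event.get("body")
    if not raw_body:
        return None, bad_request("Request body is missing")
    try:
        parsed = json.loads(raw_body)
    except json.JSONDecodeError:
        return None, bad_request("Request body is not valid JSON")
    if not isinstance(parsed, dict) or "query" not in parsed:
        return None, bad_request("Request body must contain a 'query' object")
    return parsed["query"], None
```

A handler could call `extract_query(event)` first and return the error response immediately when one is present, leaving the 500 path for failures of the endpoint invocation itself.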
@@ -0,0 +1,5 @@
aws-cdk-lib==2.83.1
constructs>=10.0.0,<11.0.0
requests
boto3
sagemaker
13 changes: 13 additions & 0 deletions apigw-lambda-sagemaker-jumpstartendpoint-cdk-python/source.bat
@@ -0,0 +1,13 @@
@echo off

rem The sole purpose of this script is to make the command
rem
rem source .venv/bin/activate
rem
rem (which activates a Python virtualenv on Linux or Mac OS X) work on Windows.
rem On Windows, this command just runs this batch file (the argument is ignored).
rem
rem Now we don't need to document a Windows command for activating a virtualenv.

echo Executing .venv\Scripts\activate.bat for you
.venv\Scripts\activate.bat