Commit
Merge pull request #1531 from kaustavbecs/kaustavbecs-feature-apigw-l…
…ambda-sagemaker-jumpstartendpoint-cdk-python
Showing 12 changed files with 613 additions and 0 deletions.
10 changes: 10 additions & 0 deletions
apigw-lambda-sagemaker-jumpstartendpoint-cdk-python/.gitignore
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
*.swp | ||
package-lock.json | ||
__pycache__ | ||
.pytest_cache | ||
.venv | ||
*.egg-info | ||
|
||
# CDK asset staging directory | ||
.cdk.staging | ||
cdk.out |
105 changes: 105 additions & 0 deletions
apigw-lambda-sagemaker-jumpstartendpoint-cdk-python/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,105 @@ | ||
# Accessing Amazon SageMaker Endpoint via API Gateway and Lambda | ||
|
||
This pattern deploys a SageMaker JumpStart model (Flan T5 XL) endpoint. It also adds a Lambda function and an API Gateway integration to query the endpoint. | ||
|
||
Learn more about this pattern at Serverless Land Patterns: https://serverlessland.com/patterns/apigw-lambda-sagemaker-jumpstartendpoint-cdk-python | ||
|
||
Important: this application uses various AWS services and there are costs associated with these services after the Free Tier usage - please see the [AWS Pricing page](https://aws.amazon.com/pricing/) for details. You are responsible for any AWS costs incurred. No warranty is implied in this example. | ||
|
||
## Requirements | ||
|
||
* [Create an AWS account](https://portal.aws.amazon.com/gp/aws/developer/registration/index.html) if you do not already have one and log in. The IAM user that you use must have sufficient permissions to make necessary AWS service calls and manage AWS resources. | ||
* [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) installed and configured | ||
* [Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git) installed | ||
* [Python, pip, virtualenv](https://docs.aws.amazon.com/cdk/latest/guide/work-with-cdk-python.html) installed | ||
* [AWS Cloud Development Kit](https://docs.aws.amazon.com/cdk/v2/guide/getting_started.html) (AWS CDK >= 2.2.0) installed | ||
|
||
## Deployment Instructions | ||
|
||
1. Clone the project to your local working directory | ||
|
||
```sh | ||
git clone https://github.com/aws-samples/serverless-patterns | ||
``` | ||
|
||
2. Change to the pattern directory: | ||
```sh | ||
cd serverless-patterns/apigw-lambda-sagemaker-jumpstartendpoint-cdk-python | ||
``` | ||
3. Create and activate the project's virtual environment. This allows the project's dependencies to be installed locally in the project folder, instead of globally. Note that if you have multiple versions of Python installed, where the `python` command references Python 2.x, then you can reference Python 3.x by using the `python3` command. You can check which version of Python is being referenced by running the command `python --version` or `python3 --version` | ||
|
||
```sh | ||
python3 -m venv .venv | ||
source .venv/bin/activate | ||
``` | ||
|
||
4. Install the project dependencies | ||
|
||
```sh | ||
python -m pip install -r requirements.txt | ||
``` | ||
|
||
5. Deploy the stack to your default AWS account and region. | ||
|
||
```sh | ||
cdk deploy | ||
``` | ||
|
||
6. The default instance count for inference is set to 1. The instance count can be changed by passing the instance_count_param. | ||
|
||
```sh | ||
cdk deploy --context instance_count_param=2 | ||
``` | ||
|
||
|
||
## How it works | ||
|
||
1. This pattern deploys a SageMaker JumpStart model (Flan T5 XL from HuggingFace) endpoint using Amazon SageMaker. The model can be changed by modifying the ```MODEL_ID``` attribute in the app.py file. | ||
|
||
2. The pattern also adds a Lambda function and an API Gateway to query the endpoint. | ||
|
||
3. The API Gateway is protected using an API key. To query the API Gateway, the ```x-api-key``` header needs to be added to the HTTP request. | ||
|
||
|
||
## Testing | ||
|
||
1. Retrieve the Host URL of the API Gateway from AWS Console. | ||
|
||
2. Retrieve the API key from AWS Console. | ||
|
||
3. Send a sample HTTP request to the API Gateway: | ||
``` | ||
POST /prod/generateimage HTTP/1.1 | ||
Host: <Host URL of the API Gateway> | ||
x-api-key: <Retrieve the key from AWS Console> | ||
Content-Type: application/json | ||
Cache-Control: no-cache | ||
{ | ||
"query": { | ||
"text_inputs": "A step by step recipe to make butter chicken:", | ||
"max_length": 5000 | ||
} | ||
} | ||
``` | ||
|
||
4. The API Gateway should respond with a JSON-formatted response from the SageMaker endpoint. | ||
|
||
|
||
## Cleanup | ||
|
||
Run the following command to delete the resources that were created. Deleting the CloudFormation stack might take some time. | ||
|
||
```sh | ||
cdk destroy | ||
``` | ||
|
||
## Author bio | ||
Kaustav Dey, | ||
https://www.linkedin.com/in/kaustavbecs/ | ||
Solution Architect | ||
|
||
---- | ||
Copyright 2023 Amazon.com, Inc. or its affiliates. All Rights Reserved. | ||
|
||
SPDX-License-Identifier: MIT-0 |
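The sample request in the README's Testing section could be built from Python roughly as follows. This is a minimal sketch using only the standard library; the `build_request` helper and any host name passed to it are hypothetical, while the `/prod/generateimage` path, the `x-api-key` header, and the body shape are copied from the README.

```python
import json
import urllib.request


def build_request(host, api_key, prompt, max_length=5000):
    """Build (but do not send) the POST request described in the README.

    `host` is the API Gateway host name and `api_key` is the API key,
    both retrieved from the AWS Console as the README explains.
    """
    # Path taken verbatim from the README's sample request.
    url = f"https://{host}/prod/generateimage"
    # The Lambda handler reads the "query" field of the JSON body.
    body = json.dumps(
        {"query": {"text_inputs": prompt, "max_length": max_length}}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; the sketch stops short of that so the request can be inspected without a deployed stack.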
37 changes: 37 additions & 0 deletions
apigw-lambda-sagemaker-jumpstartendpoint-cdk-python/app.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
#!/usr/bin/env python3 | ||
import aws_cdk as cdk | ||
import boto3 | ||
from stack.ApigwLambdaSagemakerJumpstartendpointStack import ApigwLambdaSagemakerJumpstartendpointStack | ||
from util.sagemaker_util import get_sagemaker_uris | ||
|
||
region_name = boto3.Session().region_name | ||
env = {"region": region_name} | ||
|
||
|
||
# Obtain the model ID from: https://sagemaker.readthedocs.io/en/v2.173.0/doc_utils/pretrainedmodels.html | ||
# Here we are using Flan T5 XL Model | ||
MODEL_ID = "huggingface-text2text-flan-t5-xl" | ||
|
||
# Change the instance type to match your model. | ||
# For GPU Service Quota related issues, please raise a quota request from AWS console | ||
INFERENCE_INSTANCE_TYPE = "ml.g5.2xlarge" | ||
|
||
# Name of the stack: | ||
STACK_NAME = "apigw-lambda-sagemaker-jumpstartendpoint-stack" | ||
|
||
|
||
|
||
MODEL_INFO = get_sagemaker_uris(model_id=MODEL_ID, | ||
                                instance_type=INFERENCE_INSTANCE_TYPE, | ||
                                region_name=region_name) | ||
|
||
app = cdk.App() | ||
|
||
stack = ApigwLambdaSagemakerJumpstartendpointStack( | ||
    app, | ||
    STACK_NAME, | ||
    model_info=MODEL_INFO, | ||
    env=env | ||
) | ||
|
||
app.synth() |
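The README describes an `instance_count_param` context option, but the code that reads it is not part of the files shown here. As a sketch under that caveat: a value supplied with `--context` typically arrives as a string, so whichever code consumes it would need to convert and validate it. The `parse_instance_count` helper below is hypothetical; in a CDK app its input would usually come from `app.node.try_get_context("instance_count_param")`.

```python
def parse_instance_count(raw, default=1):
    """Convert a raw context value into a positive integer instance count.

    `raw` is None when the context key was not supplied, in which case
    the README's stated default of 1 is used.
    """
    if raw is None:
        return default
    try:
        count = int(raw)
    except (TypeError, ValueError):
        raise ValueError(
            f"instance_count_param must be an integer, got {raw!r}"
        ) from None
    if count < 1:
        raise ValueError(
            f"instance_count_param must be at least 1, got {count}"
        )
    return count
```

Failing at synthesis time with a clear message is one design choice; it avoids discovering a bad instance count only when the endpoint deployment fails.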
52 changes: 52 additions & 0 deletions
apigw-lambda-sagemaker-jumpstartendpoint-cdk-python/cdk.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
{ | ||
"app": "python3 app.py", | ||
"watch": { | ||
"include": [ | ||
"**" | ||
], | ||
"exclude": [ | ||
"README.md", | ||
"cdk*.json", | ||
"requirements*.txt", | ||
"source.bat", | ||
"**/__init__.py", | ||
"python/__pycache__", | ||
"tests" | ||
] | ||
}, | ||
"context": { | ||
"@aws-cdk/aws-lambda:recognizeLayerVersion": true, | ||
"@aws-cdk/core:checkSecretUsage": true, | ||
"@aws-cdk/core:target-partitions": [ | ||
"aws", | ||
"aws-cn" | ||
], | ||
"@aws-cdk-containers/ecs-service-extensions:enableDefaultLogDriver": true, | ||
"@aws-cdk/aws-ec2:uniqueImdsv2TemplateName": true, | ||
"@aws-cdk/aws-ecs:arnFormatIncludesClusterName": true, | ||
"@aws-cdk/aws-iam:minimizePolicies": true, | ||
"@aws-cdk/core:validateSnapshotRemovalPolicy": true, | ||
"@aws-cdk/aws-codepipeline:crossAccountKeyAliasStackSafeResourceName": true, | ||
"@aws-cdk/aws-s3:createDefaultLoggingPolicy": true, | ||
"@aws-cdk/aws-sns-subscriptions:restrictSqsDescryption": true, | ||
"@aws-cdk/aws-apigateway:disableCloudWatchRole": true, | ||
"@aws-cdk/core:enablePartitionLiterals": true, | ||
"@aws-cdk/aws-events:eventsTargetQueueSameAccount": true, | ||
"@aws-cdk/aws-iam:standardizedServicePrincipals": true, | ||
"@aws-cdk/aws-ecs:disableExplicitDeploymentControllerForCircuitBreaker": true, | ||
"@aws-cdk/aws-iam:importedRoleStackSafeDefaultPolicyName": true, | ||
"@aws-cdk/aws-s3:serverAccessLogsUseBucketPolicy": true, | ||
"@aws-cdk/aws-route53-patters:useCertificate": true, | ||
"@aws-cdk/customresources:installLatestAwsSdkDefault": false, | ||
"@aws-cdk/aws-rds:databaseProxyUniqueResourceName": true, | ||
"@aws-cdk/aws-codedeploy:removeAlarmsFromDeploymentGroup": true, | ||
"@aws-cdk/aws-apigateway:authorizerChangeDeploymentLogicalId": true, | ||
"@aws-cdk/aws-ec2:launchTemplateDefaultUserData": true, | ||
"@aws-cdk/aws-secretsmanager:useAttachedSecretResourcePolicyForSecretTargetAttachments": true, | ||
"@aws-cdk/aws-redshift:columnId": true, | ||
"@aws-cdk/aws-stepfunctions-tasks:enableEmrServicePolicyV2": true, | ||
"@aws-cdk/aws-ec2:restrictDefaultSecurityGroup": true, | ||
"@aws-cdk/aws-apigateway:requestValidatorUniqueId": true, | ||
"@aws-cdk/aws-kms:aliasNameRef": true | ||
} | ||
} |
58 changes: 58 additions & 0 deletions
apigw-lambda-sagemaker-jumpstartendpoint-cdk-python/example-pattern.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
{ | ||
"title": "Accessing Amazon SageMaker Endpoint via API Gateway and Lambda", | ||
"description": "This pattern deploys a SageMaker Jumpstart model (Flan T5 XL) endpoint. It also uses a Lambda function and API Gateway to integrate the endpoint", | ||
"language": "Python", | ||
"level": "300", | ||
"framework": "CDK", | ||
"introBox": { | ||
"headline": "How it works", | ||
"text": [ | ||
"This pattern deploys an Amazon SageMaker JumpStart model (Flan T5 XL from HuggingFace) endpoint using Amazon SageMaker. The model can be changed by modifying the MODEL_ID attribute in the app.py file.", | ||
"The pattern also adds a Lambda function with API Gateway integration to query the endpoint.", | ||
"The API Gateway is protected using an API key. To query the API Gateway endpoint, the x-api-key header needs to be added to the HTTP request." | ||
] | ||
}, | ||
"gitHub": { | ||
"template": { | ||
"repoURL": "https://github.com/aws-samples/serverless-patterns/tree/main/apigw-lambda-sagemaker-jumpstartendpoint-cdk-python", | ||
"templateURL": "serverless-patterns/apigw-lambda-sagemaker-jumpstartendpoint-cdk-python", | ||
"projectFolder": "apigw-lambda-sagemaker-jumpstartendpoint-cdk-python", | ||
"templateFile": "stack/ApigwLambdaSagemakerJumpstartendpointStack.py" | ||
} | ||
}, | ||
"resources": { | ||
"bullets": [ | ||
{ | ||
"text": "Zero-shot prompting for the Flan-T5 foundation model in Amazon SageMaker JumpStart", | ||
"link": "https://aws.amazon.com/blogs/machine-learning/zero-shot-prompting-for-the-flan-t5-foundation-model-in-amazon-sagemaker-jumpstart/" | ||
}, | ||
{ | ||
"text": "Instruction fine-tuning for FLAN T5 XL with Amazon SageMaker Jumpstart", | ||
"link": "https://aws.amazon.com/blogs/machine-learning/instruction-fine-tuning-for-flan-t5-xl-with-amazon-sagemaker-jumpstart/" | ||
} | ||
] | ||
}, | ||
"deploy": { | ||
"text": [ | ||
"cdk deploy" | ||
] | ||
}, | ||
"testing": { | ||
"text": [ | ||
"See the GitHub repo for detailed testing instructions." | ||
] | ||
}, | ||
"cleanup": { | ||
"text": [ | ||
"Delete the stack: <code>cdk destroy</code>." | ||
] | ||
}, | ||
"authors": [ | ||
{ | ||
"name": "Kaustav Dey", | ||
"image": "https://avatars.githubusercontent.com/u/13236519", | ||
"bio": "Solution Architect at AWS", | ||
"linkedin": "https://www.linkedin.com/in/kaustavbecs/" | ||
} | ||
] | ||
} |
35 changes: 35 additions & 0 deletions
apigw-lambda-sagemaker-jumpstartendpoint-cdk-python/lambda/InvokeSagemakerEndpointLambda.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
import json | ||
import os | ||
import boto3 | ||
|
||
|
||
def lambda_handler(event, context): | ||
|
||
    # Extract the input data from the request body | ||
    input_payload = json.loads(event['body'])['query'] | ||
|
||
    # Fetch the endpoint name from the environment variable | ||
    endpoint_name = os.environ['SAGEMAKER_ENDPOINT_NAME'] | ||
|
||
    # Create a SageMaker runtime client | ||
    runtime = boto3.client('sagemaker-runtime') | ||
|
||
    try: | ||
        # Query the SageMaker endpoint | ||
        payload = json.dumps(input_payload) | ||
        response = runtime.invoke_endpoint(EndpointName=endpoint_name, | ||
                                           ContentType='application/json', | ||
                                           Body=payload) | ||
        query_response = response['Body'].read().decode('utf-8') | ||
|
||
        # Return the result as the Lambda function response | ||
        return { | ||
            'statusCode': 200, | ||
            'body': query_response | ||
        } | ||
    except Exception as e: | ||
        # Handle any errors that occur during the invocation | ||
        return { | ||
            'statusCode': 500, | ||
            'body': str(e) | ||
        } |
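The handler above parses `event['body']` before entering the `try` block, so a missing or malformed body raises an unhandled exception rather than returning a controlled response. One possible way to guard against that is sketched below with a hypothetical `extract_query` helper that is not part of this commit.

```python
import json


def _bad_request(message):
    """Build a 400 response in the same shape the handler returns."""
    return {"statusCode": 400, "body": json.dumps({"error": message})}


def extract_query(event):
    """Return (query, None) on success or (None, error_response) on bad input."""
    try:
        # A missing body is treated like an empty string, which is not valid JSON.
        body = json.loads(event.get("body") or "")
    except (TypeError, ValueError):
        return None, _bad_request("Request body must be valid JSON")
    if not isinstance(body, dict) or "query" not in body:
        return None, _bad_request("Request body must contain a 'query' field")
    return body["query"], None
```

A handler could call `extract_query(event)` first and return the error response immediately when it is not `None`, leaving the 500 path for genuine invocation failures.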
5 changes: 5 additions & 0 deletions
apigw-lambda-sagemaker-jumpstartendpoint-cdk-python/requirements.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
aws-cdk-lib==2.83.1 | ||
constructs>=10.0.0,<11.0.0 | ||
requests | ||
boto3 | ||
sagemaker |
13 changes: 13 additions & 0 deletions
apigw-lambda-sagemaker-jumpstartendpoint-cdk-python/source.bat
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
@echo off | ||
|
||
rem The sole purpose of this script is to make the command | ||
rem | ||
rem source .venv/bin/activate | ||
rem | ||
rem (which activates a Python virtualenv on Linux or Mac OS X) work on Windows. | ||
rem On Windows, this command just runs this batch file (the argument is ignored). | ||
rem | ||
rem Now we don't need to document a Windows command for activating a virtualenv. | ||
|
||
echo Executing .venv\Scripts\activate.bat for you | ||
.venv\Scripts\activate.bat |