AWS Reference Implementation #24

Open
wants to merge 63 commits into
base: main
2071aec
Creating GCP folder and gitignore, also testing first test script
lotif Jun 3, 2024
68c2fef
Splitting into steps, adding a machine too, needs a python api
lotif Jun 4, 2024
48d9929
Making docker with sample python app
lotif Jun 4, 2024
bdf8787
Almost working
lotif Jun 4, 2024
eaf5ecf
Actually working now using a different tactic
lotif Jun 5, 2024
9805c6d
Finished step 1
lotif Jun 6, 2024
eb9e2b2
moving script to ml-api
lotif Jun 6, 2024
7258680
Finished step 2
lotif Jun 11, 2024
20338bc
Working on GCP
lotif Jun 11, 2024
7e59d81
Better code
lotif Jun 12, 2024
8d0248e
Even better code
lotif Jun 12, 2024
865927c
Fixing small bug
lotif Jun 12, 2024
2976ef3
Updating gitignore
lotif Jun 14, 2024
eebb770
Step 3 working
lotif Jun 14, 2024
50cc292
Fixes
lotif Jun 14, 2024
ba748e2
Fixing VM permissions
lotif Jun 17, 2024
80f005b
Adding autoscaling config
lotif Jun 17, 2024
a3a47e7
Better gitignore
lotif Jun 17, 2024
a2c821a
WIP feature store
lotif Jun 19, 2024
df3c2b6
ML API pulling data from the feature store
lotif Jun 19, 2024
a6b77d3
Updating requirements
lotif Jun 19, 2024
149b39c
Adding step5
lotif Jun 19, 2024
80fbc76
Script to upload a new model version
lotif Jun 19, 2024
6b27216
Adding the undeploy script
lotif Jun 19, 2024
ac27a71
WIP deploy script
lotif Jun 20, 2024
15e3670
Working script to update new model version
lotif Jun 20, 2024
7bb68c5
Setting up model monitoring
lotif Jun 24, 2024
dde324f
Small typo on readme
lotif Jul 10, 2024
ff85a5d
Removing outdated code
lotif Jul 10, 2024
f2e2c03
Step1 to step3 for AWS
rjavadi Aug 2, 2024
44317ed
Tidy up variables
rjavadi Aug 7, 2024
8c864bb
Added support for deployment on Inferentia chips
rjavadi Aug 21, 2024
99ac08e
Upload and deploy model to AWS Inferentia
rjavadi Aug 23, 2024
3b73abe
Add terraform support to deploy endpoint
rjavadi Aug 30, 2024
d41ce9a
Fixed model errors, change pytorch ecr image
rjavadi Sep 4, 2024
a3b2af5
Update gitignore
rjavadi Sep 4, 2024
bca2a04
Added support for lambda and api gateway
rjavadi Sep 4, 2024
1bbd259
update inference code to return logits
rjavadi Sep 4, 2024
58fca3b
added my user info
rjavadi Sep 9, 2024
17be235
Merge branch 'main' into aws-impl
rjavadi Sep 9, 2024
d82bcb2
Manually add feature store (step 4)
rjavadi Sep 12, 2024
b239bf5
Add Redshift
rjavadi Sep 12, 2024
7731733
Lambda function uses feature store
rjavadi Sep 14, 2024
9342edf
Fix Lambda feature store bug
rjavadi Sep 23, 2024
f1ddb1c
Add initial SQS send/receive function
rjavadi Oct 3, 2024
56a806c
Update step4 readme for feature store
rjavadi Oct 3, 2024
e956804
Merge main
rjavadi Oct 8, 2024
1b94f93
Fix APIGW permissions
rjavadi Oct 22, 2024
d7a8268
Update readmes
rjavadi Oct 22, 2024
6a32478
Merge branch 'main' into aws-impl
rjavadi Oct 22, 2024
57ec334
offline inference - writing to redshift db
rjavadi Oct 30, 2024
963193d
Fix SQL errors
rjavadi Nov 4, 2024
d16c24f
Get id from args
rjavadi Nov 4, 2024
9b88d77
Fix inference code
rjavadi Nov 8, 2024
0945f67
Fixed some bugs and added metric monitoring
rjavadi Nov 8, 2024
7632aa4
Clean up
rjavadi Nov 8, 2024
d9948b3
clean up
rjavadi Nov 8, 2024
e2a8016
Undo change to tfvars
rjavadi Nov 8, 2024
7b2635e
Rename resources starting with my_*
rjavadi Dec 3, 2024
a66dc14
Resolve formatting issues with Ruff
rjavadi Dec 3, 2024
3a4de44
Fix formatting
rjavadi Dec 3, 2024
da18d52
Add architecture diagrams and update readme
rjavadi Dec 4, 2024
bca6dc6
Add prefix to resource name
rjavadi Dec 10, 2024
9 changes: 9 additions & 0 deletions .gitignore
@@ -7,6 +7,15 @@
*.terraform/
*terraform.tfstate*
*.terraform.tfstate.lock.info
*terraform.tfstate.backup

*bart-large-mnli/
*ml-api.zip
/.idea

*venv/
*paraphrase-bert/

__pycache__/
*.zip
4 changes: 4 additions & 0 deletions reference_implementations/aws/offline/01_provider.tf
@@ -0,0 +1,4 @@
provider "aws" {
  region  = var.region
  profile = var.default_profile
}
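The files in this diff reference `var.region`, `var.default_profile`, `local.prefix`, `local.aws_account_id` and `local.common_tags` without declaring them here, so they are presumably defined elsewhere in the PR. As a reading aid, a minimal sketch of what those declarations could look like follows. The defaults, the `project_prefix` variable and the tag keys are assumptions for illustration, not the PR's actual values.

```hcl
# Hypothetical declarations; only the referenced names come from the PR.
variable "region" {
  description = "AWS region to deploy into"
  type        = string
  default     = "us-east-1" # assumed default
}

variable "default_profile" {
  description = "Named profile from the local AWS credentials file"
  type        = string
  default     = "default" # assumed default
}

variable "project_prefix" { # hypothetical variable, not in the PR
  description = "Prefix added to every resource name"
  type        = string
  default     = "demo"
}

# Looks up the account ID of the credentials Terraform runs with.
data "aws_caller_identity" "current" {}

locals {
  prefix         = var.project_prefix
  aws_account_id = data.aws_caller_identity.current.account_id

  common_tags = {
    Project   = var.project_prefix
    ManagedBy = "terraform"
  }
}
```

Deriving the account ID from `aws_caller_identity` avoids committing it to the repository.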
140 changes: 140 additions & 0 deletions reference_implementations/aws/offline/02_sagemaker_execution_roles.tf
@@ -0,0 +1,140 @@
resource "aws_iam_role" "sagemaker_execution_role" {
  name = "${local.prefix}-SagemakerModelExecutionRole"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "sagemaker.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_policy" "sagemaker_execution_role_policy" {
  name        = "${local.prefix}-sagemaker-execution-role-policy"
  description = "Policy for SageMaker model"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      { # models and endpoint access
        Action = [
          "sagemaker:CreateModel",
          "sagemaker:CreateEndpointConfig",
          "sagemaker:CreateEndpoint",
          "sagemaker:DeleteEndpoint",
          "sagemaker:InvokeEndpoint",
          "sagemaker:UpdateEndpoint",
          "sagemaker:DeleteEndpointConfig",
          "sagemaker:DeleteModel",
          "sagemaker:DescribeEndpoint",
          "sagemaker:DescribeEndpointConfig",
          "sagemaker:DescribeModel",
          "sagemaker:AddTags"
        ]
        Effect = "Allow"
        Resource = [
          "arn:aws:sagemaker:${var.region}:${local.aws_account_id}:endpoint-config/*",
          "arn:aws:sagemaker:${var.region}:${local.aws_account_id}:model/*",
          "arn:aws:sagemaker:${var.region}:${local.aws_account_id}:endpoint/*",
          "arn:aws:sagemaker:${var.region}:${local.aws_account_id}:app/*"
        ]
      },
      { # S3 access for model artifacts and Terraform state
        Action = [
          "s3:GetObject",
          "s3:PutObject",
          "s3:ListBucket",
          "s3:DeleteObject"
        ]
        Effect = "Allow"
        # TODO: give permission to access S3 buckets - replace with your bucket names
        Resource = [
          "arn:aws:s3:::sagemaker-endpoint-deploy-tf-state-vector",
          "arn:aws:s3:::sagemaker-endpoint-deploy-tf-state-vector/*",
          "arn:aws:s3:::sagemaker-us-east-1-025066243062",
          "arn:aws:s3:::sagemaker-us-east-1-025066243062/*"
        ]
      },
      { # reading/writing logs
        Action = [
          "logs:CreateLogDelivery",
          "logs:CreateLogGroup",
          "logs:CreateLogStream",
          "logs:DeleteLogDelivery",
          "logs:Describe*",
          "logs:GetLogDelivery",
          "logs:GetLogEvents",
          "logs:ListLogDeliveries",
          "logs:PutLogEvents",
          "logs:DescribeLogStreams",
          "logs:DescribeLogGroups",
          "logs:PutResourcePolicy",
          "logs:UpdateLogDelivery",
          "logs:FilterLogEvents"
        ]
        Effect   = "Allow"
        Resource = "*"
      },
      { # cloud watch
        Action = [
          "cloudwatch:DeleteAlarms",
          "cloudwatch:DescribeAlarms",
          "cloudwatch:GetMetricData",
          "cloudwatch:GetMetricStatistics",
          "cloudwatch:ListMetrics",
          "cloudwatch:PutMetricAlarm",
          "cloudwatch:PutMetricData"
        ]
        Effect   = "Allow"
        Resource = "*"
      },
      { # ECR login; this action cannot be scoped to a single repository
        Action   = "ecr:GetAuthorizationToken"
        Effect   = "Allow"
        Resource = "*"
      },
      { # pulling images from the AWS Deep Learning Containers account
        Action = [
          "ecr:BatchCheckLayerAvailability",
          "ecr:GetDownloadUrlForLayer",
          "ecr:GetRepositoryPolicy",
          "ecr:DescribeRepositories",
          "ecr:ListImages",
          "ecr:DescribeImages",
          "ecr:BatchGetImage",
          "ecr:GetLifecyclePolicy",
          "ecr:GetLifecyclePolicyPreview",
          "ecr:ListTagsForResource",
          "ecr:DescribeImageScanFindings"
        ]
        Effect = "Allow"
        Resource = [
          "arn:aws:ecr:${var.region}:763104351884:repository/*",
        ]
      },
      { # feature store
        Action = [
          "glue:GetTable",
          "glue:UpdateTable"
        ]
        Effect = "Allow"
        Resource = [
          "arn:aws:glue:*:*:catalog",
          "arn:aws:glue:*:*:database/sagemaker_featurestore",
          "arn:aws:glue:*:*:table/sagemaker_featurestore/*"
        ]
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "sagemaker_execution_role_policy_attachment" {
  policy_arn = aws_iam_policy.sagemaker_execution_role_policy.arn
  role       = aws_iam_role.sagemaker_execution_role.name
}
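The TODO in the S3 statement above asks each user to replace the bucket names by hand. One possible way to avoid editing the policy is to take the names as input and derive both the bucket and object ARNs from them. This is only a sketch: `sagemaker_bucket_names` and `sagemaker_bucket_arns` are hypothetical names that do not exist in the PR.

```hcl
# Hypothetical variable: the buckets the execution role may read and write.
variable "sagemaker_bucket_names" {
  description = "S3 buckets holding model artifacts and related data"
  type        = list(string)

  validation {
    # An IAM statement with an empty Resource list is rejected by AWS.
    condition     = length(var.sagemaker_bucket_names) > 0
    error_message = "Provide at least one bucket name."
  }
}

locals {
  # For each bucket, emit the bucket ARN (needed by s3:ListBucket)
  # and the object ARN pattern (needed by Get/Put/DeleteObject).
  sagemaker_bucket_arns = flatten([
    for name in var.sagemaker_bucket_names : [
      "arn:aws:s3:::${name}",
      "arn:aws:s3:::${name}/*",
    ]
  ])
}

# In the policy statement, the hard-coded list would then become:
#   Resource = local.sagemaker_bucket_arns
```

This also keeps the account ID embedded in the default SageMaker bucket name out of version control.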
42 changes: 42 additions & 0 deletions reference_implementations/aws/offline/03_endpoint.tf
@@ -0,0 +1,42 @@
resource "aws_sagemaker_model" "paraphrase_model" {
  name               = "${local.prefix}-${var.sagemaker_model_name}"
  execution_role_arn = aws_iam_role.sagemaker_execution_role.arn
  container {
    image          = var.sagemaker_container_repo_url
    model_data_url = var.sagemaker_model_data_s3_url
    mode           = var.sagemaker_model_mode
    environment = {
      "SAGEMAKER_CONTAINER_LOG_LEVEL" = "20"
      "SAGEMAKER_PROGRAM"             = "inference.py"
      "SAGEMAKER_REGION"              = var.region
      "SAGEMAKER_SUBMIT_DIRECTORY"    = "/opt/ml/model"
    }
  }
}

resource "aws_sagemaker_endpoint_configuration" "ec" {
  name = "${local.prefix}-${var.sagemaker_endpoint_conf_name}"

  production_variants {
    variant_name           = var.sagemaker_endpoint_conf_variant_name
    model_name             = aws_sagemaker_model.paraphrase_model.name
    initial_instance_count = var.sagemaker_model_instance_count
    instance_type          = var.sagemaker_model_instance_type
  }

  tags = merge(
    local.common_tags,
    { "Name" = "${local.prefix}-${var.sagemaker_endpoint_conf_name}" }
  )
}

resource "aws_sagemaker_endpoint" "paraphrase_endpoint" {
  name                 = "${local.prefix}-${var.sagemaker_endpoint_name}"
  endpoint_config_name = aws_sagemaker_endpoint_configuration.ec.name

  tags = merge(
    local.common_tags,
    { "Name" = "${local.prefix}-${var.sagemaker_endpoint_name}" }
  )
}
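Because the endpoint name is assembled from a prefix and a variable, it can be handy to expose the final value after `terraform apply`, for example to smoke-test the endpoint from the CLI. A small sketch follows; the output names are assumptions and are not part of the PR.

```hcl
# Hypothetical outputs; the resource address matches the endpoint defined above.
output "sagemaker_endpoint_name" {
  description = "Name of the deployed SageMaker endpoint"
  value       = aws_sagemaker_endpoint.paraphrase_endpoint.name
}

output "sagemaker_endpoint_arn" {
  description = "ARN of the deployed SageMaker endpoint"
  value       = aws_sagemaker_endpoint.paraphrase_endpoint.arn
}
```

After an apply, `terraform output sagemaker_endpoint_name` prints the value that the Lambda function receives as `ENDPOINT_NAME`.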
145 changes: 145 additions & 0 deletions reference_implementations/aws/offline/04_lambda.tf
@@ -0,0 +1,145 @@
resource "aws_iam_role" "lambda_role" {
  name = "${local.prefix}-BertParaphraseModelLambdaRoleTF"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "lambda.amazonaws.com"
        }
      }
    ]
  })
}

# aws_iam_role_policy: For creating inline, role-specific policies.
resource "aws_iam_role_policy" "lambda_logs_policy" {
  name = "${local.prefix}-lambda_role_logs_policy"
  role = aws_iam_role.lambda_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "logs:CreateLogGroup",
          "logs:CreateLogStream",
          "logs:DescribeLogGroups",
          "logs:DescribeLogStreams",
          "logs:PutLogEvents",
          "logs:GetLogEvents",
          "logs:FilterLogEvents"
        ]
        Resource = "*"
      }
    ]
  })
}

resource "aws_iam_role_policy" "lambda_sagemaker_policy" {
  name = "${local.prefix}-lambda_role_sagemaker_policy"
  role = aws_iam_role.lambda_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = "sagemaker:InvokeEndpoint"
        Resource = "arn:aws:sagemaker:${var.region}:${local.aws_account_id}:endpoint/*"
      }
    ]
  })
}

resource "aws_iam_role_policy" "lambda_sagemaker_featurestore_policy" {
  name = "${local.prefix}-lambda_role_sagemaker_featurestore_policy"
  role = aws_iam_role.lambda_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "sagemaker:GetRecord",
          "sagemaker:PutRecord",
          "sagemaker:ListFeatureGroups",
          "sagemaker:BatchGetRecord"
        ]
        Resource = "arn:aws:sagemaker:${var.region}:${local.aws_account_id}:feature-group/*"
      }
    ]
  })
}

# Attach the AWS-managed AmazonRedshiftFullAccess policy to the Lambda role.
# aws_iam_role_policy_attachment attaches an existing managed policy to a single role
# and leaves other attachments alone. aws_iam_policy_attachment is exclusive: it would
# detach the policy from every role, user, or group not listed in the resource.
resource "aws_iam_role_policy_attachment" "redshift_data_access" {
  role       = aws_iam_role.lambda_role.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonRedshiftFullAccess"
}

resource "aws_iam_role_policy" "lambda_sqs_policy" {
  role = aws_iam_role.lambda_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "sqs:SendMessage",
          "sqs:ReceiveMessage",
          "sqs:DeleteMessage",
          "sqs:GetQueueAttributes"
        ]
        Resource = aws_sqs_queue.inference_queue.arn
      }
    ]
  })
}

resource "aws_iam_role_policy" "cloudwatch_put_metric_data_policy" {
  role = aws_iam_role.lambda_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = "cloudwatch:PutMetricData"
        Resource = "*"
      }
    ]
  })
}

resource "aws_lambda_function" "inference_lambda_function" {
  filename      = "${path.module}/lambda.zip"
  function_name = "${local.prefix}-bert-paraphrase-tf"
  role          = aws_iam_role.lambda_role.arn
  handler       = "lambda_function.lambda_handler"
  # NOTE: python3.8 is a deprecated Lambda runtime; consider a newer version
  # once the package and the Powertools layer are verified against it.
  runtime          = "python3.8"
  source_code_hash = filebase64sha256("${path.module}/lambda.zip")
  timeout          = 300

  # A layer must live in the same region as the function, so use var.region here.
  layers = [
    "arn:aws:lambda:${var.region}:017000801446:layer:AWSLambdaPowertoolsPythonV2:38"
  ]
  environment {
    variables = {
      ENDPOINT_NAME      = aws_sagemaker_endpoint.paraphrase_endpoint.name
      FEATURE_GROUP_NAME = aws_sagemaker_feature_group.paraphrase_fg.feature_group_name
      # dns_name is the bare hostname; the `endpoint` attribute already carries the port.
      REDSHIFT_URL  = "${aws_redshift_cluster.redshift_feature_store.dns_name}:${aws_redshift_cluster.redshift_feature_store.port}"
      REDSHIFT_USER = aws_redshift_cluster.redshift_feature_store.master_username
      CLUSTER_ID    = aws_redshift_cluster.redshift_feature_store.id
      DB_NAME       = var.db_name
    }
  }
}
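The function above expects a pre-built `lambda.zip` next to the Terraform files, and `*.zip` is ignored by git, so the archive has to be produced before every apply. One option is to let Terraform build it with the `archive_file` data source from the `hashicorp/archive` provider. The sketch below assumes the handler lives in a single `lambda_function.py` file in the module directory; that location is a guess based on the `handler` value and is not confirmed by the PR.

```hcl
# Sketch only: builds lambda.zip from an assumed single-file handler.
data "archive_file" "lambda_zip" {
  type        = "zip"
  source_file = "${path.module}/lambda_function.py" # assumed handler location
  output_path = "${path.module}/lambda.zip"
}

# The function would then reference the generated archive instead of a
# hand-built file:
#   filename         = data.archive_file.lambda_zip.output_path
#   source_code_hash = data.archive_file.lambda_zip.output_base64sha256
```

With this wiring, a change to the handler source changes the hash, and the next `terraform apply` redeploys the function without a manual zip step. A package with third-party dependencies would need `source_dir` or an external build step instead.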