-
Notifications
You must be signed in to change notification settings - Fork 21
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #85 from ifeojo/stepfunction
feat: Add StepfunctionsStack for agent evaluation workflow
- Loading branch information
Showing
34 changed files
with
1,291 additions
and
1 deletion.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -170,3 +170,4 @@ cython_debug/ | |
|
||
local-test | ||
local-test/* | ||
.DS_Store |
Binary file not shown.
Binary file not shown.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Binary file not shown.
Binary file not shown.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
*.swp | ||
package-lock.json | ||
__pycache__ | ||
.pytest_cache | ||
.venv | ||
*.egg-info | ||
*.dist-info | ||
|
||
# CDK asset staging directory | ||
.cdk.staging | ||
cdk.out | ||
.DS_Store |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,120 @@ | ||
# Bedrock Agent Evaluation Framework | ||
|
||
This project implements an automated evaluation framework for Amazon Bedrock Agents using AWS CDK, Step Functions, and Lambda. | ||
|
||
## Overview | ||
|
||
The framework automates the process of updating Bedrock Agents with new prompts, creating aliases, running evaluation scenarios, and cleaning up resources. It uses AWS Step Functions to orchestrate the workflow and AWS Lambda functions to perform individual tasks. | ||
|
||
The example provided is for an energy chatbot usecase | ||
|
||
## Components | ||
|
||
1. **CDK Stack (StepfunctionsStack)**: Defines the infrastructure, including Lambda functions, Step Functions state machine, and associated IAM roles. | ||
|
||
2. **Lambda Functions**: | ||
- `generate_map`: Generates evaluation scenarios from S3 input. | ||
- `check_agent_status_1` and `check_agent_status_2`: Check the status of Bedrock Agents. | ||
- `update_bedrock_agent`: Updates the Bedrock Agent with new instructions. | ||
- `create_alias`: Creates an alias for the updated agent. | ||
- `run_test`: Executes evaluation scenarios using the `agenteval` library. | ||
- `delete_alias`: Removes the temporary alias after evaluation. | ||
|
||
3. **Step Functions State Machine**: Orchestrates the evaluation workflow, including agent updates, status checks, and scenario execution. | ||
|
||
4. **S3 Bucket**: Stores evaluation prompts and results. | ||
|
||
5. **EventBridge Rule**: Triggers the Step Functions workflow when new evaluation prompts are uploaded to S3. | ||
|
||
## Workflow | ||
|
||
1. New evaluation prompts are uploaded to the S3 bucket. | ||
2. The EventBridge rule triggers the Step Functions state machine. | ||
3. The state machine updates the Bedrock Agent with new instructions. | ||
4. An alias is created for the updated agent. | ||
5. Evaluation scenarios are executed using the `agenteval` library. | ||
6. Results are stored in the S3 bucket. | ||
7. The temporary alias is deleted. | ||
|
||
## Setup and Deployment | ||
|
||
1. Ensure you have the AWS CDK installed and configured. | ||
2. Install project dependencies: | ||
``` | ||
npm install | ||
``` | ||
3. Deploy the stack: | ||
``` | ||
cdk deploy | ||
``` | ||
|
||
## Usage | ||
|
||
To run an evaluation: | ||
|
||
1. Prepare an evaluation JSON file with prompts and customer profiles. | ||
2. Upload the file to the S3 bucket in the `evaluation_prompts/` prefix. | ||
3. The evaluation process will start automatically. | ||
4. Results will be available in the S3 bucket under the `results/` prefix. | ||
|
||
## Notes | ||
|
||
- Ensure proper IAM permissions are set up for accessing Bedrock, S3, and other AWS services. | ||
- The `agenteval` library is assumed to be provided as a custom Lambda layer. | ||
|
||
|
||
# CDK instructions | ||
|
||
The `cdk.json` file tells the CDK Toolkit how to execute your app. | ||
|
||
This project is set up like a standard Python project. The initialization | ||
process also creates a virtualenv within this project, stored under the `.venv` | ||
directory. To create the virtualenv it assumes that there is a `python3` | ||
(or `python` for Windows) executable in your path with access to the `venv` | ||
package. If for any reason the automatic creation of the virtualenv fails, | ||
you can create the virtualenv manually. | ||
|
||
To manually create a virtualenv on MacOS and Linux: | ||
|
||
``` | ||
$ python3 -m venv .venv | ||
``` | ||
|
||
After the init process completes and the virtualenv is created, you can use the following | ||
step to activate your virtualenv. | ||
|
||
``` | ||
$ source .venv/bin/activate | ||
``` | ||
|
||
If you are a Windows platform, you would activate the virtualenv like this: | ||
|
||
``` | ||
% .venv\Scripts\activate.bat | ||
``` | ||
|
||
Once the virtualenv is activated, you can install the required dependencies. | ||
|
||
``` | ||
$ pip install -r requirements.txt | ||
``` | ||
|
||
At this point you can now synthesize the CloudFormation template for this code. | ||
|
||
``` | ||
$ cdk synth | ||
``` | ||
|
||
To add additional dependencies, for example other CDK libraries, just add | ||
them to your `setup.py` file and rerun the `pip install -r requirements.txt` | ||
command. | ||
|
||
## Useful commands | ||
|
||
* `cdk ls` list all stacks in the app | ||
* `cdk synth` emits the synthesized CloudFormation template | ||
* `cdk deploy` deploy this stack to your default AWS account/region | ||
* `cdk diff` compare deployed stack with current state | ||
* `cdk docs` open CDK documentation | ||
|
||
Enjoy! |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
#!/usr/bin/env python3 | ||
import os | ||
|
||
import aws_cdk as cdk | ||
|
||
from stepfunctions.stepfunctions_stack import StepfunctionsStack | ||
|
||
|
||
app = cdk.App() | ||
StepfunctionsStack(app, "StepfunctionsStack", | ||
# If you don't specify 'env', this stack will be environment-agnostic. | ||
# Account/Region-dependent features and context lookups will not work, | ||
# but a single synthesized template can be deployed anywhere. | ||
|
||
# Uncomment the next line to specialize this stack for the AWS Account | ||
# and Region that are implied by the current CLI configuration. | ||
|
||
#env=cdk.Environment(account=os.getenv('CDK_DEFAULT_ACCOUNT'), region=os.getenv('CDK_DEFAULT_REGION')), | ||
|
||
# Uncomment the next line if you know exactly what Account and Region you | ||
# want to deploy the stack to. */ | ||
|
||
# env=cdk.Environment(region='us-east-1'), | ||
|
||
# For more information, see https://docs.aws.amazon.com/cdk/latest/guide/environments.html | ||
) | ||
|
||
app.synth() |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
{ | ||
"app": "python3 app.py", | ||
"watch": { | ||
"include": [ | ||
"**" | ||
], | ||
"exclude": [ | ||
"README.md", | ||
"cdk*.json", | ||
"requirements*.txt", | ||
"source.bat", | ||
"**/__init__.py", | ||
"**/__pycache__", | ||
"tests" | ||
] | ||
}, | ||
"context": { | ||
"@aws-cdk/aws-lambda:recognizeLayerVersion": true, | ||
"@aws-cdk/core:checkSecretUsage": true, | ||
"@aws-cdk/core:target-partitions": [ | ||
"aws", | ||
"aws-cn" | ||
], | ||
"@aws-cdk-containers/ecs-service-extensions:enableDefaultLogDriver": true, | ||
"@aws-cdk/aws-ec2:uniqueImdsv2TemplateName": true, | ||
"@aws-cdk/aws-ecs:arnFormatIncludesClusterName": true, | ||
"@aws-cdk/aws-iam:minimizePolicies": true, | ||
"@aws-cdk/core:validateSnapshotRemovalPolicy": true, | ||
"@aws-cdk/aws-codepipeline:crossAccountKeyAliasStackSafeResourceName": true, | ||
"@aws-cdk/aws-s3:createDefaultLoggingPolicy": true, | ||
"@aws-cdk/aws-sns-subscriptions:restrictSqsDescryption": true, | ||
"@aws-cdk/aws-apigateway:disableCloudWatchRole": true, | ||
"@aws-cdk/core:enablePartitionLiterals": true, | ||
"@aws-cdk/aws-events:eventsTargetQueueSameAccount": true, | ||
"@aws-cdk/aws-ecs:disableExplicitDeploymentControllerForCircuitBreaker": true, | ||
"@aws-cdk/aws-iam:importedRoleStackSafeDefaultPolicyName": true, | ||
"@aws-cdk/aws-s3:serverAccessLogsUseBucketPolicy": true, | ||
"@aws-cdk/aws-route53-patters:useCertificate": true, | ||
"@aws-cdk/customresources:installLatestAwsSdkDefault": false, | ||
"@aws-cdk/aws-rds:databaseProxyUniqueResourceName": true, | ||
"@aws-cdk/aws-codedeploy:removeAlarmsFromDeploymentGroup": true, | ||
"@aws-cdk/aws-apigateway:authorizerChangeDeploymentLogicalId": true, | ||
"@aws-cdk/aws-ec2:launchTemplateDefaultUserData": true, | ||
"@aws-cdk/aws-secretsmanager:useAttachedSecretResourcePolicyForSecretTargetAttachments": true, | ||
"@aws-cdk/aws-redshift:columnId": true, | ||
"@aws-cdk/aws-stepfunctions-tasks:enableEmrServicePolicyV2": true, | ||
"@aws-cdk/aws-ec2:restrictDefaultSecurityGroup": true, | ||
"@aws-cdk/aws-apigateway:requestValidatorUniqueId": true, | ||
"@aws-cdk/aws-kms:aliasNameRef": true, | ||
"@aws-cdk/aws-autoscaling:generateLaunchTemplateInsteadOfLaunchConfig": true, | ||
"@aws-cdk/core:includePrefixInUniqueNameGeneration": true, | ||
"@aws-cdk/aws-efs:denyAnonymousAccess": true, | ||
"@aws-cdk/aws-opensearchservice:enableOpensearchMultiAzWithStandby": true, | ||
"@aws-cdk/aws-lambda-nodejs:useLatestRuntimeVersion": true, | ||
"@aws-cdk/aws-efs:mountTargetOrderInsensitiveLogicalId": true, | ||
"@aws-cdk/aws-rds:auroraClusterChangeScopeOfInstanceParameterGroupWithEachParameters": true, | ||
"@aws-cdk/aws-appsync:useArnForSourceApiAssociationIdentifier": true, | ||
"@aws-cdk/aws-rds:preventRenderingDeprecatedCredentials": true, | ||
"@aws-cdk/aws-codepipeline-actions:useNewDefaultBranchForCodeCommitSource": true, | ||
"@aws-cdk/aws-cloudwatch-actions:changeLambdaPermissionLogicalIdForLambdaAction": true, | ||
"@aws-cdk/aws-codepipeline:crossAccountKeysDefaultValueToFalse": true, | ||
"@aws-cdk/aws-codepipeline:defaultPipelineTypeToV2": true, | ||
"@aws-cdk/aws-kms:reduceCrossAccountRegionPolicyScope": true, | ||
"@aws-cdk/aws-eks:nodegroupNameAttribute": true, | ||
"@aws-cdk/aws-ec2:ebsDefaultGp3Volume": true, | ||
"@aws-cdk/aws-ecs:removeDefaultDeploymentAlarm": true, | ||
"@aws-cdk/custom-resources:logApiResponseDataPropertyTrueDefault": false, | ||
"@aws-cdk/aws-s3:keepNotificationInImportedBucket": false | ||
} | ||
} |
Oops, something went wrong.