Do not create SageMaker and StepFunctions endpoints when not needed #511
base: main
lib/shared/index.ts
Outdated
  service: ec2.InterfaceVpcEndpointAwsService.SAGEMAKER_RUNTIME,
  open: true,
});
if (props.config.llms.sagemaker.length > 0){
Does this account for the fact that the re-ranker endpoint is always running when RAG is enabled, even if no specific SageMaker models are chosen?
Yes, that should be the reason
  service: ec2.InterfaceVpcEndpointAwsService.SAGEMAKER_RUNTIME,
  open: true,
});
if (props.config.llms.sagemaker.length > 0) {
Suggested change:
- if (props.config.llms.sagemaker.length > 0) {
+ if (props.config.llms.sagemaker.length > 0 || props.config.rag.enabled) {
At this time, if RAG is enabled, a SageMaker endpoint for embedding/cross-encoding is created automatically regardless of the config (unless I am mistaken?).
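To make the suggested condition easier to reason about, here is a minimal sketch of it as a standalone predicate. The `SystemConfig` interface and the `needsSageMakerRuntimeEndpoint` helper are hypothetical names used for illustration only; the sketch assumes the config exposes `llms.sagemaker` as an array and `rag.enabled` as a boolean, as the snippets in this thread suggest.

```typescript
// Hypothetical, trimmed-down config shape (assumption: the real config type has more fields).
interface SystemConfig {
  llms: { sagemaker: string[] };
  rag: { enabled: boolean };
}

// Returns true when the SageMaker Runtime VPC endpoint would be needed:
// either at least one SageMaker LLM is configured, or RAG is enabled
// (which, per the comment above, deploys its own SageMaker endpoint).
function needsSageMakerRuntimeEndpoint(config: SystemConfig): boolean {
  return config.llms.sagemaker.length > 0 || config.rag.enabled;
}

// RAG enabled but no SageMaker LLMs selected: the endpoint is still required.
const ragOnly: SystemConfig = { llms: { sagemaker: [] }, rag: { enabled: true } };
console.log(needsSageMakerRuntimeEndpoint(ragOnly)); // true
```

With only the original `length > 0` check, the `ragOnly` case would skip the VPC endpoint even though a SageMaker endpoint is deployed for RAG.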
Issue #: N/A
Description of changes:
Don't create the SageMaker VPC endpoint if SageMaker is not enabled in config.
Don't create the Step Functions VPC endpoint if RAG is not enabled in config.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.