(NOT READY FOR REVIEW)(torchx/components) Add aws.spmd component, a specialization of the d… #750

Closed
wants to merge 1 commit into from

kiukchung
Collaborator

For @seemethere, since we discussed the custom component required to run on AWS Batch.
The next step is to figure out how to fold this into the default spmd (torchx.components.dist.spmd) so that we don't maintain two copies of spmd.

Publishing this as a PR so that @seemethere can see what it takes to run on AWS Batch on EFA-enabled instances (e.g. p4d).

Test plan:
Unit tests included.

…efault spmd component for dist jobs on AWS Batch
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 3, 2023
@kiukchung kiukchung closed this Sep 1, 2023
3 participants