refactor: examples error handling #64

Merged
merged 1 commit into from
Aug 6, 2024

@raresgaia123 raresgaia123 commented Aug 2, 2024

added error handling for examples
updated ccl operations with rank 0 as the actor
updated logs for a better understanding of the operations
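
The changed example files are not shown in this conversation. Purely as an illustration of the pattern the description names (rank 0 acting as the source of a collective, with failures caught and logged), a minimal pure-Python sketch might look like the following. `broadcast` and `run_broadcast` are hypothetical names, and the dictionary is a stand-in for real ranks; none of this is the project's API or the code that was merged.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("example")


def broadcast(buffers, src=0):
    """Toy stand-in for a collective: copy the src rank's value to every rank.

    `buffers` maps rank -> value. This is NOT a real CCL call (assumption).
    """
    if src not in buffers:
        raise RuntimeError(f"source rank {src} is not in the world")
    value = buffers[src]
    return {rank: value for rank in buffers}


def run_broadcast(buffers, src=0):
    """Run the toy broadcast, logging each step and handling failure."""
    log.info("rank %d broadcasting to ranks %s", src, sorted(buffers))
    try:
        result = broadcast(buffers, src=src)
    except RuntimeError as exc:
        # Log the failure and return None instead of crashing the example.
        log.error("broadcast failed: %s", exc)
        return None
    log.info("broadcast done: %s", result)
    return result
```

Calling `run_broadcast({0: "a", 1: None, 2: None})` copies rank 0's value to every rank, while a world that lacks rank 0 logs an error and returns `None` rather than raising.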

Description

Please provide a meaningful description of what this change will do, or is for. Bonus points for including links to
related issues, other PRs, or technical references.

Note that by not including a description, you are asking reviewers to do extra work to understand the context of this
change, which may lead to your PR taking much longer to review, or result in it not being reviewed at all.

Type of Change

  • Bug Fix
  • New Feature
  • Breaking Change
  • Refactor
  • Documentation
  • Other (please describe)

Checklist

  • I have read the contributing guidelines
  • Existing issues have been referenced (where applicable)
  • I have verified this change is not present in other open pull requests
  • Functionality is documented
  • All code style checks pass
  • New code contribution is covered by automated tests
  • All new and existing tests pass

@myungjin myungjin left a comment

Reviewed all_gather, all_reduce, and broadcast and left comments. Those comments also apply to the remaining operations: gather, reduce, and scatter. Please consider addressing them there as well.

examples/all_gather/m8d.py (outdated, resolved)
examples/all_gather/m8d.py (outdated, resolved)
examples/all_reduce/m8d.py (outdated, resolved)
examples/all_reduce/m8d.py (outdated, resolved)
examples/all_gather/m8d.py (outdated, resolved)
examples/all_gather/m8d.py (outdated, resolved)
examples/all_gather/m8d.py (outdated, resolved)
examples/all_reduce/m8d.py (outdated, resolved)
examples/all_reduce/m8d.py (outdated, resolved)
examples/all_reduce/m8d.py (resolved)

myungjin commented Aug 2, 2024

This PR is not a feature; it is mostly refactoring. Please update the commit title in git accordingly.

@myungjin myungjin changed the title from "feat: examples error handling" to "refactor: examples error handling" on Aug 2, 2024
@raresgaia123 raresgaia123 force-pushed the examples_error_handling branch 3 times, most recently from ebf52e0 to 828f13b on August 6, 2024 11:21

@myungjin myungjin left a comment

a few comments left.

examples/broadcast/m8d.py (outdated, resolved)
examples/scatter/m8d.py (outdated, resolved)
examples/scatter/m8d.py (outdated, resolved)
added error handling for examples
updated ccl operations with rank 0 as the actor
updated logs for a better understanding of the operations
@raresgaia123 raresgaia123 force-pushed the examples_error_handling branch from 828f13b to f825387 on August 6, 2024 16:07
@myungjin myungjin merged commit 78b73a8 into cisco-open:main Aug 6, 2024
1 check passed