
Modify _log_input_summary function #7811

Open

hyacinth97223 wants to merge 20 commits into dev from fix-issue-7513
Conversation

@hyacinth97223 commented May 27, 2024

Fixes #7513

Description

The _log_input_summary() function has been modified: instead of emitting the input summary one line at a time with logger.info(), each line is now accumulated in a buffer via log_buffer.write(), and the whole summary is emitted with a single logger.info() call.

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Integration tests passed locally by running ./runtests.sh -f -u --net --coverage.
  • Quick tests passed locally by running ./runtests.sh --quick --unittests --disttests.
  • In-line docstrings updated.
  • Documentation updated, tested make html command in the docs/ folder.

log_buffer.write(f"> {name}: {pprint_edges(val, PPRINT_CONFIG_N)}\n")
log_buffer.write("---\n\n")

logger.info(log_buffer.getvalue())
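
For context, a minimal sketch of the buffered version of the function (the surrounding body is reconstructed here, not the PR's exact diff, and the import locations for the pprint_edges helper and PPRINT_CONFIG_N constant are assumptions):

import io
import logging

# Assumed import locations for the helper and constant used in the excerpt.
from monai.utils import pprint_edges
from monai.bundle.scripts import PPRINT_CONFIG_N

logger = logging.getLogger(__name__)

def _log_input_summary(tag: str, args: dict) -> None:
    # Accumulate the whole summary in memory first ...
    log_buffer = io.StringIO()
    log_buffer.write(f"--- input summary of monai.bundle.scripts.{tag} ---\n")
    for name, val in args.items():
        log_buffer.write(f"> {name}: {pprint_edges(val, PPRINT_CONFIG_N)}\n")
    log_buffer.write("---\n\n")
    # ... then emit it as a single record, so lines from concurrent
    # processes are less likely to interleave in the output.
    logger.info(log_buffer.getvalue())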
Contributor

This change may still not solve the issue mentioned in #7513?

Member

What seems likely to happen here is that the message is written all at once with a single call to logger.info(), which gives it a greater chance of appearing uninterrupted in the output when multiple GPUs are used. That is only by chance, though, and isn't guaranteed. What we need in the multi-GPU case is to identify which process is rank 0 and have only it write to the logger by default, with an option to allow every rank to write if desired. With the torch.distributed package this would be:

import torch.distributed as dist
...
rank = dist.get_rank() if dist.is_initialized() else 0
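
Applied to the buffered summary above, the gate might look like this (log_all_ranks is the opt-in flag suggested here, not an existing parameter):

import torch.distributed as dist

# Rank 0 in a distributed run; also the fallback in a single-process run.
rank = dist.get_rank() if dist.is_initialized() else 0

# By default only rank 0 writes; log_all_ranks lets every rank write.
if log_all_ranks or rank == 0:
    logger.info(log_buffer.getvalue())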

Author

@ericspod Thanks for your advice. I will try it.

hyacinth97223 and others added 17 commits June 6, 2024 16:22

… into fix-issue-7513

Modify the _log_input_summary function and add the following functionality:
  • Log the input summary of a MONAI bundle script to the console and a local log file.
  • Tag each rank's logs with its rank number and save them to individual log files.
  • Read the base log path from the logging.conf file and create a separate log file for each rank.
  • Add log_all_ranks as a parameter to determine whether to log all ranks or only rank 0.

Signed-off-by: Ted.Lai <[email protected]>
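
A minimal sketch of the behavior this commit message describes (log_all_ranks and the per-rank file naming follow the message, while the path handling and other details are illustrative assumptions rather than the PR's exact code):

import io
import logging

import torch.distributed as dist

logger = logging.getLogger(__name__)

def _log_input_summary(tag, args, log_all_ranks=False, base_log_path="bundle"):
    # Rank of this process; 0 when not running under torch.distributed.
    rank = dist.get_rank() if dist.is_initialized() else 0

    # Build the summary in a buffer, tagged with the rank number.
    log_buffer = io.StringIO()
    log_buffer.write(f"--- input summary of monai.bundle.scripts.{tag} (rank {rank}) ---\n")
    for name, val in args.items():
        log_buffer.write(f"> {name}: {val!r}\n")
    log_buffer.write("---\n\n")
    summary = log_buffer.getvalue()

    # Each rank appends to its own log file, e.g. bundle_rank0.log.
    # (The PR reads the base path from logging.conf; it is a plain
    # argument here to keep the sketch self-contained.)
    with open(f"{base_log_path}_rank{rank}.log", "a") as f:
        f.write(summary)

    # Console output: rank 0 only, unless log_all_ranks is set.
    if log_all_ranks or rank == 0:
        logger.info(summary)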
Successfully merging this pull request may close these issues: Enhance bundle logging logic (#7513).