Incorrect Placement of writer.close() in PyTorch Experiment Tracking #1124

Open
Maysixi opened this issue Oct 22, 2024 · 0 comments

Description:

In the experiment tracking code in 07. PyTorch Experiment Tracking, writer.close() is misplaced. It sits inside the epoch loop, so the SummaryWriter is closed after each epoch. This causes logging to stop prematurely and leaves the logs incomplete when training spans multiple epochs. writer.close() should be called only once, after all epochs have finished.

Code Reference:

### New: Use the writer parameter to track experiments ###
# See if there's a writer, if so, log to it
if writer:
    # Add results to SummaryWriter
    writer.add_scalars(main_tag="Loss", 
                       tag_scalar_dict={"train_loss": train_loss,
                                        "test_loss": test_loss},
                       global_step=epoch)
    writer.add_scalars(main_tag="Accuracy", 
                       tag_scalar_dict={"train_acc": train_acc,
                                        "test_acc": test_acc}, 
                       global_step=epoch)

    # Close the writer
    writer.close()  # This line causes the issue
else:
    pass
### End new ###

Proposed Solution:

Move the writer.close() statement outside the training loop, so that it is only called once after all epochs have been completed.

Expected Behavior:

The SummaryWriter should continue logging across all epochs.
The writer should be closed only after the full training process is complete.

Steps to Reproduce:

Implement the current code where writer.close() is inside the loop.
Run a training process for multiple epochs.
Notice that logging stops after the first epoch due to the writer being closed too early.
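
The placement problem in the steps above can also be checked without a real training run. The sketch below is only an illustration: RecordingWriter and buggy_loop are hypothetical names, not part of PyTorch or the course code, and the metric values are placeholders. RecordingWriter mimics just the add_scalars() and close() calls and counts how often the loop closes the writer and how often it writes to an already-closed writer. It says nothing about how the real SummaryWriter reacts to such writes; it only shows that the loop as written issues them.

```python
class RecordingWriter:
    """Hypothetical stand-in for SummaryWriter: records calls, writes no files."""

    def __init__(self):
        self.closed = False
        self.close_calls = 0
        self.writes_after_close = 0

    def add_scalars(self, main_tag, tag_scalar_dict, global_step=None):
        # Count every write that arrives after close() has been called
        if self.closed:
            self.writes_after_close += 1

    def close(self):
        self.closed = True
        self.close_calls += 1


def buggy_loop(writer, epochs):
    """Mirrors the placement in the code reference: close() inside the epoch loop."""
    for epoch in range(epochs):
        # Placeholder metrics standing in for real train/test steps
        train_loss, test_loss = 1.0 / (epoch + 1), 1.2 / (epoch + 1)
        if writer:
            writer.add_scalars(main_tag="Loss",
                               tag_scalar_dict={"train_loss": train_loss,
                                                "test_loss": test_loss},
                               global_step=epoch)
            writer.close()  # Called once per epoch instead of once per run


writer = RecordingWriter()
buggy_loop(writer, epochs=3)
print(writer.close_calls, writer.writes_after_close)  # prints: 3 2
```

With three epochs the writer is closed three times, and the writes for the second and third epochs both arrive after the first close().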

Suggested Fix:

# After training loop, close the writer
if writer:
    writer.close()
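
For context, here is a minimal sketch of how the whole loop might look with the fix applied. The train function here is a stripped-down, hypothetical stand-in for the course's training function, the metric values are placeholders, and a MagicMock from the standard library replaces a real SummaryWriter so the snippet runs without TensorBoard installed. The try/finally is an addition of this sketch rather than part of the fix above; it is one way to make sure the writer is still closed if training raises partway through.

```python
from unittest.mock import MagicMock


def train(epochs, writer=None):
    """Hypothetical, stripped-down training loop; metrics are placeholders."""
    try:
        for epoch in range(epochs):
            # Placeholder values standing in for real train/test steps
            train_loss, test_loss = 1.0 / (epoch + 1), 1.2 / (epoch + 1)
            train_acc, test_acc = 0.5, 0.4

            # Log every epoch, but do not close the writer here
            if writer:
                writer.add_scalars(main_tag="Loss",
                                   tag_scalar_dict={"train_loss": train_loss,
                                                    "test_loss": test_loss},
                                   global_step=epoch)
                writer.add_scalars(main_tag="Accuracy",
                                   tag_scalar_dict={"train_acc": train_acc,
                                                    "test_acc": test_acc},
                                   global_step=epoch)
    finally:
        # Runs once: after the last epoch, or when training raises
        if writer:
            writer.close()


writer = MagicMock()  # Stand-in for a SummaryWriter instance
train(epochs=3, writer=writer)
print(writer.add_scalars.call_count, writer.close.call_count)  # prints: 6 1
```

Across three epochs the writer receives six add_scalars() calls (two per epoch) and a single close() at the end.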

Let me know if you need any further clarifications!
