fix rare duplicated data buffer entry #62

Merged
merged 2 commits into from
Jun 17, 2024

@youliangtan youliangtan commented Jun 17, 2024

Issue

Under a poor network connection, agentlace's TrainerClient might not correctly track the latest inserted batch of data, which resulted in duplicated data entries being sent from the actor node to the learner node. This degrades the performance of online learning, since the sequential ordering of the data is no longer preserved.
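
To illustrate the failure mode only (the actual fix lives in agentlace, see the PR linked below), here is a rough sketch of how a lost reply can lead to re-sent data, and how tracking the last stored id guards against it. All class and function names here are hypothetical and are not part of agentlace or this repo.

```python
class ActorQueue:
    """Hypothetical actor-side queue; each item gets an increasing id."""

    def __init__(self):
        self._items = []

    def insert(self, data):
        self._items.append((len(self._items), data))

    def get_after(self, last_id):
        # Return every (id, data) pair newer than last_id.
        return [(i, d) for i, d in self._items if i > last_id]


class LearnerBuffer:
    """Hypothetical learner-side buffer that can skip ids it already stored."""

    def __init__(self, dedupe):
        self.data = []
        self.last_seen_id = -1
        self._dedupe = dedupe

    def receive(self, batch):
        for item_id, item in batch:
            if self._dedupe and item_id <= self.last_seen_id:
                continue  # already stored; skip the re-sent item
            self.data.append(item)
            self.last_seen_id = max(self.last_seen_id, item_id)
        return self.last_seen_id


def run(dedupe):
    actor, learner = ActorQueue(), LearnerBuffer(dedupe)
    last_acked = -1
    for t in range(3):
        actor.insert(f"transition-{t}")
    # The first send reaches the learner, but the reply is lost,
    # so the actor never advances last_acked.
    learner.receive(actor.get_after(last_acked))
    actor.insert("transition-3")
    # Retry: the actor re-sends everything after its stale last_acked.
    last_acked = learner.receive(actor.get_after(last_acked))
    return learner.data
```

Calling `run(dedupe=False)` stores the first three transitions twice, while `run(dedupe=True)` keeps each transition once and in order.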

Fix

[Highly Recommended]

The straightforward fix is to bump the version of agentlace (or pull the latest main). For more details, check out the PR: youliangtan/agentlace#11

Additional

  1. The current update() method uses a plain service request/response call; it is blocking and might not be fast enough. An experimental mode, using experimental_pipeline_url and experimental_pipeline_port, is introduced to speed it up.

Related to youliangtan/agentlace#13
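
For readers unfamiliar with the trade-off, the difference between a blocking request/response call and a pipelined send can be sketched with the standard library alone. This is a generic illustration, not how agentlace implements its experimental pipeline; `blocking_update`, `PipelinedSender`, and `send_fn` are hypothetical names.

```python
import queue
import threading


def blocking_update(send_fn, batches):
    """Request/response style: wait for each reply before sending the next batch."""
    return [send_fn(batch) for batch in batches]


class PipelinedSender:
    """Hypothetical background sender: submit() returns without waiting for a reply."""

    def __init__(self, send_fn):
        self._send_fn = send_fn
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, batch):
        # Hand the batch to the worker thread and return immediately.
        self._queue.put(batch)

    def _run(self):
        while True:
            batch = self._queue.get()
            if batch is None:  # sentinel: stop the worker
                break
            self._send_fn(batch)

    def close(self):
        # Flush everything that was queued, then stop the worker.
        self._queue.put(None)
        self._worker.join()
```

With a single worker and a FIFO queue, batches are still delivered in submission order, which matters here because the learner relies on the sequential ordering of the data.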

  1. Reduce default size of QueuedDataStore to reduce unnecessary memory consumption on the actor node
  2. Miscellaneous code cleanup
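
As a rough illustration of why a smaller queue capacity bounds memory on the actor node, here is a minimal sketch of a fixed-capacity store built on `collections.deque`. `BoundedStore` is a hypothetical stand-in and does not reflect the actual QueuedDataStore implementation or its default size.

```python
from collections import deque


class BoundedStore:
    """Hypothetical fixed-capacity store; the oldest items are dropped when full."""

    def __init__(self, capacity):
        # deque(maxlen=...) discards from the left once capacity is reached.
        self._queue = deque(maxlen=capacity)

    def insert(self, item):
        self._queue.append(item)

    def items(self):
        return list(self._queue)

    def __len__(self):
        return len(self._queue)


store = BoundedStore(capacity=3)
for step in range(5):
    store.insert({"step": step})
```

After five inserts the store holds only the three most recent items, so memory use is capped by the capacity rather than by how long the actor has been running.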

@youliangtan youliangtan marked this pull request as ready for review June 17, 2024 18:15
@youliangtan
Member Author

@charlesxu0124 @jianlanluo FYI

@youliangtan youliangtan merged commit bacdfc6 into main Jun 17, 2024
1 check passed
@youliangtan youliangtan deleted the fix-potential-duplicate-data-buffer branch June 17, 2024 18:16
@youliangtan youliangtan mentioned this pull request Jun 17, 2024