min_size_to_sample param in Queue(RateLimiter) #114

Open
wants to merge 1 commit into
base: master

Conversation

markub3327

Hello,

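Users may want to set the minimum number of items that a table must contain before batches can be sampled from it. This PR adds a min_size_to_sample parameter to Queue(RateLimiter) for that purpose.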

Thanks.

@sabelaraga
Collaborator

Hey Martin,

Thanks for reaching out! The problem with something like this is that some elements will never be sampled from the queue, so I'm a bit reluctant to include this as a default configuration of the queue. However, you can achieve the same behavior by tuning the Table and the RateLimiter directly. Have you tried that? Is it causing any issue?

Thanks!
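As a rough illustration of the "tune the RateLimiter directly" suggestion, here is a minimal sketch that only builds the keyword arguments for a queue-style rate limiter with a custom minimum size. The helper name and the example values are hypothetical, and the parameter names (samples_per_insert, min_size_to_sample, min_diff, max_diff) reflect my understanding of reverb.rate_limiters.RateLimiter; please check them against the installed Reverb version.

```python
def queue_like_rate_limiter_kwargs(max_size: int, min_size_to_sample: int = 1) -> dict:
    """Build kwargs for a queue-style rate limiter with a custom minimum size.

    Assumption: the keys mirror the constructor arguments of
    reverb.rate_limiters.RateLimiter. Verify against your Reverb version.
    """
    if not 1 <= min_size_to_sample <= max_size:
        raise ValueError("min_size_to_sample must be in [1, max_size]")
    return {
        "samples_per_insert": 1.0,                 # one sample per inserted item
        "min_size_to_sample": min_size_to_sample,  # no sampling below this table size
        "min_diff": 0.0,                           # sampling may not run ahead of inserts
        "max_diff": float(max_size),               # inserts may lead by at most max_size
    }


kwargs = queue_like_rate_limiter_kwargs(max_size=100, min_size_to_sample=10)

# Hypothetical usage (requires Reverb to be installed, not executed here):
# import reverb
# table = reverb.Table(
#     name="my_queue",
#     sampler=reverb.selectors.Fifo(),
#     remover=reverb.selectors.Fifo(),
#     max_size=100,
#     max_times_sampled=1,
#     rate_limiter=reverb.rate_limiters.RateLimiter(**kwargs),
# )
```

Building the table by hand like this keeps the default Queue unchanged, which matches the concern above that a larger minimum size can leave some elements unsampled.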

@markub3327
Author

Sorry, I haven't tried configuring the RateLimiter directly.

What happens if I use the drop_remainder=True option of dataset.batch() with a queue?
Is it reasonable to drop the remainder samples when creating a batch?

Thanks.

@sabelaraga
Collaborator

We are usually reading from a "never-ending" buffer, so in practice the main effect is that the result of dataset.batch() has a known length.
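To make the point about drop_remainder concrete, here is a toy model in plain Python. It is not tf.data; the batch helper is hypothetical and only mimics the dropping behavior. It shows that the option discards a short final batch on a finite source, while on an endless source every batch is full anyway.

```python
from itertools import count, islice


def batch(items, batch_size, drop_remainder=False):
    """Yield lists of batch_size items; optionally drop a short final batch."""
    it = iter(items)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return  # source exhausted
        if len(chunk) < batch_size and drop_remainder:
            return  # short final batch is discarded
        yield chunk


# Finite source: 10 items in batches of 4 leaves a remainder of 2.
finite_drop = list(batch(range(10), 4, drop_remainder=True))
finite_keep = list(batch(range(10), 4, drop_remainder=False))

# Endless source: there is never a remainder, so every batch has 4 items.
endless = list(islice(batch(count(), 4, drop_remainder=True), 3))
```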

@markub3327
Author

Ok. I think it discards samples from the queue if the number of samples stored in the queue is less than the batch size.

I'll check whether all elements are still sampled from the queue when min_size_to_sample > 1.

@markub3327
Author

markub3327 commented Dec 9, 2022

@sabelaraga

You are right. If the number of stored items is less than min_size_to_sample, the client blocks and waits for new samples. Likewise, if I set the dataset batch_size and the number of stored items is less than batch_size, the client also blocks until at least batch_size items are available, so batch_size effectively acts as the minimum count.

Reverb's documentation is insufficient on this point. Please add these conclusions to the official documentation for Queue (the same principle may apply to other table types).
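The blocking behavior described above can be sketched with a toy queue in plain Python. This is not Reverb code: the ToyQueueTable class and its timeout argument are hypothetical and only model a FIFO table whose items are removed once sampled and whose sampler waits until min_size_to_sample items are present.

```python
import threading
from collections import deque


class ToyQueueTable:
    """Toy FIFO table: sampling waits until enough items are stored."""

    def __init__(self, min_size_to_sample=1):
        self._items = deque()
        self._min_size = min_size_to_sample
        self._cond = threading.Condition()

    def insert(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify_all()  # wake any waiting sampler

    def sample(self, timeout=None):
        with self._cond:
            # Wait until the table holds at least min_size_to_sample items.
            ready = self._cond.wait_for(
                lambda: len(self._items) >= self._min_size, timeout=timeout
            )
            if not ready:
                raise TimeoutError("not enough items to sample")
            return self._items.popleft()  # each item is sampled once


table = ToyQueueTable(min_size_to_sample=3)
for i in range(4):
    table.insert(i)

# The table size goes 4 -> 3 -> 2, so only two samples succeed.
sampled = [table.sample(timeout=0.1), table.sample(timeout=0.1)]

try:
    table.sample(timeout=0.1)  # size 2 < 3: a real client would wait indefinitely
    blocked = False
except TimeoutError:
    blocked = True
```

In this toy model the last two items stay in the table until more data arrives, which is the "some elements will never be sampled" concern raised earlier in the thread.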

2 participants