processing_rate is a multiplier and not absolute number #77
Comments
processing_rate really should be removed. It's a throttle to restrict the rate items are processed. If you don't want to artificially slow things down, then don't set it.
So there's no way of knowing how many entities each shard will process?
It will process as many as it can in the configured slice interval. Is there some reason you want to restrict it?
I want to restrict the rate to entity-per-task so that it is easier to analyze the logs and track errors.
Then, yes, that is the setting you want. All changes to the value should produce a linear change in the rate.
When I pass "processing_rate":1 as part of mapper_params and examine the logs of /mapreduce/worker_callback, I see that each worker callback processes 8 entities each time. If I set "processing_rate":2, each callback will process 16 entities. On another project I've worked on, the numbers were 15 and 30 (for processing_rate of 1 and 2). So I conclude that the processing_rate param is a multiplier.
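
The arithmetic behind that conclusion can be sketched in a few lines of Python. This is only an illustration built from the numbers quoted above; the names `observations` and `entities_per_unit` are hypothetical and not part of the mapreduce library, and the labels "project A" and "project B" are stand-ins for the two projects mentioned.

```python
# Observations quoted in the report:
# processing_rate value -> entities processed per worker callback.
observations = {
    "project A": {1: 8, 2: 16},
    "project B": {1: 15, 2: 30},
}


def entities_per_unit(samples):
    """Return entities per unit of processing_rate if the samples
    scale linearly with the rate, otherwise None."""
    ratios = {count / rate for rate, count in samples.items()}
    return ratios.pop() if len(ratios) == 1 else None


# Hypothetical helper output: one implied per-unit factor per project.
factors = {name: entities_per_unit(s) for name, s in observations.items()}

for name, factor in factors.items():
    print(name, factor)
```

In both projects the count scales linearly with processing_rate, but the per-unit factor differs between them (8 versus 15), which is why the setting looks like a multiplier rather than an absolute number of entities.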