
Avoid shedding requests on the initial load #4

Open
blind-oracle opened this issue Oct 9, 2024 · 3 comments

Comments

@blind-oracle
When the initial load is applied, little_loadshedder discards a number of requests.

I was wondering if we could add an option to initially just take measurements, and only enable shedding after some time or some number of requests.
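One way the requested behaviour could be sketched is a simple warm-up gate that counts completed requests and keeps shedding disabled until the counter crosses a threshold. This is an illustrative sketch only, not part of little_loadshedder's actual API; the `WarmupGate` type and its method names are hypothetical:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical warm-up gate: shedding stays disabled until
/// `warmup_requests` requests have completed (and been measured).
pub struct WarmupGate {
    warmup_requests: u64,
    completed: AtomicU64,
}

impl WarmupGate {
    pub fn new(warmup_requests: u64) -> Self {
        Self {
            warmup_requests,
            completed: AtomicU64::new(0),
        }
    }

    /// Record one completed request (its latency would still be
    /// fed into the shedder's estimator during warm-up).
    pub fn record_completion(&self) {
        self.completed.fetch_add(1, Ordering::Relaxed);
    }

    /// While warming up, the shedder only measures and never rejects.
    pub fn shedding_enabled(&self) -> bool {
        self.completed.load(Ordering::Relaxed) >= self.warmup_requests
    }
}

fn main() {
    let gate = WarmupGate::new(3);
    assert!(!gate.shedding_enabled());
    for _ in 0..3 {
        gate.record_completion();
    }
    assert!(gate.shedding_enabled());
    println!("warm-up complete: {}", gate.shedding_enabled());
}
```

The gate would sit in front of the shed decision: requests always pass through (and are timed) until the threshold is reached, after which normal shedding logic applies.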

@Skepfyr
Owner

Skepfyr commented Oct 11, 2024

Hmmmm, that's a reasonable request, although I can't immediately see a way to implement it. My thinking when implementing this was that when starting up a service you'd slowly ramp up load to it via an external mechanism, but that does require a lot of other infrastructure.

One option would be to let users specify the initial values for concurrency and queue size (or something equivalent, like concurrency and "average latency at that concurrency"). I'd want to add a nice way of extracting those values from a test instance, but it would let it start up with good guesses for the queue sizes.
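To make the "initial values" option concrete, here is a rough sketch of what such a seed configuration might look like, with the queue length derived from Little's law. The struct and field names are illustrative assumptions, not little_loadshedder's real API:

```rust
/// Hypothetical seed values extracted from a test instance
/// (names are illustrative, not part of little_loadshedder's API).
#[derive(Debug, Clone)]
pub struct InitialEstimates {
    /// Starting concurrency limit instead of the built-in default.
    pub concurrency: usize,
    /// Observed average latency at that concurrency, in milliseconds.
    pub avg_latency_ms: f64,
    /// Target latency the shedder should aim for, in milliseconds.
    pub target_latency_ms: f64,
}

impl InitialEstimates {
    /// Derive an initial queue length from a Little's-law style
    /// estimate: queue ≈ concurrency * (target / observed - 1),
    /// floored at zero when the target is already tighter than
    /// the observed latency.
    pub fn initial_queue_len(&self) -> usize {
        let ratio = self.target_latency_ms / self.avg_latency_ms;
        ((self.concurrency as f64) * (ratio - 1.0)).max(0.0) as usize
    }
}

fn main() {
    let est = InitialEstimates {
        concurrency: 8,
        avg_latency_ms: 50.0,
        target_latency_ms: 100.0,
    };
    // 8 * (100/50 - 1) = 8
    println!("initial queue length: {}", est.initial_queue_len());
}
```

A shedder constructor could accept such a struct to seed its internal estimators instead of starting cold, which is the "good guesses at startup" idea described above.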

The other option, which you've suggested, is to have some sort of grace period. That's possible, but I'm not sure how it would work. How do you decide when the loadshedder should kick in? Also, it currently uses latency measured while at concurrency capacity to determine the best value, and it's not obvious it would get good data if it wasn't limiting the concurrency.

What do you think? What would be the most useful to you?

@blind-oracle
Author

Thanks for the response!

Yes, it's a bit tricky, I agree.

For now I've forked little-loadshedder and implemented a grace period measured in number of requests. While the grace period is active (e.g. the first 1k requests), it just measures the latency of the inner service. But as you said, it's not perfect since we're not limiting the concurrency. I have yet to test this in our production.

Gracefully ramping up the load does not work nicely in certain use cases like ours: we use DNS for load balancing, so the load is applied quite steeply, which leads to false positives.

Frankly, I'm not sure what the right approach would be, but maybe the one you suggested, specifying the initial values for the queue/concurrency semaphores, would be a good fit for most cases. I can also try implementing that approach, especially if the grace-period approach doesn't work as intended, and see how it behaves under real load.

@Skepfyr
Owner

Skepfyr commented Oct 11, 2024

That would be great! As you can probably tell, I'm not maintaining this library particularly actively, but I'll happily make small changes like this and review anything you throw my way.
