You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This task is an invitation to discussion initially started here.
As a software engineer I have an access to monitoring of serveral backend services. Those services heavily rely on OkHttp to perform API calls to other services. I've noticed that on those services operations on AsyncTimeout like scheduling can take significant amount of time - it is top 1 CPU intensive function.
According to DataDog AsyncTimeout also sometimes become a bottleneck due to application wide lock held to perform long CPU intensive operation.
This is presumably due to implementation detail of AsyncTimeout which is implemented as sorted linked list. An insertion to such list is a linear scan which takes more and more time when service has many requests in parallel and probably many timeouts scheduled.
I'll try to propose several options to overcome this issue. They are all different in complexity of implementation, but worth discussion.
As on option it is possibly to use ScheduledThreadPoolExecutor to delegate efficient timeout scheduling to JDK. Under the hood ScheduledThreadPoolExecutorimplemented as binary heap which will provide O(log(n)) operations on list. The drawback is that each task scheduled to ScheduledThreadPoolExecutor is wrapped into internal object and stored into array - which means additional allocations. I personally don't think it will be a big problem, but it worth noting this drawback.
Alternatively it is possible to adjust current implementation of AsyncTimeout which seems to be intrusive singly linked list to be intrusive liked binary heap which will also provide O(log(n)) operations without any extra allocations. The drawback of this approach is that it is much more complex to implement (those manipulations over binary tree are tricky), but this approach to timeouts is known and used by one of the most widespread native library for IO.
As suggested by @swankjesse it also might worth just to change direction of lookup which might reduce amount of elements of list traversed. Presumably due to fact that most of timeouts will be the same. Here in my opinion it will need to be validated that hypothesis about same timeouts is actually true. There might be something like timeout groups coming from different requests with different timeouts. Also different timeout can come from other timers like timers assumed by H2 protocol.
The text was updated successfully, but these errors were encountered:
This task is an invitation to discussion initially started here.
As a software engineer I have an access to monitoring of serveral backend services. Those services heavily rely on
OkHttp
to performAPI
calls to other services. I've noticed that on those services operations onAsyncTimeout
like scheduling can take significant amount of time - it is top 1 CPU intensive function.According to DataDog
AsyncTimeout
also sometimes become a bottleneck due to application wide lock held to perform long CPU intensive operation.This is presumably due to implementation detail of
AsyncTimeout
which is implemented as sorted linked list. An insertion to such list is a linear scan which takes more and more time when service has many requests in parallel and probably many timeouts scheduled.I'll try to propose several options to overcome this issue. They are all different in complexity of implementation, but worth discussion.
As on option it is possibly to use
ScheduledThreadPoolExecutor
to delegate efficient timeout scheduling toJDK
. Under the hoodScheduledThreadPoolExecutor
implemented as binary heap which will provideO(log(n))
operations on list. The drawback is that each task scheduled toScheduledThreadPoolExecutor
is wrapped into internal object and stored into array - which means additional allocations. I personally don't think it will be a big problem, but it worth noting this drawback.Alternatively it is possible to adjust current implementation of
AsyncTimeout
which seems to be intrusive singly linked list to be intrusive liked binary heap which will also provideO(log(n))
operations without any extra allocations. The drawback of this approach is that it is much more complex to implement (those manipulations over binary tree are tricky), but this approach to timeouts is known and used by one of the most widespread native library for IO.As suggested by @swankjesse it also might worth just to change direction of lookup which might reduce amount of elements of list traversed. Presumably due to fact that most of timeouts will be the same. Here in my opinion it will need to be validated that hypothesis about same timeouts is actually true. There might be something like timeout groups coming from different requests with different timeouts. Also different timeout can come from other timers like timers assumed by H2 protocol.
The text was updated successfully, but these errors were encountered: