Cache check_is_active_rm #114
Comments
Each execution takes about 20 seconds, which seems abnormal. Am I using it the wrong way? |
It seems like you must have some network issues. The usage model is to get an RM instance, then use that instance to query applications and manage their lifecycle - each of which hits different endpoints. Seems like using the |
I thought about this question myself a few years ago, but here's why I still haven't done so:
Now, enterprise distributions of Hadoop usually include the Knox gateway, which typically handles HA on its own. If you have a single direct Knox URL, however, you don't need to check whether the cluster is active at all, because it is always considered so. So, in my opinion, here is what we could consider:
|
Good points @dimon222. I definitely think if we were to add any kind of caching, it should be optional (i.e., configurable). I think understanding the current scenario is more warranted though. I've never encountered such a delay. @dimon222 - do you know under what circumstances there might be such a delay to get a response from the |
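For illustration, here is a minimal sketch of what "optional (i.e., configurable)" caching could look like. It is not the library's API: `ActiveRMCache`, `probe`, and `ttl_seconds` are hypothetical names, and `probe` stands in for whatever slow call determines the active RM. Passing `ttl_seconds=None` keeps the current behaviour of probing every time.

```python
import time


class ActiveRMCache:
    """Optionally cache the result of a slow "which RM is active?" probe.

    All names here are hypothetical; `probe` is any zero-argument callable
    that returns the active ResourceManager address.
    """

    def __init__(self, probe, ttl_seconds=None, clock=time.monotonic):
        self._probe = probe
        self._ttl = ttl_seconds      # None disables caching entirely
        self._clock = clock          # injectable so expiry can be tested
        self._value = None
        self._expires_at = None

    def get(self):
        now = self._clock()
        cached = self._ttl is not None and self._expires_at is not None
        if cached and now < self._expires_at:
            return self._value       # still fresh: skip the slow probe
        value = self._probe()
        if self._ttl is not None:
            self._value = value
            self._expires_at = now + self._ttl
        return value

    def invalidate(self):
        """Force the next get() to probe again, e.g. after a failed request."""
        self._expires_at = None
```

Calling `invalidate()` after a request fails would be one way to cope with an HA failover that happens before the TTL expires.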
Honestly, nothing in particular in my experience. Off the top of my head: high network latency or an overloaded underlying YARN platform. I doubt those scenarios would be a surprise... |
Thank you very much for your reply and suggestions. My project runs a long-lived application. I now use pickle to persist the ResourceManager instance obtained during initialization, and it works very well. Still, waiting about 20s for initialization is not a good idea. Maybe this persistence could be configurable? I don't think the standby and active nodes of Hadoop HA switch very frequently. |
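The persistence idea described above could be sketched roughly as follows. This is an assumption-laden illustration, not the commenter's actual code: instead of pickling the whole ResourceManager instance, it stores only the resolved value (for example, the active RM address) together with a timestamp, so a stale entry expires. `load_or_resolve`, `cache_path`, `resolve`, and `max_age_seconds` are hypothetical names.

```python
import os
import pickle
import tempfile
import time


def load_or_resolve(cache_path, resolve, max_age_seconds):
    """Return the value cached at cache_path if it is fresh enough;
    otherwise call resolve() (the slow step) and persist its result."""
    try:
        with open(cache_path, "rb") as fh:
            saved_at, value = pickle.load(fh)
        if time.time() - saved_at < max_age_seconds:
            return value
    except (OSError, EOFError, pickle.UnpicklingError, ValueError, TypeError):
        pass  # missing, unreadable, or malformed cache: resolve again

    value = resolve()

    # Write to a temporary file first, then rename, so a crash mid-write
    # never leaves a half-written cache behind.
    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as fh:
        pickle.dump((time.time(), value), fh)
    os.replace(tmp_path, cache_path)
    return value
```

Since `pickle` can execute arbitrary code on load, the cache file should live somewhere only the application can write to.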
Thanks again, that's a very good direction to think in. I'll dig deeper to find the cause; it may not be what it seems. If my cluster is the only one that waits this long, I will run more tests. |
More than 50,000 applications have been submitted in my production cluster, and the number will continue to increase. If there are so many applications, |
@lxorc thanks for this information. Indeed, it sounds plausible that this happens when the cluster page takes a long time to load. I still have to review the available endpoints, but if there's one that can serve as an active-mode health check, it's a good idea to consider a replacement. That would be on top of the suggestions above (optimization-wise). |
rm = ResourceManager()
A lot of time is spent in the check_is_active_rm function. Could I cache it?