System failures with 12 concurrent Qs #42
Comments
Making it critical based on discussion at UI checkin this morning.
@sharatisrani please try running these queries again. I believe the issue was that the Answer Appraiser was running out of memory and giving the ARS back 502 errors, and then the ARS was reporting a 422. We have bumped up the resource allocation of the Appraiser and I've made sure that it can handle 12 concurrent queries.
Tested again, on CI. There's obvious progress, but 3 out of 12 still gave 422 failures. I also saw some ARAs that were routinely returning 0 results, which I doubt they intended, and, more specifically, Error 429s from ARAX for MVP2. So I have opened several other issues. All are marked showstopper until the decider (Tyler perhaps) decides they are not showstoppers. All will be discussed in the O&O WG on Aug 31. The 422 failures may be tied to the ARA failures described above. Hard to know. E.g., they always occurred when ARAGORN and BTE both returned 0 results. Is that a coincidence, or what? Here are the 12 pks. The 422 errors were Sclerosis, Ehlers-D and Gauchers. MVP1 MVP2 Based on the UI checkin meeting this morning, I am converting this to ShowStopper. I think it can be rapidly resolved.
The 3 failures are coming from the node annotator service (I think it's managed by BTE?), which is returning something other than JSON. @sharatisrani could you please open a ticket either in the Feedback repo or on BTE itself wrt this issue?
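For illustration, here is a minimal sketch of how a caller could guard against an upstream service that returns something other than JSON, so the failure is reported with its real cause rather than as a generic error. This is not the ARS code: the function name `parse_annotator_response` and its `(payload, error)` return shape are hypothetical, chosen only for this example.

```python
import json


def parse_annotator_response(status_code, body):
    """Hypothetical helper: return (payload, error); exactly one is None.

    status_code: HTTP status reported by the upstream service.
    body: raw response body (str or bytes).
    """
    # A non-200 status is reported as-is instead of being parsed.
    if status_code != 200:
        return None, f"upstream returned HTTP {status_code}"
    try:
        return json.loads(body), None
    except (json.JSONDecodeError, TypeError) as exc:
        # The body was not valid JSON (for example an HTML error page).
        return None, f"upstream returned non-JSON body: {exc}"
```

With a helper along these lines, an HTML error page from the upstream service would surface as a "non-JSON body" message that names the real cause.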
These 12 Q's were run concurrently (8 MVP1, 4 MVP2), to see the scores that came back from the ARS. Many failures resulted. It is possible that subsystems failed, as there is much new code under O&O, such as the appraiser, novelty calculations, etc. All were run on CI.
Barth's Disease "pk": "b122cf57-459d-4fc7-a907-f3223a81e067",
Familial Insomnia "pk": "b74c5679-9f5d-4a26-8236-bf26d745ffd6",
Mastocytosis "pk": "42e7bc51-05c6-4691-ab64-84d7c66adfc4",
Systemic sclerosis "pk": "808ef931-7047-4f43-b419-25a6b97b900b",
Ehlers-D "pk": "74e04721-bc59-473c-9749-bc23ae454f54",
Gauchers Disease "pk": "46d3dc1e-8ef4-4f44-a14d-e472fe026912",
Nemaline "pk": "ff7fd0d9-8a74-4743-963b-7f36d68fad4a",
T2D "pk": "44a6949f-73f8-4c45-9953-35a35c233098",
BRCA1 "pk": "8dea69f3-4277-447c-8175-4acfae549026",
PCSK9 "pk": "ce4f2375-e6f5-413f-90e8-a4769ddce08d",
MUC5B "pk": "b35b849e-d8d4-4182-b6ea-af8c615f9ef8",
SMARCE1 "pk": "fe5af7af-cea6-48f7-a668-44d5bc9fae8f",
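To make a run like this repeatable, a small script could check all pks in parallel and tally the status codes. The sketch below assumes a caller-supplied `fetch_status(pk)` function that returns the HTTP status for one pk. That function is hypothetical and is deliberately left out, since the ARS endpoint and response format are not specified in this issue.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(pks, fetch_status, max_workers=12):
    """Call fetch_status(pk) for every pk in parallel; return {pk: status}.

    fetch_status is a hypothetical callable supplied by the caller.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so statuses line up with pks.
        statuses = list(pool.map(fetch_status, pks))
    return dict(zip(pks, statuses))


def summarize(results):
    """Count how many pks ended in each status code."""
    return Counter(results.values())
```

Passing the 12 pks above together with a real `fetch_status` would give a per-status count, which makes a statement like "3 out of 12 gave 422" easy to reproduce after each fix.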
Thank you @maximusunc for looking into this - pls reassign based on what you find.