Query never completes #60

Closed
bbonf opened this issue Jan 26, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@bbonf

bbonf commented Jan 26, 2023

For example: //node[@rel="det"] over the CGN corpus on the prod/acc servers.
I tried running several similar queries and they seem to work on gretel4 but fail on 5.

Might be because there are too many results + something related to deployment.

@bbonf bbonf added the bug Something isn't working label Jan 26, 2023
@bbonf
Author

bbonf commented Feb 1, 2023

Queries like this, with a very large number of results, get killed by the Celery timeout (configured to 600 seconds). That is good, but the frontend receives no indication that a timeout or error occurred, so to the user the query seems stuck. (#26)
The second issue is that BaseXService is written to read the entire query response first, and only then proceed with parsing etc. As a result the application retrieves 0 results, while in the background it spends a lot of time receiving all the data before everything is lost.

I think the desirable outcome here would be to show the user as many results as possible while the query is still running. Then when the query times out, show a relevant warning explaining that there are too many occurrences for the query to be fully executed, and that the user should narrow it down.
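To make that outcome concrete, here is a minimal sketch, not the actual GrETEL code. It assumes results can be consumed one at a time from an iterator; `QueryOutcome` and `collect_with_deadline` are hypothetical names made up for illustration. Results are kept as they arrive, and a `timed_out` flag tells the frontend whether to show the "too many occurrences" warning.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class QueryOutcome:
    # Hypothetical container: the results gathered so far, plus a flag
    # the frontend could use to show a "narrow your query" warning.
    results: List[str] = field(default_factory=list)
    timed_out: bool = False


def collect_with_deadline(
    items: Iterable[str],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> QueryOutcome:
    """Consume results one by one until they run out or the budget is spent."""
    outcome = QueryOutcome()
    deadline = clock() + budget_seconds
    iterator = iter(items)
    while True:
        # The deadline is only checked between items, so a single slow
        # item can still overrun the budget.
        if clock() >= deadline:
            outcome.timed_out = True
            break
        try:
            item = next(iterator)
        except StopIteration:
            break  # the query finished within the budget
        outcome.results.append(item)
    return outcome
```

Because the check happens between items, this only helps once the response is read incrementally; it would not interrupt a call that blocks until the whole response has arrived.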

Notes:

  • TBD: in "normal" queries results do come in while the query is still being executed. Is that because we never progress past the first corpus component?
  • Look into replacing BaseXClient.execute() with BaseXClient.iter()
  • Is it better to kill the query when there are too many results coming in (e.g. per component), before reaching the 600 seconds timeout?
  • Alternatively, if we are able to display partial results (showing that the query is too broad) and implement the cancel query button (#33), we could add a message along the lines of "this query is taking too long to execute, would you like to cancel it?"
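
The per-component cut-off from the third note could look roughly like the sketch below. It is only an illustration under assumptions: each corpus component is modelled as a plain iterable of hits, and `query_components` and `max_per_component` are hypothetical names, not part of the existing code base.

```python
from itertools import islice
from typing import Dict, Iterable, List, Tuple

_SENTINEL = object()


def query_components(
    components: Dict[str, Iterable[str]],
    max_per_component: int,
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Take at most max_per_component hits from each component.

    Returns the (component, hit) pairs that were kept and the names of
    the components whose results were cut off.
    """
    results: List[Tuple[str, str]] = []
    truncated: List[str] = []
    for name, hits in components.items():
        iterator = iter(hits)
        for hit in islice(iterator, max_per_component):
            results.append((name, hit))
        # Peek at one more hit to find out whether the cap cut anything off.
        if next(iterator, _SENTINEL) is not _SENTINEL:
            truncated.append(name)
    return results, truncated
```

A non-empty `truncated` list would be the signal to tell the user that the query is too broad, well before the 600-second timeout is reached.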

@bbonf
Author

bbonf commented Feb 23, 2023

With #63 and #81 this works well on acc. This issue can be closed after merging.

A separate issue could be made for gracefully handling timed-out queries, but that would be of lower priority (see #26).

@bbonf bbonf closed this as completed Feb 23, 2023