OCR fails on batch ingest #251

Open
ajstanley opened this issue Apr 20, 2022 · 5 comments
Comments

@ajstanley
Contributor

I thought we had this one sorted, but when we run a batch import with images going through Tesseract, the process manager tries to run more than one instance at a time (I think it goes for five). That overwhelms the container, and instead of new OCR we get nothing but timeouts. A single process uses about 65% of the CPU cycles.

When I'm doing migrations I pull the existing OCR over instead of creating new, but this won't work if there is no existing OCR.

It continues to work well for one-at-a-time ingestion.
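
The failure mode described here (several Tesseract processes competing for one container's CPU) is the kind of thing a concurrency cap is meant to prevent. The following is a minimal sketch of that idea only, not how ISLE-DC or its process manager actually dispatches OCR. The function names `tesseract_command`, `run_ocr`, and `ocr_batch`, the 300-second timeout, and the default of one worker are all assumptions made for illustration. It assumes the `tesseract` CLI is on the path and that setting `OMP_THREAD_LIMIT=1` is acceptable for the workload.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def tesseract_command(image_path: str) -> List[str]:
    """Build a Tesseract CLI call that writes recognized text to stdout."""
    return ["tesseract", image_path, "stdout"]


def run_ocr(image_path: str, timeout: float = 300.0) -> str:
    """Run one Tesseract process and return its text output (hypothetical helper)."""
    # Assumption: limit each Tesseract process to a single OpenMP thread
    # so that one job cannot spread across every core in the container.
    env = dict(os.environ, OMP_THREAD_LIMIT="1")
    result = subprocess.run(
        tesseract_command(image_path),
        capture_output=True,
        text=True,
        check=True,
        timeout=timeout,
        env=env,
    )
    return result.stdout


def ocr_batch(
    image_paths: Iterable[str],
    max_workers: int = 1,
    worker: Callable[[str], str] = run_ocr,
) -> List[str]:
    """OCR a batch while never running more than max_workers jobs at once."""
    # The pool size is the concurrency cap; results come back in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, image_paths))
```

With `max_workers=1` a batch behaves like a series of single ingests, which is the case reported as working. Raising the cap would only make sense if the container has CPU headroom beyond the roughly 65% that one process is reported to use.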

@noahwsmith
Contributor

Which container tag are you testing on, Alan?

@ajstanley
Contributor Author

ajstanley commented Apr 21, 2022 via email

@noahwsmith
Contributor

And with a new clone of ISLE-DC, or no?

@ajstanley
Contributor Author

ajstanley commented Apr 21, 2022 via email

@ajstanley
Contributor Author

ajstanley commented Apr 21, 2022

But let me run a few more tests before you do anything - I've run into some other problems on the server that may have obscured the real problem.
