[BUG]: White screen crash during indexing/embedding. Too many docs in workspace? Workspace needs a cleaning up function? #928
Comments
What type of files are you uploading to the document uploader? The crash might be caused by hidden files that are incomplete. Also, why are you uploading 3000 files at one time? The lag happens because you are loading so many files at once that your CPU cannot keep up with rendering all 3000 of them. If you want to embed this many files without relying on your CPU, we suggest using a cloud embedding model like OpenAI, since it can handle much more than your local CPU can.
@shatfield4 thank you for your response. I'm uploading PDFs, but I'm not uploading them all at the same time. The crash occurs even if I add just a single file. Once the folder is emptied, the system works fine again. Is it necessary to keep those files in that folder? It seems that if I delete them I might get lower-quality responses, but I'm not sure if that's really the case. Do those files take part in the retrieval process, or is only the information in the DB used for that?
It is necessary to keep the files in that folder because each one is a metadata file, and they are how the document picker knows which files are in the workspace and available. If you manually delete those files from the custom-documents folder, the documents are still embedded in your vector database, so even though your workspace says there are no documents, you will still get context from them and RAG will still happen (deleting those files does not make the results less accurate). This makes me think that you have a PDF file that may be corrupt or empty and is causing the crash. Is there a certain file that you can upload to replicate this bug consistently, or does this only happen when you upload lots of documents?
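One way to test the corrupt-or-empty-file theory is to scan the metadata folder directly. The following is only a minimal sketch: it assumes the metadata files are `.json` files (as a later comment in this thread suggests), and the function name `find_suspect_metadata` is made up for illustration. It says nothing about how AnythingLLM itself reads these files.

```python
import json
from pathlib import Path
from typing import List, Tuple


def find_suspect_metadata(folder: Path) -> List[Tuple[str, str]]:
    """Return (file name, reason) for each .json file that is empty or fails to parse.

    Assumption: the metadata files are plain UTF-8 JSON documents.
    """
    suspects = []
    for path in sorted(folder.glob("*.json")):
        # A zero-byte file can never be valid JSON, so flag it without parsing.
        if path.stat().st_size == 0:
            suspects.append((path.name, "empty file"))
            continue
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError) as exc:
            suspects.append((path.name, f"unreadable: {exc.__class__.__name__}"))
    return suspects
```

Calling `find_suspect_metadata(Path(...))` with the custom-documents folder from the issue report returns a list of files worth inspecting first. An empty list would make a single broken metadata file a less likely cause.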
Thank you for explaining. I now understand better how the DB and the JSON files work together. Meanwhile I have uploaded and processed new files. At first everything worked fine and smoothly, but once I reached 2600 files the same error started to occur again. Sometimes after restarting I was able to add new documents, but once the folder contained more than 2700 files the error appeared every time. That is, selecting a file in "My Documents" is slow, and when I press "Move 1 file to workspace" AnythingLLM crashes to a white screen after 20 seconds. So I don't think the error is related to a single corrupt file, but rather to the size of the files in the "custom-documents" folder.
I am seeing this as well. There seems to be some cliff at around 180k vectors. I can upload far more files, but if I try to embed even one more I get the white screen of death, and Task Manager shows no activity related to background processing. Using Desktop v.1.40
Thank you for confirming. But I don't think it's related to the DB and the vectors, because when you delete the files from custom-documents the issue is gone, and that step does not affect the DB.
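To check this observation without losing anything, the files can be moved to a backup folder in batches instead of being deleted, so the step stays reversible. This is a rough sketch only: the helper name `move_metadata` and its `limit` parameter are invented for illustration, and it should be run while the app is closed.

```python
import shutil
from pathlib import Path
from typing import Optional


def move_metadata(src: Path, dst: Path, limit: Optional[int] = None) -> int:
    """Move up to `limit` files from src into dst and return how many were moved.

    Subdirectories are left alone; with limit=None every file is moved.
    """
    dst.mkdir(parents=True, exist_ok=True)
    moved = 0
    # sorted() materializes the listing first, so moving files does not disturb iteration.
    for path in sorted(src.iterdir()):
        if limit is not None and moved >= limit:
            break
        if path.is_file():
            shutil.move(str(path), str(dst / path.name))
            moved += 1
    return moved
```

Moving a few hundred files at a time and retrying the embed after each batch would show roughly where the threshold lies. Swapping the two arguments moves the files back.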
You probably hit a vector DB limit somehow. A fix could be to allow deploying separate LanceDB instances instead of having all the documents connected to a single DB instance, so that after a certain number of vector chunks it "cuts" the DB over into another one.
Look at this:
Adding another voice to this - am also seeing the same behaviour above a certain number of documents.
Moved to #2317
How are you running AnythingLLM?
AnythingLLM desktop app
What happened?
Adding new documents seems to be a bit delayed. Then selecting the file is very laggy. Clicking "Move 1 file to workspace" results in a white screen crash.
The issue seems to be caused by too many files in the workspace. In my case the error appears once the folder contains more than ~2500 files or exceeds 100 MB in size (RAM usage is only at 66%).
If I delete the files in the "custom-documents" folder*, or rather move them to another folder, all files are removed from the workspace and the problem seems to be solved. But I'm not sure whether these files are still needed and whether my workaround is a good solution.
*(C:\Users\[name]\AppData\Roaming\anythingllm-desktop\storage\documents\custom-documents).
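To compare a given installation against the thresholds reported above (roughly 2500 files or 100 MB), the folder can be measured with a few lines of Python. This is a minimal sketch with a hypothetical helper name, `folder_stats`. It counts only files directly inside the folder and makes no assumptions about their contents.

```python
from pathlib import Path
from typing import Tuple


def folder_stats(folder: Path) -> Tuple[int, float]:
    """Return (number of files, total size in MB) for files directly inside folder."""
    sizes = [p.stat().st_size for p in folder.iterdir() if p.is_file()]
    return len(sizes), sum(sizes) / (1024 * 1024)
```

Running this against the custom-documents folder before and after the crash starts appearing would help tell apart a file-count limit from a total-size limit.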
Are there known steps to reproduce?
System: 14-Core Intel Xeon E5-2690 v4, 3166 MHz on MSI X99S Gaming 7 (MS-7885), 8x 16 GB DDR4-3200 DDR4 SDRAM, NVIDIA GeForce GTX 970, Windows 10 Pro 10.0.19045.4170, AnythingLLM 1.3.1
It could be that the error only occurs once the mentioned folder contains more than 3000 files.