Slow importing with foreign keys #5097
Comments
Hey @ericmock, thanks for reaching out about this. We have recently done some work on data import (e.g. import performance overview and benchmarks, optimizing blob writes in imports); however, unique key constraints and foreign key constraints are still known to be expensive to check during larger imports. Our recommendation right now is to split up the data import into two steps – 1) importing all the data first, and then 2) applying unique key and foreign key constraints. Alternatively, you can also disable foreign key checks temporarily while you run the import (i.e. Let us know what you think and if any of those suggestions are helpful.
I'd been breaking things into smaller chunks as a workaround. It still seemed to get bogged down. A garbage collection seemed to help a bit. I was not aware you could turn off the foreign key checks. Is that persistent across imports?
If you want this behavior to be global in a Just note that when you turn Glad you found the garbage collection tip! We're working on some GC enhancements that should remove the need to manually run GC, but until then, explicitly running GC after an import is a great practice. btw... in case you didn't already find this... we have a guide for importing data dumps into Dolt, and there are a few best practices/tips listed at the end (including GC and a small mention of foreign keys). If you have any feedback on how that guide could have helped you more, we'd be happy to improve those docs.
Thank you for the great information. I had never scrolled to the end of that importing guide. That said, it might be good to put those tips at the top of the page. I suspect most users will find the section for the type of data they want to import and not read further down. If I import with foreign key checking off, add data, then turn checking back on, Dolt does not rescan the database to determine if constraints are violated, right? What if I try to merge data that contains foreign-key violations into a database that has foreign-key checking turned on?
You can run this: https://docs.dolthub.com/sql-reference/version-control/dolt-sql-procedures#dolt_verify_constraints, or the command-line version.
Thank you. Would there be any performance benefit to turning off constraint verification during import and then ensuring nothing was violated with the above command afterward?
It will be way faster. Turning off foreign key checking is very common on import.
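To illustrate the pattern discussed above (skip foreign-key checks during the bulk load, then verify everything in one pass), here is a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in rather than Dolt, so the pragmas shown are SQLite's, not Dolt's; in Dolt the verification step would be the `dolt_verify_constraints` procedure linked above. The `parent`/`child` tables and the `bulk_import` helper are hypothetical names chosen only for this illustration.

```python
import sqlite3


def bulk_import(conn, parents, children):
    """Load rows with FK enforcement off, then return any FK violations."""
    # Assumption: SQLite pragmas stand in for Dolt's own FK-check toggle.
    conn.execute("PRAGMA foreign_keys = OFF")  # skip per-row FK checks during the load
    with conn:  # one transaction for the whole load; commits on success
        conn.executemany("INSERT INTO parent (id) VALUES (?)", parents)
        conn.executemany(
            "INSERT INTO child (id, parent_id) VALUES (?, ?)", children
        )
    conn.execute("PRAGMA foreign_keys = ON")  # re-enable; existing rows are not rescanned
    # Explicit verification pass: one result row per violating row.
    return conn.execute("PRAGMA foreign_key_check").fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE child ("
    "id INTEGER PRIMARY KEY, "
    "parent_id INTEGER REFERENCES parent(id))"
)

# The last child row points at a parent that does not exist.
violations = bulk_import(conn, [(1,), (2,)], [(10, 1), (11, 2), (12, 99)])
for table, rowid, referenced_table, _fk_index in violations:
    print(f"{table} row {rowid} has no matching row in {referenced_table}")
```

The load itself succeeds even though one row is invalid, and turning checks back on does not flag it; only the explicit verification pass reports the bad row. That is why the import should be followed by a constraint check before the data is relied upon.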
Thanks. That's great to know. How would you (or can you) turn foreign-key checking off when using
Yup, you can also turn off foreign-key checks when using The Thanks for the good feedback on the import docs (tracking issue)! Let us know how the rest of your import experience goes and if there's any other feedback or questions you have or anything we can do to help.
Thank you for the feedback. I'm embarrassed that, after doing so many imports, I didn't notice the
No need to be embarrassed!! We got some good feedback from you on the import experience and got some takeaways to improve the Data Import guide, so this was definitely useful for us. 😄
I'm gonna close this one out since I think we got everything sorted out now and we've got a separate issue for some import doc updates. Don't be shy about reopening or pinging us on Discord if you still need anything here!
While I have not done any explicit tests, it appears that the time it takes to import data into a table with foreign keys grows roughly as N^2 in the number of rows. This isn't an issue when importing tables with thousands of rows, but it becomes very slow when importing tables with hundreds of thousands of rows. Are there optimizations that could be made?