CSVSource performance with many rows #38

Open
jameswilddev opened this issue Oct 1, 2021 · 0 comments

CSVSource appears to parse the file from the start up to the current page, once per page. As there are only 10 rows per page, a 100,000-row CSV is read and parsed 10,000 times per migration, and each pass is longer than the one before it.
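To put a rough number on that, here is a small back-of-the-envelope sketch in Python (not the library's own code). It assumes the behaviour described above: each page request parses every row from the top of the file through the end of the requested page. The function name `rows_parsed` is made up for illustration.

```python
import math

def rows_parsed(total_rows: int, per_page: int) -> int:
    """Total rows parsed across a whole migration, ASSUMING every page
    request re-parses the file from row 0 to the end of that page."""
    pages = math.ceil(total_rows / per_page)
    # Page p (1-based) costs min(p * per_page, total_rows) parsed rows.
    return sum(min(p * per_page, total_rows) for p in range(1, pages + 1))

for per_page in (10, 100, 1000):
    print(per_page, rows_parsed(100_000, per_page))
```

Under that assumption, 100,000 rows at 10 per page works out to 500,050,000 parsed rows, against 50,050,000 at 100 per page and 5,050,000 at 1000 per page. The cost scales roughly with `total_rows² / (2 × per_page)`, which would explain why raising `perPage` helps so much.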

I'm able to improve migration performance a lot by raising perPage to 100 or 1000, but there might be a better fix here: could CSVSource make a single scan through the file at construction to build an array of the file offsets at which each page starts, then use that array to seek straight to the right section of the file when a page is requested?
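A minimal sketch of the offset-index idea, written in Python rather than against CSVSource's actual code. The names `build_page_offsets` and `read_page` are hypothetical. It assumes a UTF-8 file with one record per physical line, so a quoted field containing a newline would break the index and would need a real CSV-aware scan instead.

```python
import csv
import io

def build_page_offsets(path, per_page, has_header=True):
    """Scan the file once, recording the byte offset where each page starts.
    ASSUMPTION: one CSV record per line (no newlines inside quoted fields)."""
    offsets = []
    with open(path, "rb") as f:
        if has_header:
            f.readline()  # skip the header row
        row = 0
        while True:
            pos = f.tell()
            if not f.readline():
                break  # end of file
            if row % per_page == 0:
                offsets.append(pos)
            row += 1
    return offsets

def read_page(path, offsets, page, per_page):
    """Seek straight to the requested page and parse only its rows."""
    with open(path, "rb") as f:
        f.seek(offsets[page])
        # readline() returns b"" past end of file, so a short last page is fine.
        lines = [f.readline() for _ in range(per_page)]
    text = b"".join(lines).decode("utf-8")
    return list(csv.reader(io.StringIO(text, newline="")))
```

With this layout the file is scanned once up front, and each page request then reads only `per_page` lines. The index costs one integer per page, so 10,000 offsets for the 100,000-row example at 10 rows per page.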
