
Address issues when downloading large matrices as tsv data. #503

Merged 2 commits on Jun 13, 2023


@bmbroom bmbroom commented Jun 13, 2023

These two commits address the memory issues with downloading large TSV files from the NG-CHM (see issue #502).

The first tries to minimize the memory required.

The second shows a warning dialog for large downloads. It's currently set to show at 1 million array elements or more. This is far below the size at which I've ever encountered issues.

The dialog also shows a progress bar during the large download and the user can cancel the download at any time.

The browser can fail with an out-of-memory error when trying to download a
very large data matrix (in my tests hundreds of gigabytes worth).

This patch uses several strategies to increase the download size at which
that happens:

- Gets access windows one row at a time and only for the range of
  requested columns. This minimizes tile cache memory needed.
- Converts rows to tsv format on the fly, so we don't need to convert
  the entire matrix to tsv format at once.
- Constructs a blob using a vector of the row tsv data.  This is more
  memory efficient than manually building a data URL.
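The three strategies above can be sketched roughly as follows. This is a minimal illustration, not the code in these commits: `buildTsvBlob` and the `getRowValues` accessor are hypothetical names standing in for whatever per-row access the viewer actually provides, and the demo matrix is made up.

```javascript
// Sketch only: getRowValues(row, firstCol, lastCol) is a hypothetical accessor
// that returns the values of a single row for the requested column range.
function buildTsvBlob(getRowValues, firstRow, lastRow, firstCol, lastCol) {
  const parts = [];
  for (let row = firstRow; row <= lastRow; row++) {
    // Fetch one row at a time, restricted to the requested columns.
    const values = getRowValues(row, firstCol, lastCol);
    // Convert the row to TSV right away so the raw values can be released.
    parts.push(values.join("\t") + "\n");
  }
  // Hand the array of row strings to the Blob constructor directly,
  // instead of concatenating everything into one data URL string.
  return new Blob(parts, { type: "text/tab-separated-values" });
}

// Tiny in-memory matrix standing in for the real data (illustration only).
const demoMatrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]];
const demoBlob = buildTsvBlob(
  (row, c0, c1) => demoMatrix[row].slice(c0, c1 + 1),
  0, 2, 1, 2
);
```

In a browser, a blob like this would typically be handed to `URL.createObjectURL` and attached to a link with a `download` attribute, so the TSV never has to exist as one large string.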

These steps help but don't eliminate the problem.  (I don't think that's
possible purely in the browser.)  A future patch will display a warning notice to
the user for very large download sizes.

If the number of array elements to download exceeds a threshold (currently
one million array elements), show a warning to the user that the download
may kill the browser due to memory exhaustion.
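As a rough sketch, the threshold check amounts to comparing the element count with a constant. The names below are hypothetical and not taken from the commits. The comparison uses `>=` on the assumption that the warning appears at one million elements or more, as stated in the PR description.

```javascript
// Threshold from the PR description: one million array elements.
const LARGE_DOWNLOAD_THRESHOLD = 1000000;

// Hypothetical helper: true when the selection is large enough to warrant
// the warning dialog. Assumes the warning appears at the threshold itself.
function needsLargeDownloadWarning(numRows, numCols, threshold = LARGE_DOWNLOAD_THRESHOLD) {
  return numRows * numCols >= threshold;
}
```

For example, a 1000 x 1000 selection would trigger the warning, while a 999 x 1000 selection would not.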

If the user chooses to proceed, a progress bar is displayed.

If the warning dialog is displayed, the user can cancel the download.
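One way to get both a progress bar and a working cancel button is to convert the rows in chunks and yield to the event loop between chunks. The sketch below assumes that approach. The commits may do this differently, and `convertRowsInChunks`, `convertWithProgress`, `onProgress`, and `cancelToken` are all hypothetical names.

```javascript
// Converts rows to TSV in chunks, reporting how far along it is after each chunk.
function* convertRowsInChunks(getRowValues, numRows, chunkSize) {
  const parts = [];
  for (let start = 0; start < numRows; start += chunkSize) {
    const end = Math.min(start + chunkSize, numRows);
    for (let row = start; row < end; row++) {
      parts.push(getRowValues(row).join("\t") + "\n");
    }
    // parts is the same accumulating array on every step.
    yield { rowsDone: end, totalRows: numRows, parts };
  }
}

// Drives the generator, updating a progress display and honoring cancellation.
// Resolves to the array of TSV row strings, or null if the user cancelled.
async function convertWithProgress(getRowValues, numRows, onProgress, cancelToken, chunkSize = 100) {
  let parts = [];
  for (const step of convertRowsInChunks(getRowValues, numRows, chunkSize)) {
    parts = step.parts;
    onProgress(step.rowsDone / step.totalRows);
    // Yield to the event loop so the dialog can repaint and handle a cancel click.
    await new Promise((resolve) => setTimeout(resolve, 0));
    if (cancelToken.cancelled) return null;
  }
  return parts;
}
```

Here a cancel button's click handler would set `cancelToken.cancelled = true`, and `onProgress` would update the progress bar with a fraction between 0 and 1.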

For large downloads there is a noticeable delay between when we have
finished all processing (and hide the dialog) and when the browser is
ready to save the file.  The browser can still crash during this time.
I'm not sure if there's anything that we can do about that.
@bmbroom bmbroom changed the title from "Address issues when downloading large matrices as tsv data. See issue #502." to "Address issues when downloading large matrices as tsv data." Jun 13, 2023

@jmelott jmelott left a comment


Will this same issue occur when sending data to the builder, or does it only happen when the user is downloading the selected data as TSV?


@marohrdanz marohrdanz left a comment


This seems to work as expected in my browsers.

@bmbroom bmbroom merged commit 537818c into main Jun 13, 2023
@bmbroom bmbroom deleted the big-download-warning branch June 13, 2023 20:37