Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: use pdf filepath to cache asset index #1191

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

atahanacar
Copy link

@atahanacar atahanacar commented Mar 25, 2024

Don't compare pdf files byte by byte when adding it to OrderedAssetsCache. This causes performance issues as this operation is done for every page, which gets very expensive with many pages and big pdf sizes.

Cache the index of the asset using the pdf's file path.

Closes #1189

Don't compare pdf files byte by byte when adding it to OrderedAssetsCache.

Cache the index of the asset using the pdf's file path.
Copy link

codecov bot commented Mar 25, 2024

Codecov Report

Attention: Patch coverage is 0% with 14 lines in your changes missing coverage. Please review.

Project coverage is 47.74%. Comparing base (c1c3059) to head (094a488).

Files with missing lines Patch % Lines
lib/components/canvas/image/pdf_editor_image.dart 0.00% 10 Missing ⚠️
lib/components/canvas/_asset_cache.dart 0.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1191      +/-   ##
==========================================
- Coverage   47.80%   47.74%   -0.06%     
==========================================
  Files         115      115              
  Lines        9094     9104      +10     
==========================================
  Hits         4347     4347              
- Misses       4747     4757      +10     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@QubaB
Copy link
Contributor

QubaB commented Mar 26, 2024

It will be nice to push this fix as soon as possible because the long saving of notes with large pdf files is annoying (#1189).

@atahanacar
Copy link
Author

I think this should be just a temporary fix as pdf file assets are still overwritten on every save. I will work on a better fix when I have the time, as I think there are more important bugs to fix.

@QubaB
Copy link
Contributor

QubaB commented Mar 26, 2024

I think so.

I suppose that @adil192 did not expect such great use of large pdfs as I observe now. It seems to me that a lot of users do not use Saber to handwriting of notes but to comment pdfs.

Copy link
Contributor

@QubaB QubaB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lines 102-115 in lib/components/canvas/_asset_cache.dart

if (index == -1 && value is List<int>) {
      // Lists need to be compared per item
      final listEq = const ListEquality().equals;
      for (int i = 0; i < _cache.length; i++) {
        final cacheItem = _cache[i];
        if (cacheItem is! List<int>) continue;
        if (!listEq(value, cacheItem)) continue;
        index = i;
        break;
      }
    }
    log.fine('OrderedAssetCache.add: index = $index, value = $value');

should be removed. They are causing slow saving of pdfs by byte to byte comparison.

AddFileIndex and getFileIndex solve this and it is no more needed.

@atahanacar
Copy link
Author

atahanacar commented Mar 28, 2024

I think they should stay for non-file assets (don't know if they exist, will check when I get back home). We can have another method in OrderedAssetCache that accepts a file as an argument, so it uses the new filepath cache implemented in this PR. This way we can also avoid overwriting the entire asset file on every save altogether.

@QubaB
Copy link
Contributor

QubaB commented Mar 28, 2024

This code was added by me and will be executed only when value is List it means file. It was introduced because _cache.indexOf(value); sometimes returned -1 index even if value was already in assets (it is probably problem of calculation of hash inside Object class.)

Imagine situation of two images (pdf files) in a note. When index == -1 (it means that asset is not in cache yet) image bytes will be compared to all already stored assets byte by byte which is slow. But your fix does not allow to call assets.add when it already knows ID of asset.

So in case of more images each new image will be at least once compared to all previously stored images.

Please test your code with more than one pdf files imported.

I hope my explanation is clear.

@atahanacar
Copy link
Author

Yes, it is very clear. Thanks. I will do more testing and see how I can safely remove your code without causing the same issue you fixed. However, in the meantime I think this PR should be merged as soon as possible.

Also as a funny side note, I've been wanting to begin contributing code to this project for a long time but was procrastinating. The long saving bug your fix introduced actually made me finally start it, so I want to thank you for that 🤣

@QubaB
Copy link
Contributor

QubaB commented Mar 28, 2024

The long saving bug your fix introduced actually made me finally start it, so I want to thank you for that 🤣

I am glad you started contributing. I am a FORTRAN programmer of scientific calculations and my contributions are clumsy. Someone else will do it better.

@atahanacar
Copy link
Author

@adil192 please merge this as soon as possible, this is a critical issue that affects all platforms. I can't use the official or F-Droid builds because of this.

@Luciogi
Copy link

Luciogi commented Jun 8, 2024

@adil192 any thoughts on this ?

@QubaB
Copy link
Contributor

QubaB commented Aug 2, 2024

Please @adil192 consider to merge this branch. It improves time of saving note when it contains more large pdf files (avoids comparing each pdf file byte by byte).

@github-actions github-actions bot mentioned this pull request Aug 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Saving note causes freeze for 10-60 seconds depending on the imported pdf's size.
3 participants