Memory usage very high #10

Open
brianmario opened this issue Feb 4, 2021 · 2 comments

Comments

@brianmario

Hello!

First off, I'm so glad this library exists. I've been looking around for a pure-Go par2 implementation for ages and recently came across this project. Nice work!

In my local testing, it appears that this library loads all file data into memory for processing. While that may be fine for a smaller dataset, my use case involves tens of gigabytes of data, which obviously won't fit in memory.

I know this is likely still under active development, but would you consider having the API be stream-based? In a perfect world, I'm imagining everything being based on io.Reader and io.Writer interfaces. That way things can be processed in chunks, and a nice advantage is the source and destination streams aren't limited to being on-disk.
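
To illustrate the chunked approach, here is a minimal sketch of processing an io.Reader one block at a time. The names `blockChecksums` and `checksumString` are made up for this example and are not part of this library, and CRC-32 only stands in for whatever per-block work the real code would do. The point is that at most one block is held in memory, regardless of input size.

```go
package main

import (
	"errors"
	"fmt"
	"hash/crc32"
	"io"
	"strings"
)

// blockChecksums is a hypothetical helper, not this library's API.
// It reads r in blockSize chunks and returns one CRC-32 per chunk,
// reusing a single buffer so memory use stays at one block.
func blockChecksums(r io.Reader, blockSize int) ([]uint32, error) {
	if blockSize <= 0 {
		return nil, errors.New("blockSize must be positive")
	}
	buf := make([]byte, blockSize)
	var sums []uint32
	for {
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			sums = append(sums, crc32.ChecksumIEEE(buf[:n]))
		}
		// io.EOF: nothing left; io.ErrUnexpectedEOF: short final block.
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return sums, nil
		}
		if err != nil {
			return nil, err
		}
	}
}

// checksumString is a convenience wrapper for the demo below; any
// io.Reader (file, network stream, pipe) could be passed instead.
func checksumString(s string, blockSize int) ([]uint32, error) {
	return blockChecksums(strings.NewReader(s), blockSize)
}

func main() {
	sums, err := checksumString("hello, streaming world", 8)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(sums), "blocks")
}
```

Because the function only depends on io.Reader, the source does not have to be a file on disk.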

I would actually love to be able to hook this up to a virtual filesystem via the new io/fs package coming in Go 1.16.
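
As a rough sketch of what that could look like, the example below streams a file out of an `fs.FS` into a checksum. It needs Go 1.16 or later. `checksumFile` and `exampleFS` are hypothetical names for illustration only, and the in-memory `fstest.MapFS` stands in for any virtual filesystem; none of this is this library's API.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"io"
	"io/fs"
	"testing/fstest"
)

// checksumFile is a hypothetical helper: it opens name from any fs.FS
// and streams its contents into a CRC-32, returning the checksum and
// the number of bytes read. The file is never fully loaded into memory.
func checksumFile(fsys fs.FS, name string) (uint32, int64, error) {
	f, err := fsys.Open(name)
	if err != nil {
		return 0, 0, err
	}
	defer f.Close()

	h := crc32.NewIEEE()
	n, err := io.Copy(h, f) // fs.File is an io.Reader
	if err != nil {
		return 0, 0, err
	}
	return h.Sum32(), n, nil
}

// exampleFS builds a tiny in-memory filesystem for the demo.
func exampleFS() fs.FS {
	return fstest.MapFS{
		"data/a.bin": &fstest.MapFile{Data: []byte("hello")},
	}
}

func main() {
	sum, n, err := checksumFile(exampleFS(), "data/a.bin")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes, crc32=%08x\n", n, sum)
}
```

The same function would accept `os.DirFS("some/dir")` or an embedded filesystem without changes.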

I'll try to take a stab at this, but I won't have much time to work on it until March or later. I mostly wanted to get this open to find out whether it was already planned work.

Thanks again!

@akalin
Owner

akalin commented Feb 4, 2021

Yep, it's a known problem. I just implemented the most basic thing that could have worked! 😅

Definitely would like everything to be stream-based -- io/fs might be a good fit, too. There might be some refactoring needed as some parts might do multiple passes over the data (but I'm not sure, I'd have to check). But yeah, this would be a pretty nice win.
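
If some step does turn out to need more than one pass, one possible way to keep it stream-friendly is to ask for an io.ReadSeeker and rewind between passes. The sketch below is only an illustration under that assumption: `sizeThenChecksum` and `sizeThenChecksumString` are made-up names, and the two passes (count bytes, then checksum) are placeholders for whatever the real passes would be.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"io"
	"strings"
)

// sizeThenChecksum is a hypothetical two-pass routine. Pass one counts
// the bytes, then the stream is rewound and pass two computes a CRC-32.
// Neither pass buffers the whole input.
func sizeThenChecksum(rs io.ReadSeeker) (int64, uint32, error) {
	size, err := io.Copy(io.Discard, rs) // pass 1: measure
	if err != nil {
		return 0, 0, err
	}
	if _, err := rs.Seek(0, io.SeekStart); err != nil { // rewind
		return 0, 0, err
	}
	h := crc32.NewIEEE()
	if _, err := io.Copy(h, rs); err != nil { // pass 2: checksum
		return 0, 0, err
	}
	return size, h.Sum32(), nil
}

// sizeThenChecksumString wraps the routine for the demo below;
// *os.File also satisfies io.ReadSeeker.
func sizeThenChecksumString(s string) (int64, uint32, error) {
	return sizeThenChecksum(strings.NewReader(s))
}

func main() {
	size, sum, err := sizeThenChecksumString("hello")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes, crc32=%08x\n", size, sum)
}
```

The trade-off is that sources which cannot seek, such as a network stream, would not work with this shape and would need a single-pass design instead.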

@brianmario
Author

Ah ok no worries :)

Based on what I could see, it would require a bit of refactoring to be stream-based, but it's not impossible by any means.
