Significant slowdown on large files #10
This sounds like a good idea to me! There's a little extra complexity, but it's hidden from the user. And it makes total sense that it would speed things up.
FYI, I started working on this in https://github.com/nmwsharp/happly/tree/compressed_list. The core data structure for storing list properties is changed to a flat list in 82a5dd2. It seems to make parsing binary data ~40% faster. I still need to expose direct access to the flat list via the API.
I took a look at that and it seems fine. Giving access to list properties directly will also help (otherwise, the constructor still gets called many times). I took a crack at a similar thing in our library, including adding helpers for loading points, lines, triangles, and quads. For us, it works really well and scales well. If you care, I can type up a summary here of what one might want to do.
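To illustrate the kind of helper described above, here is a minimal sketch of a triangle-loading helper. The function name and signature are hypothetical, not part of happly or the commenter's library; it simply converts the nested per-face representation into fixed-size arrays in one pass.

```cpp
#include <array>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper in the spirit of "load triangles" convenience
// functions: flatten variable-length face lists into fixed-size
// triangles, avoiding a per-face std::vector in downstream code.
std::vector<std::array<uint32_t, 3>>
toTriangles(const std::vector<std::vector<uint32_t>>& faces) {
  std::vector<std::array<uint32_t, 3>> tris;
  tris.reserve(faces.size());  // single allocation up front
  for (const auto& f : faces) {
    if (f.size() != 3) throw std::runtime_error("non-triangular face");
    tris.push_back({f[0], f[1], f[2]});
  }
  return tris;
}
```

Analogous helpers for points, lines, and quads would differ only in the fixed arity checked and copied.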
Great! Agreed about direct access to the underlying buffers; I can implement an API for that soon. Any tips you can share about helpers are certainly appreciated. I spent a while looking at performance in happly a few weeks ago, although some of the worst offenders (like nested …)

By the way, @mhalber has an awesome benchmark of ply readers/writers here: https://github.com/mhalber/ply_io_benchmark, if anyone is looking for something faster. At least happly does very well in terms of lines of code :)
The current version of happly is quite slow for large files compared to a home-brewed solution I cooked up. The profiler suggests that the problem is allocating many small vectors for list properties. On the Lucy model from the Stanford repository, happly takes about 16 seconds, of which 7 are just vector allocations. My home-brewed solution takes half that time.
I propose the following changes:

- Change the list-property storage from `vector<vector<T>> data` to three vectors: `std::vector<size_t> start; std::vector<uint8_t> count; std::vector<T> data;`, where `data` holds the concatenated list elements, `start` holds the starting index of each list, and `count` contains the list sizes.
- Add `getListProperty(vector<array<T, N>>& data, vector<uint8_t>& count)` to read the data into preallocated lists; maintain the previous versions for backward compatibility.

If this sounds good, I may even take a crack at it, but only if this feels right.
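The proposed three-vector layout can be sketched as follows. This is an illustrative mock-up, not happly's actual API; the struct and member names are invented for the example:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch of the proposed flat (CSR-style) storage for list properties.
// All list elements live in one contiguous buffer, so parsing performs
// a handful of large allocations instead of one small vector per list.
template <typename T>
struct FlatListProperty {
  std::vector<size_t> start;   // starting index of each list in `data`
  std::vector<uint8_t> count;  // length of each list
  std::vector<T> data;         // concatenated elements of all lists

  void push(const T* elems, uint8_t n) {
    start.push_back(data.size());
    count.push_back(n);
    data.insert(data.end(), elems, elems + n);
  }

  // Direct view of the i-th list; no per-list vector is materialized.
  const T* list(size_t i) const { return data.data() + start[i]; }
};
```

The `start`/`count` split is slightly redundant (`count[i]` is recoverable from consecutive `start` entries plus the total size), but keeping both makes random access and size queries trivial.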