-
-
Notifications
You must be signed in to change notification settings - Fork 379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GeoPatch/GeoSphere refactoring #5979
base: master
Are you sure you want to change the base?
GeoPatch/GeoSphere refactoring #5979
Conversation
There should be a planet icon in the set already, but I can draw another one if needed/wanted |
I'm going to have to break out NSight and dig into some of this code to get an accurate read on the GPU performance implications - currently productively procrastinating by working on other areas of the render code so it will likely be a day or more before I start going through the code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to add comments here incrementally as I have time today.
Ran it through NSight - thanks for making the distance sort a configurable option! I'm seeing similar results to what you described; terrain rendering takes 0.5-0.6ms regardless of option, and we're strongly limited by register allocation and basic floating-point unit throughput (no surprise there). Assuming we have early-Z fragment discard working properly (no way to check as far as I can tell) doing a distance sort could allow ~5-10% of fragments to be skipped in the best case. On a 2070 Super, that is a negligible difference in frame timing. On a laptop GPU or handheld device, that could be a few tenths of a millisecond (or more). I definitely think its an optimization worth keeping. Completely spitballing here, but have you considered frustum-OBB culling for the horizon culling method instead? I'd expect you'd get a significantly more accurate cull (assuming maximum-feature-height per patch is correctly implemented) at the expense of more CPU time. Frustum-OBB can potentially be decomposed to 5x AABB-Plane tests, which can be quite cheap (note: the link computes intersection, but could likely be amended to compute is-on-front). The main cost would be transforming each plane of the frustum to the orientation of the OBB (as well as the cost of storing the OBB). |
pioneer-profile-20241122_140455.zip |
Honestly I never considered doing 90% of this PR 🤣 I just got distracted whilst doing the padding and size redunction! So I'd not looked into more optimal culling designs yet. Also when I looked at the wireframe view it really didn't seem like there's a lot of easy wins left with round planets 🤷 Still might worth looking into something like Improved Frustum Culling eventually. However I don't want this PR to grow and grow. |
Heh, I know the feeling. The distance-sort change is definitely a win and there are a few other nice improvements in the PR as well.
It's probably not necessary at this point. I don't imagine we'd see a significant performance improvement, and I suspect half of the culling problem is that the horizon culling is culling based on the nominal radius of the SystemBody rather than the minimum feature height of the visible horizon. That's a non-trivial thing to compute however, as it varies based on view direction. I'll continue reviewing this as I have time - going to bounce around to a few other open PRs. The most recent commits you pushed look good, though if we adopt distance sort for terrain I'm not quite sure we need the terrain debug info tab? |
I have other uses planned for it, like freezing updates, toggling new rendering stuff I want to try out, pulling out some stats about updates, and generation latency between request and receive, how many patches we have at each depth level, etc. Things that don't seem to fit elsewhere but I hope will teach me more about how the terrain performs so I can think about refactoring it. |
I had a chance to run this on an old laptop with integrated graphics last night, it makes a big difference to the framerate on low-end hardware! I'd have to check again for get consistent framerate figures but it was ~20-25% improvement. |
Nice work! I'll give this a closer review soon. |
@impaktor I know the rendering stuff isn't your favourite area, but would you mind reviewing this with an eye to helping me cleanup the commit history as I'd really like to get this in sometime soon? |
I don't really have a good workflow / tools to go through githistory of a line (there's of course a mode for that in Emacs, but I've not tried it), so just skimming through these greek hieroglyphs and caching them to gray matter, looks like there would be some squash / fixup (if not familiar with rebase: "fixup" will squash the commit with the one above it): pick Label3DWrapper to wrap Label3D from Scenegraph for use elsewhere 2f9d3e0 pick Rework GeoPatch to reduce size 2871b33 pick Experiment sub-centroid patch culling f85f62c @fluffyfreak Do you want me to do rebase and force push back to this branch? |
My apologies for the delay - things have been quite busy on my end. I most likely won't be able to give this a full review until early next week, but cleaning up the git history will help make it easier to review 😄 |
@impaktor yeah that'd be awesome if you could. My work has exploded again. Always like this at xmas so I'm struggling with time and that'd be a big help 👍 @sturnclaw totally understand, life gets busy and time gets short all at once at this time of year |
…order. Absolutely no performance improvement at all! Possibly slightly worse.
- Created method IsPatchVisible to unify patch visibility testing - Don't test patch depth zero against horizon culling - Fixes issue 5806
Define 5-9 smaller spheres to use in horizon culling. Idea being that you can get tighter culling without a single large sphere visible over the horizon giving false positive visibility
Only used if DEBUG_PATCHES defined as 1
- Reimplement the Scenegraph::Label3D for Drawables - Might convert Scenegraph::Label3D to wrap this!
I don't even know why this got added
- used wireframe planet icon for Terrain tab - MSVC requires use of u8 before icon \u string names - added comment explaining icon name values
- remove defunct Centroid accessor
ef545f2
to
f0c8437
Compare
I've gone throuhg it. I expect the commits of a branch to be on the top of So I did a hard reset of this branch to master, then cherry-picked each commit manually to bring them to the front, then rebased 3 commits. Additional rebase could be done?I did get some strange conflicts when trying to rebase the "Core/Log.h" (so I aborted that one) pick 6ff8d36 * Rework GeoPatch to reduce size And I suspect these two could be squashed: pick bb13e4b * WIP PiGui::PerfInfo::DrawTerrainDebug But I leave that as an exercise to @sturnclaw, if he cares about it enough. And any additional rebase can easily be made now that the commits are at the tip of the branch. ReviewWhat ever requested changes, still apply, I assume, just the commit hash is new so I assume github has marked them as "outdated" Re-pulling your branchSince I've now borked your branch, I think you could do something like (assuming your github repo is
I have a copy of your original branch if you want to revert (plus it's somewhere in reflog as well) |
That is fantastic! Thanks so much @impaktor ! |
How it started:
I wanted to reduce the wasted space and padding in the classes GeoSphere and GeoPatch...
How's it's going:
... I reduced the wasted space and padding in the classes GeoSphere and GeoPatch, and then I went a bit crazy with refactoring things and just trying stuff out!!! 🤣
What else is here?:
I'd better just list things.
and icon neededPerformance Improvement?
Absolutely no performance improvement that I can tell but then this was all done in areas that medium to high-end PCs just don't struggle with. I haven't tested it on the RPi5 very much and not at all since the profiling has been working again.Tested on lower end machines and there is a definite performance improvement on the Raspberry Pi 5 and low-end x86/64 machines! Particularly with integrated graphics.