Interested in building an autocast-friendly version #472
Comments
The old version does not support mixed precision.
@maturk Does the newer version support that? Any thoughts on other ways to optimize for speed/memory, including fp16?
It is not currently supported, I think.
There is functionality in autograd that lets you force particular functions to run in fp32; maybe that helps?
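Presumably this refers to torch.cuda.amp.custom_fwd with cast_inputs=torch.float32, which casts fp16/bf16 CUDA inputs back up to fp32 before a custom autograd.Function's forward runs. A minimal sketch, with placeholder bodies standing in for the actual gsplat CUDA kernels:

```python
import torch
from torch.cuda.amp import custom_fwd, custom_bwd

class ProjectGaussians(torch.autograd.Function):
    """Sketch of wrapping an fp32-only kernel so it is safe under autocast."""

    @staticmethod
    @custom_fwd(cast_inputs=torch.float32)  # fp16/bf16 CUDA inputs are cast to fp32 before forward runs
    def forward(ctx, means3d, scales, quats):
        # The real fp32 CUDA projection kernel would be called here;
        # this placeholder just passes the means through.
        ctx.save_for_backward(means3d, scales, quats)
        return means3d.clone()

    @staticmethod
    @custom_bwd  # backward runs with autocast disabled, so dtypes match forward
    def backward(ctx, grad_out):
        # Placeholder gradients corresponding to the pass-through forward.
        return grad_out, None, None
```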
Can confirm, though you will have to clone the repo first and add it yourself. Other than that it works like a charm; you can now train in fp16/bf16.
Use case:
While trying to use gsplat for inference at scale (32+ GS inferences in parallel on a single device), I want to use something like torch.autocast around gsplat to run inference in mixed precision or Float16. This would reduce memory consumption and probably speed things up. A quality drop is not a significant concern, but speed is. Currently my gaussian splats take up 1 GB each, so memory is not a significant concern at the moment, but I expect them to grow in size as I progress.
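Roughly, the intended usage would look like the sketch below, where render_splats and the scene structure are hypothetical stand-ins for gsplat's projection and rasterization calls:

```python
import torch

def render_batch(render_splats, scenes, viewmats):
    # render_splats is a hypothetical stand-in for gsplat's
    # project_gaussians + rasterize calls; scenes is a list of
    # per-scene gaussian parameter tensors (illustrative structure).
    images = []
    with torch.no_grad(), torch.autocast(device_type="cuda", dtype=torch.float16):
        for params, viewmat in zip(scenes, viewmats):
            images.append(render_splats(params, viewmat))
    return images
```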
Potential solution:
I am not sure if this already exists or if there is support for it, but I would like to use fp16 for everything, or mixed precision. This would reduce GPU memory consumption, leaving space for other operations happening in parallel, and potentially speed things up as well.
When I try to wrap project_gaussians in torch.autocast, I get errors like Expected Float, got Half. I am using an older version, v0.1.11, so I am not sure whether the problem lies there. I am interested in working on this and helping build a solution, but I am looking for some guidance. I am open to other solutions to this problem as well, and curious to hear what the team thinks.
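One interim workaround, assuming the kernels themselves must stay fp32 internally, would be to carve out an fp32 island inside the autocast region. fp32_island below is a hypothetical helper, not part of gsplat:

```python
import torch

def fp32_island(fn, *args, **kwargs):
    # Run fn with autocast disabled and every floating-point tensor
    # argument cast up to fp32, avoiding "Expected Float, got Half".
    def up(x):
        return x.float() if torch.is_tensor(x) and x.is_floating_point() else x
    with torch.autocast(device_type="cuda", enabled=False):
        return fn(*[up(a) for a in args], **{k: up(v) for k, v in kwargs.items()})
```

Inside an autocast block, project_gaussians could then be invoked as fp32_island(project_gaussians, ...) with its usual arguments, at the cost of running that call in full precision.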