-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update from bevy 0.12 to 0.14.2 #39
base: main
Are you sure you want to change the base?
Conversation
-- Type naming changes -- Some fairly large changes with separation of graphics world and game world. -- New colours. - WGPU became more fussy: -- Avoiding extreme float values NaN, Inf. -- had to change prim raster buffer packing arrangement 64 bytes to 32 bytes (Rgba32Float x 4 -> Rgba32Float x 1 + Rgba16Float x 2). -- Seems to be more strict about padding and format of textures. - Attempt to avoid type confusion between glam types and spir_v::glam, with conversion where necessary. - updated requirement to nightly-2024-04-24 for rust-gpu. -- Unfortunately, this results in a stubborn: error[E0658]: inline-const is experimental Which I found a work around to get a local version of bevy and add #![feature(inline_const)] to the crate root.
… does involve patching all bevy dependencies to add one line. This might be palatable because it is a similar style to the removed naga patch. - Fixed one last example of newer WGPU versions being fussy with u8 cast as u32.
Amazing, thanks for preparing the changes! I'm a bit busy at the moment, but I should be able to take a look in a couple of days 🙂 |
Thanks! I'm not sure if this is something that makes sense on Github, but maybe this would be best left as a branch for now. I noticed this issue: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, thanks so much for preparing the update - I know things like those can get mundane, especially with the long compilation times, and you did a great job 🙂
I've taken a look and while I haven't reviewed everything yet, I think it's ready for the first round!
One important difference is that the renders are now much dimmer, in both the path-traced and restir-ed modes - before (current main):
... after:
I think this might be related to changes in gbuffer.rs
- base_color
is now encoded in 16-bits and the gamma correction is gone (the .powf(1.0 / 2.0)
thingie), could you take a look at that?
Thanks & cheers;
.map(|v| v.physical_position) | ||
.unwrap_or_default(); | ||
|
||
let fov_y = std::f32::consts::FRAC_PI_4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's with the manual calculations here? 🤔
We can continue to reuse the matrix from Bevy, it's just that the function name changed from projection.get_projection_matrix()
to projection.get_clip_from_view()
(bevyengine/bevy#13489).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I found it just didn't work using the bevy projection, and you are correct this is lazy, so I tried to figure out the reason.
What I'm thinking now is it comes down to the implementation of get_clip_from_view:
fn get_clip_from_view(&self) -> Mat4 {
Mat4::perspective_infinite_reverse_rh(self.fov, self.aspect_ratio, self.near)
}
The "infinite" bit seems to cause a problem and my guess is it ruins the depth mapping down the line. I have changed it to a half way solution where I take the values from the bevy camera and manually run perspective_rh:
let fov_y = p_projection.fov;
let aspect_ratio = p_projection.aspect_ratio;
let near_plane = p_projection.near;
let far_plane = p_projection.far;
let p_mat = Mat4::perspective_rh(fov_y, aspect_ratio, far_plane, near_plane);
I could have misunderstood something, let me know!
@@ -125,7 +140,7 @@ impl Light { | |||
} | |||
|
|||
pub fn clear_slot(&mut self) { | |||
self.d3.x = f32::from_bits(0); | |||
self.d3.x = f32::from_bits(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔
@@ -50,7 +65,7 @@ impl Light { | |||
Self { | |||
// TODO incorrect | |||
d0: position.extend(25.0), | |||
d1: color.extend(f32::INFINITY), | |||
d1: color.extend(1000_000f32), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not f32::INFINITY
? (including the other places)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My reasoning is this temperamental wgpu run time check. To be clear, setting this particular case back to f32::INFINITY doesn't cause any errors, but elsewhere it does and I'm not sure about the specific rules. For example, in another place changing the large number to:
impl TriangleHit {
pub fn none() -> Self {
Self {
distance: f32::INFINITY,
point: Default::default(),
normal: Default::default(),
uv: Default::default(),
material_id: MaterialId::new(0),
}
}
Leads to this error:
Shader validation error:
┌─ C:\...\bvh_heatmap-main.spv:1:1
│
1 │
│ naga::Expression [23]
Constant expression [23] is invalid
Float literal is infinite
I'm just being consistent not using f32::INFINITY or f32::NaN anywhere in strolle-gpu, as wgpu seems to randomly complain on a case by case basis maybe depending on the spir-v generated? I don't know.
@@ -41,6 +41,21 @@ pub struct Light { | |||
pub prev_d2: Vec4, | |||
} | |||
|
|||
impl Default for Light { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reasoning behind this change? 👀
The comments mention altering the defaults to non-zero radius and range, but considering the place this method is called (strolle/src/lights.rs:111
), I would actually expect it to generate an empty light (with zero radius and range).
Using all zeros for defaults also has a nice side effect of allowing the compiler to use memset
instead of memcpy
when allocating larger arrays - for instance, if you preview the assembly for https://play.rust-lang.org/?version=nightly&mode=release&edition=2021&gist=2bd5c339dab8cdb05c47d3b38d2649c5 you'll see a simple callq *memset@GOTPCREL(%rip)
, but changing the default range to 32.0
requires compiler to generate an explicit loop, which might be slower.
Those zero-defaults don't matter here for performance, I just thought it'll be nice to mention it!
let emissive_g_u = (self.emissive.y.clamp(0.0, 1.0) * 255.0).round() as u32; | ||
let emissive_b_u = (self.emissive.z.clamp(0.0, 1.0) * 255.0).round() as u32; | ||
|
||
let emissive_packed = (emissive_g_u << 8) | emissive_b_u; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the second gbuffer texture is now Rgba16Float
, this is invalid and will cause the read values to be different:
#![feature(f16)]
fn main() {
// Shader A generates the values and stores them:
let emis_g = 50;
let emis_b = 25;
let out = f32::from_bits((emis_g << 8) | emis_b);
// When shaders stores pixels into the texture, GPU automatically converts
// the result to f16 (as specified by the texture format):
let out = out as f16;
// Shader B now loads the values back again, but whoops - they are
// different!
let out = (out as f32).to_bits();
let emis_b = out & 0xff;
let emis_g = (out >> 8) & 0xff;
dbg!(emis_b, emis_g);
}
That's because using f32::from_bits()
etc. here is just a trick to pass arbitrary numbers between consecutive shaders - this works, because when you're writing into a 32-bit data texture, the GPU doesn't care about the contents you're writing, it's all just some data.
But the moment you use a 16-bit texture, you force the GPU to interpret the written numbers strictly as floating-point values, which renders (heh) floating point tricks futile, since casting f32
to f16
preserves value (e.g. 1.23 -> 1.23
), not the underlying bit pattern.
We should either use two 32-bit textures or find another way to encode this information, one that doesn't rely on bit tricks.
base_color.w, | ||
])) | ||
}; | ||
let base_color = self.base_color.clamp(Vec4::ZERO, Vec4::ONE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see you've kinda swapped base_color
and emissive
in here, but doing so actually reduces the quality.
base_color
comes from the triangle's texture, which is always a 24-bit image, one that has 8 bits for each channel (see strolle/src/images.rs:75
) - so encoding this 24-bit base_color
in 3 x f32 = 96 bits is just wasteful.
Emissive, on the other hand, is a different story - while base_color
remains in range 0..255
(or well, 0..=1
if you consider it "float"ed), an emissive color can be arbitrarily large. Rightfully so, emissives are basically lights and you wouldn't cap light's intensity, it can be as bright as it gets!
That's why the original implementation here stored "quantized" base_color
(which wasn't really quantized, since the underlying data is 24-bit anyway) , but it kept emissive
as-is - it's because it's alright for an emissive color to be e.g. rgb(500, 500, 500)
- that'd be just a bright light.
(while it doesn't make sense for base_color
to be rgb(500, 500, 500)
, since base_color
doesn't emit light - base_color
basically tells how much light gets reflected, so it can't end up generating light and thus must remain capped between 0..=1
, aka 0..255
.)
The code before said // TODO doesn't need to use as much space
because there's a clever way of encoding emissives - the rgb9e5 format (used e.g. by Kajiya). If you'd like, you can try to take a stab at it 🙂
The most important thing about rgb9e5 and similar formats is that they try to quantize chromaticity, but preserve luminance - intuitively, for a packed (quantized) emissive color we care more about it's "strength" rather than its "chroma" and we can exploit this fact when quantizing it.
u might need to move this to 0.15 now |
Hi, @glasspangolin - how are things going, do you need some extra help / more time / something's unclear? 🙂 |
Key observations of the commit are:
I hope this is ok! Feel free to tear this apart as it's my first contribution to a repo. I want to learn and I'm happy to go back to the drawing board!