Why does the 2DGS model have 3 scales, and why does the third element of scales have a gradient? #468

Open
insomniaaac opened this issue Oct 27, 2024 · 2 comments

@insomniaaac
I noticed that in examples/simple_trainer_2dgs.py, the 2DGS model has 3 scales, whereas in the original 2DGS repo there are only 2 scales.

Furthermore, I found that in the gsplat CUDA implementation only scales[0] and scales[1] are used, but when I print self.splats["scales"].grad, the third element has a gradient!

@Eightu

Eightu commented Oct 29, 2024

In section 4.1 of the original 2DGS paper, it is explained as follows: '... the scaling factors into a 3×3 diagonal matrix S whose last entry is zero.'
I think that even if the third diagonal element of S is 0, it may still receive a gradient, simply because the gradient computation is carried out in full and the chain rule propagates through every entry. I'm also learning.
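
A minimal PyTorch sketch of this point (my own illustration, not gsplat code): a leaf parameter always receives a gradient tensor of its full shape, and components that never enter the forward computation simply get zeros.

import torch

# Minimal sketch, not gsplat code: an [N, 3] leaf whose third column is never
# used in the forward pass still gets a full-shape .grad tensor; the unused
# column just receives zeros.
scales = torch.randn(4, 3, requires_grad=True)
out = (scales[:, 0] * scales[:, 1]).sum()  # third column is never touched
out.backward()
print(scales.grad.shape)   # torch.Size([4, 3])
print(scales.grad[:, 2])   # all zeros, but the entries exist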

@insomniaaac
Author

In the CUDA kernel:

template <typename T>
__global__ void fully_fused_projection_fwd_2dgs_kernel(
const uint32_t C,
const uint32_t N,
const T *__restrict__ means, // [N, 3]: Gaussian means. (i.e. source points)
const T *__restrict__ quats, // [N, 4]: Quaternions (No need to be normalized): This is the rotation component (for 2D)
const T *__restrict__ scales, // [N, 3]: Scales. [N, 3] scales for x, y, z
const T *__restrict__ viewmats, // [C, 4, 4]: Camera-to-World coordinate mat
// [R t]
// [0 1]
const T *__restrict__ Ks, // [C, 3, 3]: Projective transformation matrix
// [f_x 0 c_x]
// [0 f_y c_y]
// [0 0 1] : f_x, f_y are focal lengths, c_x, c_y is coords for camera center on screen space
const int32_t image_width, // Image width pixels
const int32_t image_height, // Image height pixels
const T near_plane, // Near clipping plane (for finite range used in z sorting)
const T far_plane, // Far clipping plane (for finite range used in z sorting)
const T radius_clip, // Radius clipping threshold (throw away small primitives)
// outputs
int32_t *__restrict__ radii, // [C, N] The maximum radius of the projected Gaussians in pixel unit. Int32 tensor of shape [C, N].
T *__restrict__ means2d, // [C, N, 2] 2D means of the projected Gaussians.
T *__restrict__ depths, // [C, N] The z-depth of the projected Gaussians.
T *__restrict__ ray_transforms, // [C, N, 3, 3] Transformation matrices that transform xy-planes in pixel spaces into splat coordinates (WH)^T in equation (9) in paper
T *__restrict__ normals // [C, N, 3] The normals in camera spaces.
) {

Only scales[0] and scales[1] are used, even though the shape of scales is [N, 3]:
mat3<T> RS_camera =
R * quat_to_rotmat<T>(glm::make_vec4(quats)) *
mat3<T>(scales[0], 0.0 , 0.0,
0.0 , scales[1], 0.0,
0.0 , 0.0 , 1.0);

However, I found that in the backward pass:

std::tuple<torch::Tensor, torch::Tensor, torch::Tensor, torch::Tensor>
fully_fused_projection_bwd_2dgs_tensor(
// fwd inputs
const torch::Tensor &means, // [N, 3]
const torch::Tensor &quats, // [N, 4]
const torch::Tensor &scales, // [N, 2]
const torch::Tensor &viewmats, // [C, 4, 4]
const torch::Tensor &Ks, // [C, 3, 3]
const uint32_t image_width,
const uint32_t image_height,
// fwd outputs
const torch::Tensor &radii, // [C, N]
const torch::Tensor &ray_transforms, // [C, N, 3, 3]
// grad outputs
const torch::Tensor &v_means2d, // [C, N, 2]
const torch::Tensor &v_depths, // [C, N]
const torch::Tensor &v_normals, // [C, N, 3]
const torch::Tensor &v_ray_transforms, // [C, N, 3, 3]
const bool viewmats_requires_grad
) {

Here scales is annotated as [N, 2], so v_scales is [N, 2] too, because it is created with zeros_like:

torch::Tensor v_scales = torch::zeros_like(scales);

However, in the kernel function:

template <typename T>
__global__ void fully_fused_projection_bwd_2dgs_kernel(
// fwd inputs
const uint32_t C,
const uint32_t N,
const T *__restrict__ means, // [N, 3]
const T *__restrict__ quats, // [N, 4]
const T *__restrict__ scales, // [N, 3]
const T *__restrict__ viewmats, // [C, 4, 4]
const T *__restrict__ Ks, // [C, 3, 3]
const int32_t image_width,
const int32_t image_height,
// fwd outputs
const int32_t *__restrict__ radii, // [C, N]
const T *__restrict__ ray_transforms, // [C, N, 3, 3]
// grad outputs
const T *__restrict__ v_means2d, // [C, N, 2]
const T *__restrict__ v_depths, // [C, N]
const T *__restrict__ v_normals, // [C, N, 3]
// grad inputs
T *__restrict__ v_ray_transforms, // [C, N, 3, 3]
T *__restrict__ v_means, // [N, 3]
T *__restrict__ v_quats, // [N, 4]
T *__restrict__ v_scales, // [N, 3]
T *__restrict__ v_viewmats // [C, 4, 4]
) {

scales is marked as [N, 3] and v_scales is [N, 3] too!
In the kernel, the pointer offset calculation is:
vec2<T> scale = glm::make_vec2(scales + gid * 3);

if (warp_group_g.thread_rank() == 0) {
v_quats += gid * 4;
v_scales += gid * 3;
gpuAtomicAdd(v_quats, v_quat[0]);
gpuAtomicAdd(v_quats + 1, v_quat[1]);
gpuAtomicAdd(v_quats + 2, v_quat[2]);
gpuAtomicAdd(v_quats + 3, v_quat[3]);
gpuAtomicAdd(v_scales, v_scale[0]);
gpuAtomicAdd(v_scales + 1, v_scale[1]);
}

Here the v_scales pointer is advanced with a stride of 3, but v_scale and scale are actually glm::vec2!
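
To make the layout assumption concrete, here is a small NumPy sketch of my own (not repo code): with a flattened [N, 3] buffer, an offset of gid * 3 plus reads of only the first two entries is consistent, while the same offset applied to an [N, 2] buffer would land in the wrong Gaussian or run off the end.

import numpy as np

# Sketch of the indexing the kernel performs; illustrative only, not gsplat code.
N = 4
scales3 = np.arange(N * 3, dtype=np.float32).reshape(N, 3)  # [N, 3] layout
flat3 = scales3.ravel()

gid = 2
# kernel-style access: scales + gid * 3, reading two entries (glm::vec2)
kernel_view = flat3[gid * 3 : gid * 3 + 2]
assert np.array_equal(kernel_view, scales3[gid, :2])         # consistent with [N, 3]

# with an [N, 2] buffer, the same gid * 3 offset is wrong:
flat2 = scales3[:, :2].copy().ravel()
wrong_view = flat2[gid * 3 : gid * 3 + 2]                    # reads another Gaussian's row
print(kernel_view, wrong_view)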

I am confused by these inconsistent shape annotations.

Can we just change the 2DGS API to pass scales as [N, 2] and align the behavior with the original 2DGS?
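
In the meantime, a possible caller-side workaround (a hedged sketch only; the padding helper below is something I made up, not part of the gsplat API): keep the learnable parameter as [N, 2], as in the original 2DGS, and append a constant third column before handing the tensor to the current kernels, so the stored parameter never carries a third gradient entry.

import torch

# Illustrative sketch only, not the gsplat API.
scales2 = torch.nn.Parameter(torch.rand(1000, 2))  # [N, 2], as in the original 2DGS

def pad_scales(s2: torch.Tensor) -> torch.Tensor:
    # Append a constant third column so the current [N, 3] buffer layout is
    # satisfied; the pad is not learnable and receives no gradient.
    pad = torch.zeros_like(s2[:, :1])
    return torch.cat([s2, pad], dim=1)               # [N, 3]

scales3 = pad_scales(scales2)
# scales3 is passed wherever the kernel expects an [N, 3] buffer;
# after backward, scales2.grad stays [N, 2].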
