
HQO for scale/zero point #937

Merged: 3 commits into Xilinx:dev on Sep 12, 2024
Conversation

@Giuseppe5 (Collaborator) commented Apr 17, 2024

Implement HQO optimization for scale and zero point.
Caveats:

  • Scale does not seem to perform that well
  • Zero point supports per_tensor/per_channel/per_group (asym only)
  • Scale supports per_tensor/per_channel
  • More numerical tests are needed to evaluate the impact of HQO compared to other techniques
  • Scale implementation is INT-only
  • Zero point implementation does not make sense for FP8 since we mostly do symmetric quantization

Maybe worth merging after #1002 and expanding tests.
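For context, "HQO" here presumably refers to half-quadratic optimization of the quantization parameters. A minimal standalone sketch of what optimizing an asymmetric zero point under a fixed scale might look like is below; `hqo_zero_point`, the closed-form l2 update, and all names are an illustration of the idea, not the PR's actual code:

```python
import torch

def hqo_zero_point(w, scale, bit_width=8, iters=20):
    """Hypothetical sketch of half-quadratic-style optimization of an
    asymmetric zero point, assuming a fixed per-row scale. Illustration
    only; not Brevitas' implementation."""
    qmin, qmax = 0.0, 2.0 ** bit_width - 1.0
    zero_point = torch.zeros_like(scale)
    for _ in range(iters):
        # quantize with the current zero point
        q = torch.clamp(torch.round(w / scale + zero_point), qmin, qmax)
        # closed-form update: argmin_zp ||w - scale * (q - zp)||^2 per row
        zero_point = (q - w / scale).mean(dim=-1, keepdim=True)
    # note: a real asymmetric quantizer would round the zero point to int
    return zero_point
```

Alternating a hard quantization step with a closed-form parameter update is the general shape of half-quadratic methods; per-group support would follow by reshaping `w` so each group is a row.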

# fall back to the cached internal candidate wherever the new candidate is NaN
candidate[torch.isnan(candidate)] = self.internal_candidate[torch.isnan(candidate)]
candidate = self.clamp_min_ste(candidate)  # straight-through clamp keeps it positive
bit_width = self.msb_clamp_bit_width_impl()
int_threshold = self.int_scaling_impl(bit_width)  # integer range for this bit width
@Giuseppe5 (Collaborator, Author) Apr 17, 2024
This is an int-specific implementation; maybe it should be generalized to also cover the minifloat case.
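A standalone rendering of the int-specific path the comment points at might look as follows; this is a reconstruction assuming `int_threshold = 2**(bit_width - 1)` for a signed integer range, with names mirroring the excerpt but not taken from Brevitas:

```python
import torch

def int_scale_from_candidate(candidate, internal_candidate, bit_width=8, eps=1e-8):
    """Hypothetical sketch: turn a learned threshold candidate into an
    integer-quantization scale. Not Brevitas' actual module."""
    # replace NaN entries in the learned candidate with the cached internal one
    nan_mask = torch.isnan(candidate)
    candidate = torch.where(nan_mask, internal_candidate, candidate)
    # keep the threshold strictly positive (stands in for clamp_min_ste)
    candidate = candidate.clamp_min(eps)
    # the 2**(bit_width - 1) range is the int-only assumption being discussed;
    # a minifloat representation would imply a different threshold here
    int_threshold = 2.0 ** (bit_width - 1)
    return candidate / int_threshold
```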

@nickfraser (Collaborator) Sep 4, 2024

> maybe it should be generalized to also cover minifloat case

This looks like it hasn't been resolved to me. Is that true?

src/brevitas/quant/base.py (outdated; resolved)
@nickfraser (Collaborator) left a review comment:

Is it intentional that this only works for Int?

candidate[torch.isnan(candidate)] = self.internal_candidate[torch.isnan(candidate)]
candidate = self.clamp_min_ste(candidate)
bit_width = self.msb_clamp_bit_width_impl()
int_threshold = self.int_scaling_impl(bit_width)

@Giuseppe5 Giuseppe5 requested review from nickfraser and removed request for nickfraser September 12, 2024 13:36
@nickfraser nickfraser merged commit 3a9bcc6 into Xilinx:dev Sep 12, 2024
372 of 374 checks passed
Labels: next release (PRs which should be merged for the next release)