Feat (offload/fx): better buffer/params + call_functional #816

Merged: 1 commit, Jan 30, 2024

Giuseppe5 (Collaborator):
No description provided.

Giuseppe5 merged commit 034168b into Xilinx:optimum on Jan 30, 2024
22 checks passed
Giuseppe5 added a commit that referenced this pull request on Feb 6, 2024
* optimum: initial optimum integration

* Refined solution for offloading

* Fix (optimum): clean-up (#802)

* Fix (optimum): dataloader and forward cleanup (#807)

* Fix (optimum): forward pass + fx (#808)

* FX forward, GPTQ, Export (#809)

* Forward pass with fx and pkv

* Restore eval

* Restore quantization

* Experimental export

* Fix GPTQ + Export

* Fix 2GB ONNX export error

* Fix gptq + speedup

* Feat (offload/fx): better buffer/params + call_functional (#816)

* Fix: typo to setting weight handlers

* Feat (optimum): better call_function FX offload (#817)

* Refactored per row quantization. JIT not working (#818)

* Better structure for QDQ weights (#822)

* Fix (export): flag for torch qcdq export (#823)

* Setup: remove optimum folder (#825)

* Add/fix comments

* Fix llm example

* Misc: pre-commit fix

* Fix (graph/equalize): new transpose interface

* Fix (examples/llm): no constant folding for group quant

---------

Co-authored-by: Nick Fraser