Commit

Update README.md
RaymondWang0 authored Oct 23, 2023
1 parent e4b277d commit de720b4
Showing 1 changed file with 1 addition and 2 deletions.
3 changes: 1 addition & 2 deletions — README.md

@@ -6,6 +6,7 @@ Running large language models (LLMs) on the edge is useful: copilot services (co
 
 This is enabled by LLM model compression technique: [SmoothQuant](https://github.com/mit-han-lab/smoothquant) and [AWQ (Activation-aware Weight Quantization)](https://github.com/mit-han-lab/llm-awq), co-designed with TinyChatEngine that implements the compressed low-precision model.
 
+Feel free to check out our [slides](assets/slides.pdf) for more details!
 
 ### Demo on an NVIDIA GeForce RTX 4070 laptop:
 <table>
@@ -47,8 +48,6 @@ This is enabled by LLM model compression technique: [SmoothQuant](https://github
 </tr>
 </table>
 
-Feel free to check out our [slides](assets/slides.pdf) for more details!
-
 
 ## Overview
 ### LLM Compression: SmoothQuant and AWQ
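The diff's context line names SmoothQuant as one of the compression techniques behind TinyChatEngine. As background, here is a minimal NumPy sketch of SmoothQuant's core per-channel smoothing idea — migrating activation outliers into the weights via a scale `s` so the product is unchanged but the activations become easier to quantize. This is an illustrative toy, not the repo's actual implementation; `alpha=0.5` and the tensor shapes are assumptions.

```python
import numpy as np

def smooth_scales(X, W, alpha=0.5):
    """Per-input-channel smoothing factors: s_j = max|X_j|^alpha / max|W_j|^(1-alpha)."""
    act_max = np.abs(X).max(axis=0)   # per-channel activation range
    w_max = np.abs(W).max(axis=1)     # per-channel weight range
    return act_max ** alpha / w_max ** (1 - alpha)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
X[:, 3] *= 50.0                       # channel 3 is an activation outlier
W = rng.normal(size=(8, 16))

s = smooth_scales(X, W)
X_s, W_s = X / s, W * s[:, None]      # move quantization difficulty from X into W

# The matmul output is mathematically unchanged...
assert np.allclose(X @ W, X_s @ W_s)
# ...while the smoothed activations have a much smaller dynamic range,
# so per-tensor int8 quantization of X_s loses far less precision.
```

After smoothing, both `X_s` and `W_s` would be quantized to low precision (e.g. W8A8), which is the regime the kernels in this repo target.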
