I see that the sample code all covers the Attention block or the MLP block. Can AQT int8 only be used for computations involving model parameters? For example, can the QK score calculation and the score * V calculation also use AQT int8?
Yeah, it can. All einsum/dot_general ops can be quantized, not just the ones that touch parameters.
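For reference, here is a minimal sketch of what that could look like with the Flax wrapper. It assumes the `aqt.jax.v2` API (`aqt_flax.AqtEinsum` as a drop-in for `jnp.einsum`, and `config.fully_quantized` for an int8 config); the einsum equations and tensor shapes are illustrative, not from this thread:

```python
import jax
import jax.numpy as jnp
import flax.linen as nn
from aqt.jax.v2 import config as aqt_config
from aqt.jax.v2.flax import aqt_flax


class QuantizedAttentionCore(nn.Module):
  """QK scores and score * V, both computed as int8-quantized einsums."""

  @nn.compact
  def __call__(self, q, k, v):  # q, k, v: [batch, length, heads, head_dim]
    cfg = aqt_config.fully_quantized(fwd_bits=8, bwd_bits=8)
    # AqtEinsum quantizes both operands of the contraction, so it also
    # covers activation-activation products, not just weight matmuls.
    qk_einsum = aqt_flax.AqtEinsum(cfg)
    pv_einsum = aqt_flax.AqtEinsum(cfg)
    scores = qk_einsum('bqhd,bkhd->bhqk', q, k) / jnp.sqrt(q.shape[-1])
    probs = jax.nn.softmax(scores, axis=-1)
    return pv_einsum('bhqk,bkhd->bqhd', probs, v)
```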
For more advanced cases (e.g. a quantized cache), one has to use QTensor and Quantizer.quant directly. We don't have an example of that in the docs or in the mini-model at the moment.
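Since there's no official example yet, the following is only a rough sketch of that manual path, based on my reading of `aqt.jax.v2` (`aqt_quantizer.quantizer_make`, `Quantizer.quant`, `QTensor.dequant`); these names and signatures are assumptions and may differ between AQT versions:

```python
from aqt.jax.v2 import aqt_quantizer

# Assumed helper: builds a Quantizer configured for int8.
quantizer = aqt_quantizer.quantizer_make(8)

def quantize_for_cache(kv):  # kv: [batch, length, heads, head_dim]
  # quant() is assumed to return the QTensor plus a gradient closure;
  # the closure is unused here since the cache is read at inference only.
  qtensor, _ = quantizer.quant(kv, calibration_axes=[-1])
  return qtensor  # carries the int8 values together with their scales

def dequantize_from_cache(qtensor):
  # A QTensor keeps its scale, so it can be dequantized standalone.
  return qtensor.dequant()
```

The point of QTensor here is that it bundles the quantized values with their scales, so a cache can be stored in int8 and dequantized (or fed to a quantized einsum) when attention is computed.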