Multiplication Ops for int16 #1601
pablogranolabar started this conversation in Ideas
So I am in the process of releasing a research paper on representing int32 and float32 values within int16 variable space. The method is in a similar vein as a time/memory tradeoff attack: instead of traditional addition-based bitwise operators, multiplication operators are used within the same int16 memory space to provide a continuous int32+ or float32+ representation, at the expense of front-end computational resources. That overhead shouldn't be such a big deal soon, given AMD's decision to expand AVX-512 acceleration primitives while Intel is shelving them, so in theory this method could be CPU-accelerated at the tensor level and even plugged into PyTorch using an ATen subclass.
The core idea is that an int16 variable describes a sequence of flags which are combined with multiplication operators to represent a continuous space larger than int32/float32. The POC library will be released in a similar fashion to GNU MP Bignum, the multiprecision library used to wrangle 2048+ bit numbers for things like cryptographic key material generation.
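To make that concrete, here is a minimal sketch in C of one possible interpretation, assuming the int16 flags split into a sign bit, a small power-of-two scale field, and a mantissa. The field layout and the decode_i16 name are hypothetical and almost certainly differ from the encoding in the paper; the point is just that a multiply at decode time lets a single 16-bit word cover a range wider than int16 itself.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical layout, for illustration only (not the paper's encoding):
 *   bit 15      sign flag
 *   bits 11-14  4-bit power-of-two scale
 *   bits 0-10   11-bit mantissa
 * Decoding costs one multiply per value, trading front-end compute for
 * a representable range well beyond plain int16. */
static inline float decode_i16(uint16_t packed) {
    const int32_t mantissa = packed & 0x07FF;
    const int32_t scale    = (packed >> 11) & 0x0F;
    const int32_t sign     = (packed & 0x8000) ? -1 : 1;
    return (float)(sign * mantissa) * (float)(1u << scale);
}

int main(void) {
    /* scale 2^12, mantissa 1500 -> 6144000.0, far outside int16 range */
    const uint16_t packed = (uint16_t)((12u << 11) | 1500u);
    printf("decoded: %.1f\n", decode_i16(packed));
    return 0;
}
```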
The thought would be to first refactor the ggml/llama weight conversion scripts to accommodate the smaller int16 representation, then integrate the float32 decode functions into llama/ggml inference, and then explore the AVX-512 acceleration idea from there.
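On the inference side, the entry point could look something like the per-row dequantization functions ggml already uses for its quantized formats. The sketch below is only a shape: the name and signature are hypothetical, not an actual ggml API, and it reuses the decode_i16 helper from the earlier sketch. The inner loop is also where an AVX-512 path would slot in later (vectorized 16-bit loads, widen, multiply per lane).

```c
#include <stdint.h>

float decode_i16(uint16_t packed); /* helper from the earlier sketch */

/* Hypothetical per-row dequantization hook, loosely modeled on the shape
 * of ggml's existing dequantize-row functions; the name and signature are
 * illustrative, not an actual ggml API. */
void dequantize_row_i16mul(const uint16_t * restrict x, float * restrict y, int k) {
    for (int i = 0; i < k; ++i) {
        y[i] = decode_i16(x[i]); /* one multiply per weight at load time */
    }
}
```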
Thoughts?