Hi FlexGen team! I have a question about your quantization algorithm. Are you using the function run_float_quantization for int4/int8 compression? When I run the test (test_float_quantize), it fails because the quantization params differ from the DeepSpeed version (the ref_out_tensor is the same). The DeepSpeed params can recover the float16 tensor, but the params from run_float_quantize cannot. Thanks!
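For context on what "the params can recover the tensor" means here, below is a minimal sketch of group-wise asymmetric min-max quantization and the matching dequantization. This is an illustration under my own assumptions, not FlexGen's or DeepSpeed's actual implementation; the names quantize / dequantize, the group size, and the (min, scale) param layout are all hypothetical.

```python
# Minimal sketch (not FlexGen's or DeepSpeed's actual kernels): group-wise
# asymmetric min-max quantization to `bits`-bit integers, plus the
# dequantization that uses the stored params (mn, scale) to approximately
# recover the original float16 tensor.
import torch

def quantize(x: torch.Tensor, bits: int = 4, group_size: int = 64):
    """Quantize a float16 tensor group-wise; return the integer codes and
    the params (per-group min and scale) needed to reconstruct it."""
    groups = x.reshape(-1, group_size).float()
    mn = groups.min(dim=1, keepdim=True).values
    mx = groups.max(dim=1, keepdim=True).values
    # Step size per group; clamp to avoid division by zero on constant groups.
    scale = ((mx - mn) / (2 ** bits - 1)).clamp_min(1e-8)
    codes = ((groups - mn) / scale).round().clamp(0, 2 ** bits - 1).to(torch.uint8)
    return codes, mn, scale

def dequantize(codes, mn, scale, shape, dtype=torch.float16):
    """Recover the tensor from the codes and the stored params."""
    return (codes.float() * scale + mn).reshape(shape).to(dtype)

if __name__ == "__main__":
    x = torch.randn(4, 64, dtype=torch.float16)
    codes, mn, scale = quantize(x, bits=4, group_size=64)
    x_hat = dequantize(codes, mn, scale, x.shape)
    # With matching params the round-trip error is bounded by scale / 2 per group.
    print((x - x_hat).abs().max())
```

One possible explanation for the failure described above: if the two implementations agree on the quantized output but store their params under different conventions (e.g., symmetric scale/zero-point vs. asymmetric min/scale, or per-tensor vs. per-group), then codes from one cannot be dequantized with params from the other, even though ref_out_tensor matches. This is speculation on my part, not a confirmed diagnosis.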