
Synchronize quantizer setups for DistributedDataParallel cases #431

Open
vshampor opened this issue Jan 22, 2021 · 1 comment
vshampor (Contributor):
Now that the quantizer setup is decided during create_compressed_model, and for precision init cases the resulting setup depends on the data loaders used for initialization, with DDP each process may receive significantly different data values and therefore compute a different quantizer setup. Since the quantizer setup as a whole is not a torch.Tensor, it cannot be broadcast to all processes using the built-in PyTorch collectives.
A special tensor-only synchronization mechanism is required so that precision init (which determines the quantizer setup) runs in only one process of the DDP group, and the resulting quantizer setup is then broadcast to the other processes in the group.
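A minimal sketch of the tensor-only broadcast this would involve (not NNCF's actual implementation; the helper name broadcast_picklable_object and the run_precision_init usage are illustrative assumptions): pickle the setup on the source rank, broadcast its byte length, then broadcast the byte buffer and unpickle it on the other ranks.

```python
import pickle

import torch
import torch.distributed as dist


def broadcast_picklable_object(obj, src=0):
    """Broadcast an arbitrary picklable object using tensor-only collectives.

    CPU tensors are used here, which assumes a backend such as gloo; with
    NCCL the tensors would need to be moved to the current CUDA device first.
    """
    rank = dist.get_rank()
    if rank == src:
        payload = torch.tensor(list(pickle.dumps(obj)), dtype=torch.uint8)
        size = torch.tensor([payload.numel()], dtype=torch.long)
    else:
        payload = None
        size = torch.zeros(1, dtype=torch.long)

    # Broadcast the serialized size first so receivers can allocate a buffer
    # of the right length, then broadcast the serialized bytes themselves.
    dist.broadcast(size, src=src)
    if rank != src:
        payload = torch.empty(int(size.item()), dtype=torch.uint8)
    dist.broadcast(payload, src=src)

    return obj if rank == src else pickle.loads(bytes(payload.tolist()))


# Hypothetical usage: run precision init only on rank 0, then share the
# resulting quantizer setup with the other ranks of the DDP group.
# setup = run_precision_init(...) if dist.get_rank() == 0 else None
# setup = broadcast_picklable_object(setup, src=0)
```

Recent PyTorch versions also provide torch.distributed.broadcast_object_list, which covers this serialize-then-broadcast pattern out of the box.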

vshampor self-assigned this Feb 3, 2021
fxmarty commented Apr 18, 2023

Hi @vshampor, was this implemented in nncf?
