Using more samples in Mixed Precision quantization¶
In Mixed Precision quantization, MCT assigns a different bit width to each weight in the model, based on the sensitivity of the weight's layer and on a user-defined resource constraint, such as the target model size.
Check out the mixed precision tutorial for more information.
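For illustration, such a constraint is typically expressed as a memory budget for the quantized weights. The snippet below is a minimal sketch, assuming the ResourceUtilization class and its weights_memory argument available in recent MCT releases; the 3 MB budget is an arbitrary example value.

import model_compression_toolkit as mct

# Illustrative example: cap the quantized weights memory at roughly 3 MB.
# ResourceUtilization and weights_memory are assumed from recent MCT releases.
target_ru = mct.core.ResourceUtilization(weights_memory=3_000_000)

In recent MCT releases, this object is passed to the quantization call through the target_resource_utilization argument of mct.ptq.pytorch_post_training_quantization (see the full sketch at the end of this section).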
Overview¶
By default, MCT uses 32 samples from the provided representative dataset for the Mixed Precision search. Using more samples can improve the results, particularly for datasets with high variance.
Trouble Situation¶
The quantization accuracy may degrade when using Mixed Precision quantization with a small number of samples.
Solution¶
Increase the number of samples (e.g. to 64). To do so, set the num_of_images attribute of the MixedPrecisionQuantizationConfig in the CoreConfig to a larger value:
import model_compression_toolkit as mct

# Use 64 representative samples for the mixed precision search (default is 32).
mixed_precision_config = mct.core.MixedPrecisionQuantizationConfig(num_of_images=64)
core_config = mct.core.CoreConfig(mixed_precision_config=mixed_precision_config)
quantized_model, _ = mct.ptq.pytorch_post_training_quantization(...,
                                                                core_config=core_config)
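For completeness, here is a hedged end-to-end sketch that places the configuration above in a runnable context. The model, the representative data generator with random inputs, and the weight-memory budget are all illustrative assumptions rather than code from this page; a target_resource_utilization is included because a resource constraint is typically what makes the mixed precision search take effect.

import numpy as np
import torch
import model_compression_toolkit as mct

# Hypothetical model, used only to make the sketch self-contained.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, kernel_size=3),
    torch.nn.ReLU(),
    torch.nn.Conv2d(16, 8, kernel_size=3),
)

# Hypothetical representative dataset: each yielded item is a list holding one
# input batch. Random data stands in for real calibration samples.
def representative_data_gen():
    for _ in range(64):
        yield [np.random.random((1, 3, 224, 224)).astype(np.float32)]

# Illustrative weight-memory budget: ~75% of an 8-bit-per-weight baseline, so the
# search has to push some layers below 8 bits.
num_params = sum(p.numel() for p in model.parameters())
target_ru = mct.core.ResourceUtilization(weights_memory=int(num_params * 0.75))

# Use 64 samples for the mixed precision search instead of the default 32.
mixed_precision_config = mct.core.MixedPrecisionQuantizationConfig(num_of_images=64)
core_config = mct.core.CoreConfig(mixed_precision_config=mixed_precision_config)

quantized_model, quantization_info = mct.ptq.pytorch_post_training_quantization(
    model,
    representative_data_gen,
    target_resource_utilization=target_ru,
    core_config=core_config,
)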
Note
Expanding the sample size may lead to extended runtime during the Mixed Precision search process.