API Docs¶
Init module for MCT API.
import model_compression_toolkit as mct
ptq¶
pytorch_post_training_quantization: A function to use for post training quantization of PyTorch models.
keras_post_training_quantization: A function to use for post training quantization of Keras models.
gptq¶
pytorch_gradient_post_training_quantization: A function to use for gradient-based post training quantization of PyTorch models.
get_pytorch_gptq_config: A function to create a GradientPTQConfig instance to use for PyTorch models when using GPTQ.
keras_gradient_post_training_quantization: A function to use for gradient-based post training quantization of Keras models.
get_keras_gptq_config: A function to create a GradientPTQConfig instance to use for Keras models when using GPTQ.
GradientPTQConfig: Class to configure GradientPTQ options for gradient-based post training quantization.
qat¶
pytorch_quantization_aware_training_init_experimental: A function to use for preparing a PyTorch model for Quantization Aware Training (experimental).
pytorch_quantization_aware_training_finalize_experimental: A function to finalize a PyTorch model after Quantization Aware Training to a model without QuantizeWrappers (experimental).
keras_quantization_aware_training_init_experimental: A function to use for preparing a Keras model for Quantization Aware Training (experimental).
keras_quantization_aware_training_finalize_experimental: A function to finalize a Keras model after Quantization Aware Training to a model without QuantizeWrappers (experimental).
qat_config: Module to create quantization configuration for Quantization-aware Training (experimental).
core¶
CoreConfig: Module to contain configurations of the entire optimization process.
QuantizationConfig: Module to configure the quantization process.
QuantizationErrorMethod: Enum to select the error method used for quantization parameters' selection.
MixedPrecisionQuantizationConfig: Module to configure the quantization process when using mixed-precision PTQ.
BitWidthConfig: Module to configure the bit-width manually.
ResourceUtilization: Module to configure resources to use when searching for a configuration for the optimized model.
network_editor: Module to modify the optimization process for troubleshooting.
pytorch_resource_utilization_data: A function to compute Resource Utilization data that can be used to calculate the desired target resource utilization for PyTorch models.
keras_resource_utilization_data: A function to compute Resource Utilization data that can be used to calculate the desired target resource utilization for Keras models.
data_generation¶
pytorch_data_generation_experimental: A function to generate data for a PyTorch model (experimental).
get_pytorch_data_generation_config: A function to load a DataGenerationConfig for PyTorch data generation (experimental).
keras_data_generation_experimental: A function to generate data for a Keras model (experimental).
get_keras_data_generation_config: A function to generate a DataGenerationConfig for TensorFlow data generation (experimental).
DataGenerationConfig: A configuration class for the data generation process (experimental).
pruning¶
pytorch_pruning_experimental: A function to apply structured pruning for PyTorch models (experimental).
keras_pruning_experimental: A function to apply structured pruning for Keras models (experimental).
PruningConfig: Configuration for the pruning process (experimental).
PruningInfo: Information about the pruned model such as pruned channel indices, etc. (experimental).
xquant¶
xquant_report_pytorch_experimental: A function to generate an explainable quantization report for a quantized PyTorch model (experimental).
xquant_report_troubleshoot_pytorch_experimental: A function to generate an explainable quantization report, detect degraded layers, and analyze the causes of degradation for a quantized PyTorch model (experimental).
xquant_report_keras_experimental: A function to generate an explainable quantization report for a quantized Keras model (experimental).
XQuantConfig: Configuration for the XQuant report (experimental).
exporter¶
exporter: Module that enables exporting a quantized model in different serialization formats.
trainable_infrastructure¶
trainable_infrastructure: Module that contains quantization abstraction and quantizers for hardware-oriented model optimization tools.
set_log_folder¶
set_log_folder: Function to set the logger path directory and to enable logging.
keras_load_quantized_model¶
keras_load_quantized_model: A function to load a quantized Keras model.
target_platform_capabilities¶
target_platform_capabilities: Module to create and model the hardware-related settings that the optimization process targets, according to the hardware the optimized model will run on during inference.
get_target_platform_capabilities: A function to get a target platform model for TensorFlow and PyTorch.
DefaultDict: Util class for creating a TargetPlatformCapabilities.
wrapper¶
wrapper: Util class for the Model Compression Toolkit (MCT) wrapper API. This class enables users to use MCT easily, without needing to know its internal specifications.
Note: This documentation is auto-generated using Sphinx.