API Docs¶
Init module for MCT API.
import model_compression_toolkit as mct
ptq¶
pytorch_post_training_quantization: A function to use for post training quantization of PyTorch models.
keras_post_training_quantization: A function to use for post training quantization of Keras models.
gptq¶
pytorch_gradient_post_training_quantization: A function to use for gradient-based post training quantization of PyTorch models.
get_pytorch_gptq_config: A function to create a GradientPTQConfig instance to use for PyTorch models when using GPTQ.
keras_gradient_post_training_quantization: A function to use for gradient-based post training quantization of Keras models.
get_keras_gptq_config: A function to create a GradientPTQConfig instance to use for Keras models when using GPTQ.
GradientPTQConfig: Class to configure GradientPTQ options for gradient-based post training quantization.
qat¶
pytorch_quantization_aware_training_init_experimental: A function to use for preparing a PyTorch model for Quantization Aware Training (experimental).
pytorch_quantization_aware_training_finalize_experimental: A function to finalize a PyTorch model after Quantization Aware Training to a model without QuantizeWrappers (experimental).
keras_quantization_aware_training_init_experimental: A function to use for preparing a Keras model for Quantization Aware Training (experimental).
keras_quantization_aware_training_finalize_experimental: A function to finalize a Keras model after Quantization Aware Training to a model without QuantizeWrappers (experimental).
qat_config: Module to create quantization configuration for Quantization-aware Training (experimental).
core¶
CoreConfig: Module to contain configurations of the entire optimization process.
QuantizationConfig: Module to configure the quantization process.
QuantizationErrorMethod: Enum to select the error method used for quantization parameters' selection.
MixedPrecisionQuantizationConfig: Module to configure the quantization process when using mixed-precision PTQ.
BitWidthConfig: Module to configure the bit-width manually.
ResourceUtilization: Module to configure resources to use when searching for a configuration for the optimized model.
network_editor: Module to modify the optimization process for troubleshooting.
pytorch_resource_utilization_data: A function to compute Resource Utilization data that can be used to calculate the desired target resource utilization for PyTorch models.
keras_resource_utilization_data: A function to compute Resource Utilization data that can be used to calculate the desired target resource utilization for Keras models.
data_generation¶
pytorch_data_generation_experimental: A function to generate data for a PyTorch model (experimental).
get_pytorch_data_generation_config: A function to load a DataGenerationConfig for PyTorch data generation (experimental).
keras_data_generation_experimental: A function to generate data for a Keras model (experimental).
get_keras_data_generation_config: A function to generate a DataGenerationConfig for TensorFlow data generation (experimental).
DataGenerationConfig: A configuration class for the data generation process (experimental).
pruning¶
pytorch_pruning_experimental: A function to apply structured pruning for PyTorch models (experimental).
keras_pruning_experimental: A function to apply structured pruning for Keras models (experimental).
PruningConfig: Configuration for the pruning process (experimental).
PruningInfo: Information about the pruned model such as pruned channel indices, etc. (experimental).
xquant¶
xquant_report_pytorch_experimental: A function to generate an explainable quantization report for a quantized PyTorch model (experimental).
xquant_report_troubleshoot_pytorch_experimental: A function to generate an explainable quantization report, detect degraded layers, and analyze the causes of degradation for a quantized PyTorch model (experimental).
xquant_report_keras_experimental: A function to generate an explainable quantization report for a quantized Keras model (experimental).
XQuantConfig: Configuration for the XQuant report (experimental).
exporter¶
exporter: Module that enables exporting a quantized model in different serialization formats.
trainable_infrastructure¶
trainable_infrastructure: Module that contains quantization abstraction and quantizers for hardware-oriented model optimization tools.
set_log_folder¶
set_log_folder: Function to set the logger path directory and to enable logging.
keras_load_quantized_model¶
keras_load_quantized_model: A function to load a quantized Keras model.
target_platform_capabilities¶
target_platform_capabilities: Module to create and model the hardware-related settings that the optimization process targets, according to the hardware the optimized model will run on during inference.
get_target_platform_capabilities: A function to get a target platform model for TensorFlow and PyTorch.
DefaultDict: Util class for creating a TargetPlatformCapabilities.
wrapper¶
wrapper: Util class for the Model Compression Toolkit (MCT) wrapper API. This class enables users to use MCT easily, without needing to know its internal specifications.
Note: This documentation is auto-generated using Sphinx.