utils

Quantization utilities.

Functions

`convert_quantization_axis_to_reduce_axis`	Convert the quantization axis to the reduce axis.
`export_torch_mode`	Context manager enabling the export mode.
`is_quantized`	Check if a module is quantized.
`is_quantized_column_parallel_linear`	Check if a module is a quantized column parallel linear module.
`is_quantized_layer_with_weight`	Check if a module is quantized with weights.
`is_quantized_linear`	Check if a module is a quantized linear module.
`is_quantized_row_parallel_linear`	Check if a module is a quantized row parallel linear module.
`reduce_amax`	Compute the absolute maximum value of a tensor.
`replace_function`	Replace a function with a new one within a context.

convert_quantization_axis_to_reduce_axis(input, axis)

Convert the quantization axis to the reduce axis.

Parameters:

input (torch.Tensor) – The input tensor.
axis (int, tuple, list or None) – The quantization axis. None means per-tensor quantization.

Returns:

The axis to reduce. None means all dimensions are reduced.

Return type:

list
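The relationship between the two axis conventions can be illustrated with a minimal sketch. It assumes that the quantization axis names the dimensions that keep their own scale, so every other dimension is reduced, and that negative indices count from the end. The helper name `quant_axis_to_reduce_axis_sketch` is hypothetical, and it takes the tensor rank `ndim` rather than a tensor so that it runs without PyTorch; it is not the library's implementation.

```python
def quant_axis_to_reduce_axis_sketch(ndim, axis):
    """Hypothetical stand-in: map a quantization axis to the axes to reduce."""
    if axis is None:
        # Per-tensor quantization: signal "reduce everything" with None.
        return None
    if isinstance(axis, int):
        axis = (axis,)
    # Assumption: negative indices are normalized like ordinary Python indices.
    keep = {a % ndim for a in axis}
    # Every dimension that is not a quantization axis gets reduced.
    return [d for d in range(ndim) if d not in keep]


per_channel = quant_axis_to_reduce_axis_sketch(4, 0)     # keep dim 0 of a 4-D weight
last_dim = quant_axis_to_reduce_axis_sketch(3, -1)       # keep the last dim of a 3-D tensor
per_tensor = quant_axis_to_reduce_axis_sketch(2, None)   # per-tensor quantization
```

Under these assumptions, `per_channel` is `[1, 2, 3]`, `last_dim` is `[0, 1]`, and `per_tensor` is `None`.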

export_torch_mode(): Context manager enabling the export mode.

is_quantized(module): Check if a module is quantized.

is_quantized_column_parallel_linear(module): Check if a module is a quantized column parallel linear module.

is_quantized_layer_with_weight(module): Check if a module is quantized with weights.

is_quantized_linear(module): Check if a module is a quantized linear module.

is_quantized_row_parallel_linear(module): Check if a module is a quantized row parallel linear module.

reduce_amax(input, axis=None, keepdims=True, squeeze_scalar=True)

Compute the absolute maximum value of a tensor.

Reduces input along the dimensions given in axis. Unless keepdims is true, the rank of the tensor is reduced by 1 for each entry in axis. If keepdims is true, the reduced dimensions are retained with length 1.

Note

Gradient computation is disabled because this function is not meant to take part in learning; it only reduces amax.

Parameters:

input – Input tensor
axis – The dimensions to reduce. None, an int, or a tuple of ints. If None (the default), reduces all dimensions. Must be in the range [-rank(input), rank(input)).
keepdims – A boolean. If true, retains reduced dimensions with length 1. Defaults to True.

Returns:

The reduced tensor.
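The reduction described above can be sketched in plain PyTorch. The function name `reduce_amax_sketch` is hypothetical, and the sketch is not the library's implementation: it covers only `axis` and `keepdims` and leaves out `squeeze_scalar`, whose behavior is not described here.

```python
import torch


def reduce_amax_sketch(x, axis=None, keepdims=True):
    """Hypothetical stand-in: absolute maximum over the given dimensions."""
    # Gradients are not tracked, mirroring the note above.
    with torch.no_grad():
        if axis is None:
            dims = tuple(range(x.dim()))   # reduce every dimension
        elif isinstance(axis, int):
            dims = (axis,)
        else:
            dims = tuple(axis)
        return x.abs().amax(dim=dims, keepdim=keepdims)


x = torch.tensor([[1.0, -4.0, 2.0],
                  [-3.0, 0.5, -6.0]])

per_tensor = reduce_amax_sketch(x)                        # all dims reduced, kept with length 1
per_row = reduce_amax_sketch(x, axis=1)                   # one value per row
per_col = reduce_amax_sketch(x, axis=0, keepdims=False)   # rank drops by one
```

With this input, `per_tensor` has shape `(1, 1)`, `per_row` has shape `(2, 1)`, and `per_col` has shape `(3,)`, which matches the rank rule stated for `keepdims`.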

replace_function(package, name, new_func): Replace a function with a new one within a context.
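
The idea of swapping a function for the duration of a context can be sketched with the standard library alone. The name `replace_function_sketch` and the `toy` namespace are hypothetical; the sketch assumes the original attribute is restored on exit, even after an exception, which may differ from what the library does.

```python
import contextlib
import types


@contextlib.contextmanager
def replace_function_sketch(package, name, new_func):
    """Hypothetical stand-in: temporarily rebind package.<name> to new_func."""
    old_func = getattr(package, name)
    setattr(package, name, new_func)
    try:
        yield
    finally:
        # Assumption: the original function is always put back on exit.
        setattr(package, name, old_func)


# A toy "package" so the example does not patch any real module.
toy = types.SimpleNamespace(scale=lambda v: v * 2)

with replace_function_sketch(toy, "scale", lambda v: v * 10):
    inside = toy.scale(3)    # uses the replacement
outside = toy.scale(3)       # original is back in place
```

Here `inside` is `30` and `outside` is `6`.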