utils

Quantization utilities.

Functions

reduce_amax

Compute the absolute maximum value of a tensor.

is_quantized

Check if a module is quantized.

is_quantized_layer_with_weight

Check if a module is quantized with weights.

is_quantized_column_parallel_linear

Check if a module is a quantized column parallel linear module.

is_quantized_row_parallel_linear

Check if a module is a quantized row parallel linear module.

replace_function

Replace a function with a new one within a context.

export_torch_mode

Context manager enabling the export mode.

is_torch_library_supported

Check if the installed PyTorch version meets or exceeds a specified version.

get_parallel_state

Get the parallel state.

export_torch_mode()

Context manager enabling the export mode.

get_parallel_state(model, name=None)

Get the parallel state.

Parameters:
  • model – PyTorch model.

  • name – The name of the submodule of the model to get the parallel state from. If None, the parallel state of the model is returned.

Return type:

ParallelState

is_quantized(module)

Check if a module is quantized.

is_quantized_column_parallel_linear(module)

Check if a module is a quantized column parallel linear module.

is_quantized_layer_with_weight(module)

Check if a module is quantized with weights.

is_quantized_row_parallel_linear(module)

Check if a module is a quantized row parallel linear module.

is_torch_library_supported()

Check if the installed PyTorch version meets or exceeds a specified version.

reduce_amax(input, axis=None, keepdims=True)

Compute the absolute maximum value of a tensor.

Reduces input along the dimensions given in axis. Unless keepdims is true, the rank of the tensor is reduced by 1 for each entry in axis. If keepdims is true, the reduced dimensions are retained with length 1.

Note

Gradient computation is disabled because this function is not meant to take part in learning; it only reduces amax.

Parameters:
  • input – Input tensor

  • axis – The dimensions to reduce: None, an int, or a tuple of ints. If None (the default), all dimensions are reduced. Each value must be in the range [-rank(input), rank(input)).

  • keepdims – A boolean. If true, retains reduced dimensions with length 1. Defaults to True.

  • granularity – DEPRECATED. Specifies whether the statistic is calculated at tensor or channel granularity.

Returns:

The reduced tensor.

Raises:
  • ValueError – If any axis is invalid or not supported.

  • ValueError – If unknown granularity is passed in.
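The reduction described above can be illustrated with a minimal sketch built only on public PyTorch calls. The helper name amax_sketch is hypothetical and is not part of this module; it follows the written description of axis and keepdims only, so the real reduce_amax may differ in edge cases such as the exact output shape when axis is None.

```python
import torch


def amax_sketch(x, axis=None, keepdims=True):
    """Hypothetical stand-in that mirrors the documented behavior of reduce_amax."""
    # Gradients are not tracked: the result is a statistic, not a learned value.
    with torch.no_grad():
        if axis is None:
            # Reduce over every dimension.
            out = x.abs().max()
            if keepdims:
                # Keep one length-1 dimension per original dimension.
                out = out.reshape([1] * x.dim())
            return out
        # axis may be an int or a tuple of ints.
        return torch.amax(x.abs(), dim=axis, keepdim=keepdims)


x = torch.tensor([[1.0, -5.0, 2.0], [-3.0, 4.0, -0.5]])

per_tensor = amax_sketch(x)                       # all dimensions reduced
per_row = amax_sketch(x, axis=1)                  # one value per row
per_col = amax_sketch(x, axis=0, keepdims=False)  # rank reduced by 1
```

With keepdims left at True the result can be broadcast directly against the input, which is convenient when the amax is later used as a per-channel scale.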

replace_function(package, name, new_func)

Replace a function with a new one within a context.
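The general technique of swapping a function for the duration of a context can be sketched with the standard library alone. The name swap_attr and the use of math.sqrt are illustrative assumptions; this is not the implementation of replace_function, which may behave differently (for example, in how it exposes the original function).

```python
import contextlib
import math


@contextlib.contextmanager
def swap_attr(package, name, new_func):
    """Hypothetical sketch: temporarily rebind package.<name> to new_func."""
    old_func = getattr(package, name)
    setattr(package, name, new_func)
    try:
        yield
    finally:
        # Always restore the original, even if the body raises.
        setattr(package, name, old_func)


def fake_sqrt(x):
    # Placeholder replacement used only to make the swap visible.
    return -1.0


with swap_attr(math, "sqrt", fake_sqrt):
    inside = math.sqrt(4.0)   # calls fake_sqrt

after = math.sqrt(4.0)        # original function is back
```

Restoring in a finally clause keeps the patch from leaking out of the context when the wrapped code fails.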