dequantize

tripy.dequantize(input: Tensor, scale: Tensor | Number | Sequence[Number] | Sequence[Sequence[Number]], dtype: dtype, dim: int | Any = None) → Tensor

Dequantizes the input tensor.

If dim is not given, this function will perform “per-tensor” or “block-wise” dequantization.

  • For “per-tensor” dequantization, the scale must be a scalar tensor or a single python number.

  • For “block-wise” dequantization, the input's data type must be tripy.int4 and the input tensor must have exactly 2 dimensions, e.g. [D0, D1]. The scale must also be a 2-D tensor or a 2-D python sequence. The first dimension of scale must evenly divide D0, the dimension along which “blocking” is performed. The second dimension of scale must equal D1.
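
The two modes above can be sketched in plain Python. This is an illustrative reference, not tripy's implementation. It assumes that dequantization multiplies each quantized value by its scale (which matches the example outputs on this page) and that, in block-wise mode, each group of D0 // S0 consecutive rows shares one row of scale, where S0 is the first dimension of scale. The helper names dequantize_per_tensor and dequantize_block_wise are hypothetical.

```python
def dequantize_per_tensor(values, scale):
    """Multiply every element of a flat list by a single scalar scale."""
    return [v * scale for v in values]


def dequantize_block_wise(values, scale):
    """Dequantize a 2-D list of shape [D0][D1] with a 2-D scale of shape [S0][D1].

    Assumption: each block of D0 // S0 consecutive rows shares one row of scale.
    """
    d0, d1 = len(values), len(values[0])
    s0 = len(scale)
    if d0 % s0 != 0:
        raise ValueError("first dimension of scale must divide D0")
    if any(len(row) != d1 for row in scale):
        raise ValueError("second dimension of scale must equal D1")
    block = d0 // s0  # number of input rows covered by one scale row
    return [
        [values[i][j] * scale[i // block][j] for j in range(d1)]
        for i in range(d0)
    ]


print(dequantize_per_tensor([1, 2, 3], 0.5))
# [0.5, 1.0, 1.5]
print(dequantize_block_wise([[0, 1], [2, 3]], [[1.0, 1.0]]))
# [[0.0, 1.0], [2.0, 3.0]]
```

With a scale of shape [1, D1], as in the second call, the whole input forms a single block.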

If dim is given, this function will perform “per-channel” dequantization. The scale must be a 1-D tensor or a python sequence with input.shape[dim] elements.
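
As a rough illustration of the per-channel rule, the sketch below applies one scale per index along dim for a 2-D input. It is a plain-Python reference under the assumption that dequantization is an element-wise multiply by the matching scale; it is not tripy's implementation, and dequantize_per_channel is a hypothetical helper name.

```python
def dequantize_per_channel(values, scale, dim):
    """Dequantize a 2-D list, using scale[k] for index k along dimension `dim`."""
    if dim not in (0, 1):
        raise ValueError("this sketch only handles 2-D inputs")
    size = len(values) if dim == 0 else len(values[0])
    if len(scale) != size:
        raise ValueError("scale must have input.shape[dim] elements")
    return [
        [v * scale[i if dim == 0 else j] for j, v in enumerate(row)]
        for i, row in enumerate(values)
    ]


# dim=0: one scale per row; dim=1: one scale per column.
print(dequantize_per_channel([[1, 2, 3], [4, 5, 6]], [0.5, 2.0], dim=0))
# [[0.5, 1.0, 1.5], [8.0, 10.0, 12.0]]
print(dequantize_per_channel([[1, 2, 3], [4, 5, 6]], [1.0, 2.0, 0.5], dim=1))
# [[1.0, 4.0, 1.5], [4.0, 10.0, 3.0]]
```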

Parameters:
  • input (Tensor) – [dtype=T1] The input tensor with a valid quantized data type.

  • scale (Tensor | Number | Sequence[Number] | Sequence[Sequence[Number]]) – [dtype=T2] The scale tensor. Must be a constant tensor.

  • dtype (dtype) – [dtype=T2] The data type after dequantization. Must be tripy.float32 or tripy.float16.

  • dim (int | Any) – The dimension for per-channel dequantization.

Returns:

[dtype=T2] The dequantized tensor.

Return type:

Tensor

TYPE CONSTRAINTS:
Example: Per-tensor dequantization
import tripy as tp

input = tp.Tensor([1, 2, 3], dtype=tp.int8)
scale = 0.99872
output = tp.dequantize(input, scale, tp.float32)
>>> input
tensor([1, 2, 3], dtype=int8, loc=gpu:0, shape=(3,))
>>> output
tensor([0.9987, 1.9974, 2.9962], dtype=float32, loc=gpu:0, shape=(3,))
Example: Per-channel dequantization
import tripy as tp

input = tp.Tensor([[1, 2, 3], [4, 5, 6]], dtype=tp.int8)
scale = [0.99872, 0.96125]
output = tp.dequantize(input, scale, tp.float32, dim=0)
>>> input
tensor(
    [[1, 2, 3],
     [4, 5, 6]], 
    dtype=int8, loc=gpu:0, shape=(2, 3))
>>> output
tensor(
    [[0.9987, 1.9974, 2.9962],
     [3.8450, 4.8063, 5.7675]], 
    dtype=float32, loc=gpu:0, shape=(2, 3))
Example: Block-wise dequantization
import tripy as tp

input = tp.Tensor([[0, 1], [2, 3]], dtype=tp.float32)
scale = [[1.0, 1.0]]
quant = tp.quantize(input, scale, tp.int4)
output = tp.dequantize(quant, scale, tp.float32)
>>> output
tensor(
    [[0.0000, 1.0000],
     [2.0000, 3.0000]], 
    dtype=float32, loc=gpu:0, shape=(2, 2))

See also

quantize()