surgeon_utils

Utilities for performing surgery on an ONNX graph after export.

Functions

clear_inputs

Clear all inputs for a node or tensor in ONNX.

clear_outputs

Clear all outputs for a node or tensor in ONNX.

extract_layer_id

Extract the layer id from an ONNX layer name.

fold_fp8_qdq_to_dq

Convert FP32/FP16 weights of the given ONNX model to FP8 weights.

no_none_elements

Check if all elements in the list are not None.

clear_inputs(node)

Clear all inputs for a node or tensor in ONNX.

Parameters:

node (Node | Tensor)

clear_outputs(node)

Clear all outputs for a node or tensor in ONNX.

Parameters:

node (Node | Tensor)
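
The behavior implied by these two helpers can be sketched with a stand-in Node class instead of the real onnx-graphsurgeon Node/Tensor types (the gs classes likewise expose mutable inputs/outputs lists); the class and function bodies here are illustrative assumptions, not the actual implementation:

```python
# Minimal sketch of clear_inputs/clear_outputs using a hypothetical
# stand-in Node class rather than the real gs.Node/gs.Tensor types.

class Node:
    """Stand-in with gs-like mutable inputs/outputs lists."""
    def __init__(self, name, inputs=None, outputs=None):
        self.name = name
        self.inputs = list(inputs or [])
        self.outputs = list(outputs or [])

def clear_inputs(node):
    # Detach the node from its producers by emptying its input list.
    node.inputs.clear()
    return node

def clear_outputs(node):
    # Detach the node from its consumers by emptying its output list.
    node.outputs.clear()
    return node

relu = Node("relu", inputs=["conv_out"], outputs=["relu_out"])
clear_inputs(relu)
clear_outputs(relu)
print(relu.inputs, relu.outputs)  # -> [] []
```

Clearing both lists fully detaches a node, which is a common first step before deleting it from a graph during surgery.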

extract_layer_id(name)

Extract the layer id from an ONNX layer name.

Parameters:

name (str) – The name of the ONNX layer, e.g. /model/layer.0/q_proj/…

Returns:

The layer id as an int. For the example above, it returns 0.
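
A minimal sketch of such an extraction, assuming a regex over names of the /model/layer.N/… form (the pattern and the example suffixes like MatMul are assumptions, not the actual modelopt implementation):

```python
import re

def extract_layer_id(name: str) -> int:
    """Sketch: pull the decimal layer index out of names like
    '/model/layer.0/q_proj/MatMul' (regex is an assumption)."""
    match = re.search(r"layers?\.(\d+)", name)
    if match is None:
        raise ValueError(f"no layer id found in {name!r}")
    return int(match.group(1))

print(extract_layer_id("/model/layer.0/q_proj/MatMul"))    # -> 0
print(extract_layer_id("/model/layers.12/k_proj/Add"))     # -> 12
```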

fold_fp8_qdq_to_dq(graph)

Convert FP32/FP16 weights of the given ONNX model to FP8 weights.

Even though modelopt supports FP8 ONNX export, the exported weights are stored in FP32 together with Q/DQ node pairs, which makes the model unnecessarily large. This function removes the Q nodes from the weight paths and keeps only the DQ nodes, with the weights converted to FP8 in the output model.

Parameters:

graph (Graph) – The gs.Graph to transform.

Returns:

The gs.Graph with only DQ nodes on weights; the QDQ nodes on activations are left unchanged.
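
The rewrite can be illustrated on a toy graph built from plain dicts rather than gs.Node objects; the real FP8 cast is simulated here with scale-and-round quantization, so everything below is an assumption-laden sketch of the folding idea, not the actual implementation:

```python
import numpy as np

def fold_qdq_to_dq(nodes, tensors):
    """Fold weight -> QuantizeLinear -> DequantizeLinear chains into
    quantized_weight -> DequantizeLinear (toy graph representation)."""
    kept = []
    for node in nodes:
        # Only Q nodes whose first input is a stored constant (a weight)
        # are folded; Q nodes on activations have no constant input.
        if node["op"] == "QuantizeLinear" and node["inputs"][0] in tensors:
            w_name, scale_name = node["inputs"][0], node["inputs"][1]
            q_out = node["outputs"][0]
            # Bake the quantization into the stored tensor
            # (stand-in for the FP32 -> FP8 cast).
            tensors[q_out] = np.round(tensors[w_name] / tensors[scale_name])
            del tensors[w_name]  # drop the FP32 copy
            continue  # the Q node itself is removed from the graph
        kept.append(node)
    return kept, tensors

tensors = {"w": np.array([0.5, -1.0, 2.0], dtype=np.float32),
           "w_scale": np.float32(0.5)}
nodes = [
    {"op": "QuantizeLinear", "inputs": ["w", "w_scale"], "outputs": ["w_q"]},
    {"op": "DequantizeLinear", "inputs": ["w_q", "w_scale"], "outputs": ["w_dq"]},
    {"op": "MatMul", "inputs": ["x", "w_dq"], "outputs": ["y"]},
]
nodes, tensors = fold_qdq_to_dq(nodes, tensors)
print([n["op"] for n in nodes])  # -> ['DequantizeLinear', 'MatMul']
```

After folding, the DQ node reads the pre-quantized weight directly, so the model stores one narrow copy of each weight instead of a wide FP32 copy plus a Q node.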

no_none_elements(elements)

Check if all elements in the list are not None.

Parameters:

elements (list)
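
A one-line sketch of this check (the body is an assumption consistent with the description):

```python
def no_none_elements(elements):
    """Sketch: True iff no element of the list is None."""
    return all(e is not None for e in elements)

print(no_none_elements([1, "a", 0]))   # -> True  (0 is falsy but not None)
print(no_none_elements([1, None, 3]))  # -> False
```

The explicit `is not None` test matters: `all(elements)` would wrongly reject falsy-but-valid values such as 0 or "".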