surgeon_utils

Utilities for performing surgery on an ONNX graph after export.

Functions

clear_inputs

Clear all inputs for a node or tensor in ONNX.

clear_outputs

Clear all outputs for a node or tensor in ONNX.

extract_layer_id

Extract the layer id from an ONNX layer name.

fold_fp8_qdq_to_dq

Convert FP32/FP16 weights of the given ONNX model to FP8 weights.

no_none_elements

Check if all elements in the list are not None.

clear_inputs(node)

Clear all inputs for a node or tensor in ONNX.

Parameters:

node (Node | Tensor)

clear_outputs(node)

Clear all outputs for a node or tensor in ONNX.

Parameters:

node (Node | Tensor)
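
The behavior implied by these two helpers can be sketched with a stand-in Node class instead of the real onnx-graphsurgeon Node/Tensor types (the gs classes likewise expose mutable inputs/outputs lists); the class and function bodies here are illustrative assumptions, not the actual implementation:

```python
# Minimal sketch of clear_inputs/clear_outputs using a hypothetical
# stand-in Node class rather than the real gs.Node/gs.Tensor types.

class Node:
    """Stand-in with gs-like mutable inputs/outputs lists."""
    def __init__(self, name, inputs=None, outputs=None):
        self.name = name
        self.inputs = list(inputs or [])
        self.outputs = list(outputs or [])

def clear_inputs(node):
    # Detach the node from its producers by emptying its input list.
    node.inputs.clear()
    return node

def clear_outputs(node):
    # Detach the node from its consumers by emptying its output list.
    node.outputs.clear()
    return node

relu = Node("relu", inputs=["conv_out"], outputs=["relu_out"])
clear_inputs(relu)
clear_outputs(relu)
print(relu.inputs, relu.outputs)  # -> [] []
```

Clearing both lists fully detaches a node, which is a common first step before deleting it from a graph during surgery.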

extract_layer_id(name)

Extract the layer id from an ONNX layer name.

Parameters:

name (str) – The name of the ONNX layer, e.g. /model/layer.0/q_proj/…

Returns:

The layer id as an int. For the example above, it returns 0.
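
A minimal sketch of such an extraction, assuming a regex over names of the /model/layer.N/… form (the pattern and the example suffixes like MatMul are assumptions, not the actual modelopt implementation):

```python
import re

def extract_layer_id(name: str) -> int:
    """Sketch: pull the decimal layer index out of names like
    '/model/layer.0/q_proj/MatMul' (regex is an assumption)."""
    match = re.search(r"layers?\.(\d+)", name)
    if match is None:
        raise ValueError(f"no layer id found in {name!r}")
    return int(match.group(1))

print(extract_layer_id("/model/layer.0/q_proj/MatMul"))    # -> 0
print(extract_layer_id("/model/layers.12/k_proj/Add"))     # -> 12
```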

fold_fp8_qdq_to_dq(graph)

Convert FP32/FP16 weights of the given ONNX model to FP8 weights.

Even though modelopt supports FP8 ONNX export, the exported weights are stored in FP32 together with Q/DQ node pairs, which makes the model unnecessarily large. This function removes the Q nodes from the weight paths and keeps only the DQ nodes, with the weights converted to FP8 in the output model.

Parameters:

graph (Graph) – The gs.Graph to transform.

Returns:

The gs.Graph with only DQ nodes on weights; the QDQ nodes on activations are left unchanged.
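
The rewrite can be illustrated on a toy graph built from plain dicts rather than gs.Node objects; the real FP8 cast is simulated here with scale-and-round quantization, so everything below is an assumption-laden sketch of the folding idea, not the actual implementation:

```python
import numpy as np

def fold_qdq_to_dq(nodes, tensors):
    """Fold weight -> QuantizeLinear -> DequantizeLinear chains into
    quantized_weight -> DequantizeLinear (toy graph representation)."""
    kept = []
    for node in nodes:
        # Only Q nodes whose first input is a stored constant (a weight)
        # are folded; Q nodes on activations have no constant input.
        if node["op"] == "QuantizeLinear" and node["inputs"][0] in tensors:
            w_name, scale_name = node["inputs"][0], node["inputs"][1]
            q_out = node["outputs"][0]
            # Bake the quantization into the stored tensor
            # (stand-in for the FP32 -> FP8 cast).
            tensors[q_out] = np.round(tensors[w_name] / tensors[scale_name])
            del tensors[w_name]  # drop the FP32 copy
            continue  # the Q node itself is removed from the graph
        kept.append(node)
    return kept, tensors

tensors = {"w": np.array([0.5, -1.0, 2.0], dtype=np.float32),
           "w_scale": np.float32(0.5)}
nodes = [
    {"op": "QuantizeLinear", "inputs": ["w", "w_scale"], "outputs": ["w_q"]},
    {"op": "DequantizeLinear", "inputs": ["w_q", "w_scale"], "outputs": ["w_dq"]},
    {"op": "MatMul", "inputs": ["x", "w_dq"], "outputs": ["y"]},
]
nodes, tensors = fold_qdq_to_dq(nodes, tensors)
print([n["op"] for n in nodes])  # -> ['DequantizeLinear', 'MatMul']
```

After folding, the DQ node reads the pre-quantized weight directly, so the model stores one narrow copy of each weight instead of a wide FP32 copy plus a Q node.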

no_none_elements(elements)

Check if all elements in the list are not None.

Parameters:

elements (list)
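
A one-line sketch of this check (the body is an assumption consistent with the description):

```python
def no_none_elements(elements):
    """Sketch: True iff no element of the list is None."""
    return all(e is not None for e in elements)

print(no_none_elements([1, "a", 0]))   # -> True  (0 is falsy but not None)
print(no_none_elements([1, None, 3]))  # -> False
```

The explicit `is not None` test matters: `all(elements)` would wrongly reject falsy-but-valid values such as 0 or "".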