surgeon_utils
Utilities to surgeon the ONNX graph after export.
Functions
| Function | Description |
| --- | --- |
| clear_inputs | Clear all inputs for a node or tensor in ONNX. |
| clear_outputs | Clear all outputs for a node or tensor in ONNX. |
| extract_layer_id | Extract the layer id from an ONNX layer name. |
| fold_fp8_qdq_to_dq | Convert FP32/FP16 weights of the given ONNX model to FP8 weights. |
| no_none_elements | Check if all elements in the list are not None. |
- clear_inputs(node)
Clear all inputs for a node or tensor in ONNX; a combined sketch of clear_inputs and clear_outputs follows this list.
- Parameters:
node (Node | Tensor)
- clear_outputs(node)
Clear all outputs for a node or tensor in ONNX.
- Parameters:
node (Node | Tensor)
- extract_layer_id(name)
Extract the layer id from an ONNX layer name (a parsing sketch follows this list).
- Parameters:
name (str) – The name of the ONNX layer, e.g. /model/layer.0/q_proj/…
- Returns:
The layer id as an int; for the example above, it returns 0.
- fold_fp8_qdq_to_dq(graph)
Convert FP32/FP16 weights of the given ONNX model to FP8 weights.
Although ModelOpt supports FP8 ONNX export, the exported weights are stored as FP32 together with Q/DQ node pairs, which makes the model file unnecessarily large. This function removes the Q nodes on the weight paths, converts the weights to FP8, and keeps only the DQ nodes that consume them in the output model. A usage sketch follows this list.
- Parameters:
graph (Graph) – gs.Graph.
- Returns:
The gs.Graph with only DQ nodes for the weights; the QDQ nodes for activations are left unchanged.
- no_none_elements(elements)
Check if all elements in the list are not None (a one-line sketch follows this list).
- Parameters:
elements (list)
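The sketch below shows what clear_inputs and clear_outputs amount to in onnx-graphsurgeon, assuming they simply empty the corresponding edge lists; the `*_sketch` names and the Identity-node example are illustrative only, not part of this module.

```python
import onnx_graphsurgeon as gs

def clear_inputs_sketch(node):
    """Detach every input edge from the given gs.Node or gs.Tensor."""
    node.inputs.clear()   # both classes expose an `inputs` list

def clear_outputs_sketch(node):
    """Detach every output edge from the given gs.Node or gs.Tensor."""
    node.outputs.clear()  # both classes expose an `outputs` list

# Example: fully disconnect a node so that graph.cleanup() can prune it later.
node = gs.Node(op="Identity", inputs=[gs.Variable("x")], outputs=[gs.Variable("y")])
clear_inputs_sketch(node)
clear_outputs_sketch(node)
assert not node.inputs and not node.outputs
```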
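For extract_layer_id, a hypothetical regex-based sketch of the parsing; the actual implementation and its error handling may differ.

```python
import re

def extract_layer_id_sketch(name: str) -> int:
    """Return the integer following the 'layer.' segment of an ONNX layer name."""
    match = re.search(r"layer\.(\d+)", name)
    if match is None:
        raise ValueError(f"no layer id found in {name!r}")  # error handling is a guess
    return int(match.group(1))

assert extract_layer_id_sketch("/model/layer.0/q_proj") == 0
```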
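A usage sketch for fold_fp8_qdq_to_dq. The import path and file names are assumptions; the point is the onnx-graphsurgeon round trip around the call.

```python
import onnx
import onnx_graphsurgeon as gs
# from ...surgeon_utils import fold_fp8_qdq_to_dq  # exact import path depends on the package layout

model = onnx.load("model_fp8_qdq.onnx")  # as exported: FP32 weights plus Q/DQ pairs
graph = gs.import_onnx(model)

graph = fold_fp8_qdq_to_dq(graph)        # weight Q nodes folded away, weights stored as FP8

graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "model_fp8_dq.onnx")
```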
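Finally, no_none_elements presumably reduces to the check below; this is a guess at the semantics, mirroring the builtin all().

```python
def no_none_elements_sketch(elements: list) -> bool:
    """True when the list contains no None entries (an empty list passes)."""
    return all(e is not None for e in elements)

assert no_none_elements_sketch([1, "a", 0]) is True
assert no_none_elements_sketch([1, None, 2]) is False
```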