Post-Export Stage#

Post-export transforms remove low-level export artifacts and simple no-op graph patterns. This keeps later pattern-matching, sharding, and fusion passes focused on meaningful graph structure.

Cleanup No-op Slice#

Transform key: cleanup_noop_slice

Source module: tensorrt_llm._torch.auto_deploy.transform.library.cleanup_noop_slice

Configured modes: graph

class tensorrt_llm._torch.auto_deploy.transform.library.cleanup_noop_slice.CleanupNoopSlice(config: TransformConfig)

Bases: BaseTransform

Remove no-op slice nodes from the graph.

These nodes arise from a slice operation such as t[:, :5]. The graph IR represents it as t[:][:5], i.e., two slice nodes, the first of which is a no-op. This transform removes such no-op nodes.
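The idea can be sketched on a toy graph. This is an illustration only: the Node class, the INT_MAX sentinel, and the helper names are hypothetical stand-ins and are not part of the library, which operates on the exported graph rather than on these structures. The sketch assumes a slice is a no-op when it starts at 0, has an open end, and uses step 1.

```python
from dataclasses import dataclass, field

# Assumed sentinel for an open-ended slice end (hypothetical, for illustration).
INT_MAX = 2**63 - 1


@dataclass
class Node:
    """Hypothetical stand-in for a graph node; not the library's IR."""
    name: str
    op: str                                      # "input", "slice", or "output"
    inputs: list = field(default_factory=list)   # names of producer nodes
    args: tuple = ()                             # for "slice": (dim, start, end, step)


def is_noop_slice(node: Node) -> bool:
    """A slice is a no-op when it selects the full range with step 1."""
    if node.op != "slice":
        return False
    _dim, start, end, step = node.args
    full_start = start in (0, None)
    full_end = end is None or end >= INT_MAX
    unit_step = step in (1, None)
    return full_start and full_end and unit_step


def remove_noop_slices(nodes: list) -> list:
    """Drop no-op slices and rewire their users to the slice's input.

    Nodes are assumed to be listed in topological order.
    """
    replacement = {}  # removed node name -> name of the node to use instead
    kept = []
    for node in nodes:
        # Resolve inputs first so chains of no-op slices collapse correctly.
        node.inputs = [replacement.get(i, i) for i in node.inputs]
        if is_noop_slice(node):
            replacement[node.name] = node.inputs[0]
        else:
            kept.append(node)
    return kept


# t[:, :5] represented as two slice nodes: t[:] on dim 0, then [:5] on dim 1.
graph = [
    Node("t", "input"),
    Node("s0", "slice", ["t"], (0, 0, INT_MAX, 1)),
    Node("s1", "slice", ["s0"], (1, 0, 5, 1)),
    Node("out", "output", ["s1"]),
]
cleaned = remove_noop_slices(graph)
print([n.name for n in cleaned])  # ['t', 's1', 'out']
print(cleaned[1].inputs)          # ['t']
```

After the pass, the remaining slice reads directly from t, so later pattern matchers see a single meaningful slice node.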

YAML configuration

Uses the common TransformConfig fields documented in Core Transform APIs.

Cleanup No-op Add#

Transform key: cleanup_noop_add

Source module: tensorrt_llm._torch.auto_deploy.transform.library.cleanup_noop_add

Configured modes: graph

class tensorrt_llm._torch.auto_deploy.transform.library.cleanup_noop_add.CleanupNoopAdd(config: TransformConfig)

Bases: BaseTransform

Eliminate add nodes from the graph that are no-ops.

This applies to any node that only adds 0 to its input tensor. Such nodes can be removed safely.

NOTE: this transform has one failure mode. When the op out = tensor + zero_tensor is used in such a way that out is broadcast to the shape of zero_tensor, removing the op leaves out with the wrong shape. This should be a rare case; it can be handled when it comes up, or this transform can be disabled.
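The broadcast failure mode can be made concrete with static shapes. The sketch below is an illustration under stated assumptions, not the transform's implementation: broadcast_shape and add_zero_is_removable are hypothetical helpers, and shapes are treated as plain tuples of integers that follow the usual NumPy/PyTorch broadcasting rules.

```python
def broadcast_shape(a: tuple, b: tuple) -> tuple:
    """Broadcast two static shapes using NumPy/PyTorch-style rules."""
    result = []
    # Align shapes from the trailing dimension, padding the shorter with 1s.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and 1 not in (da, db):
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
        # A size-1 dimension stretches to match the other operand.
        result.append(db if da == 1 else da)
    return tuple(reversed(result))


def add_zero_is_removable(input_shape: tuple, zero_shape: tuple) -> bool:
    """Hypothetical guard: removing `x + zeros` is shape-safe only if the
    result of the add has the same shape as x itself."""
    return broadcast_shape(input_shape, zero_shape) == tuple(input_shape)


# zeros of shape (4,) broadcast into x of shape (2, 4): output stays (2, 4).
print(add_zero_is_removable((2, 4), (4,)))    # True

# x of shape (1, 4) is broadcast up to zeros of shape (3, 4): output is (3, 4),
# so replacing the add with x would change the shape seen by later nodes.
print(add_zero_is_removable((1, 4), (3, 4)))  # False
```

In the second case the add is numerically a no-op but still acts as a shape expansion, which is exactly the situation the note warns about.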

YAML configuration

Uses the common TransformConfig fields documented in Core Transform APIs.

Cleanup Input Constraints#

Transform key: cleanup_input_constraints

Source module: tensorrt_llm._torch.auto_deploy.transform.library.cleanup_input_constraints

Configured modes: graph

class tensorrt_llm._torch.auto_deploy.transform.library.cleanup_input_constraints.CleanupInputConstraints(config: TransformConfig)

Bases: BaseTransform

Clean up the input constraints of the graph.

This transform updates the input constraints of the graph. Specifically, it accounts for flattened sequences: the max constraint is updated to reflect the flattened sequence length.
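To see why the bound needs to change, consider the arithmetic of flattening. The sketch below is an assumption-laden illustration, not the transform's logic: it assumes a batch of shape [batch, seq] is flattened into a single sequence of shape [1, batch * seq], and the function and parameter names are hypothetical.

```python
def flattened_max_tokens(max_batch_size: int, max_seq_len: int) -> int:
    """Upper bound on the token dimension after flattening [batch, seq]
    into [1, batch * seq]. Hypothetical helper for illustration."""
    if max_batch_size < 1 or max_seq_len < 1:
        raise ValueError("bounds must be positive")
    return max_batch_size * max_seq_len


def within_bound(num_tokens: int, max_tokens: int) -> bool:
    """Check a sequence-dimension size against a max constraint."""
    return 1 <= num_tokens <= max_tokens


# Assumed example limits; not defaults taken from the library.
max_batch_size, max_seq_len = 8, 512

# A flattened input carrying 4 sequences of 300 tokens each.
num_tokens = 4 * 300

# Checked against the per-sequence bound, the flattened input is rejected.
print(within_bound(num_tokens, max_seq_len))  # False

# Checked against the flattened bound, the same input is accepted.
flat_max = flattened_max_tokens(max_batch_size, max_seq_len)
print(flat_max)                               # 4096
print(within_bound(num_tokens, flat_max))     # True
```

A max constraint that still describes a single unflattened sequence is too tight once several sequences share one dimension, which is the mismatch this transform is described as correcting.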

YAML configuration

Uses the common TransformConfig fields documented in Core Transform APIs.