Core Transform APIs
Common Transform Configuration
Most transforms accept these common fields. Stage pages also show transform-specific configuration models when a transform extends this base configuration.
- pydantic model tensorrt_llm._torch.auto_deploy.transform.interface.TransformConfig
Bases: BaseModel
A simple configuration class that can be extended by a transform for configurability.
JSON schema:
{
  "title": "TransformConfig",
  "description": "A simple configuration class that can be extended by a transform for configurability.",
  "type": "object",
  "properties": {
    "stage": { "$ref": "#/$defs/Stages", "description": "The stage of the transformation pipeline where this transform should run." },
    "run_per_gm": { "default": true, "description": "Whether to run the transform per graph (sub)module or on whole module.", "title": "Run Per Gm", "type": "boolean" },
    "enabled": { "default": true, "description": "Whether to enable this transform.", "title": "Enabled", "type": "boolean" },
    "skip_on_error": { "default": false, "description": "Whether to skip the transform if an error occurs.", "title": "Skip On Error", "type": "boolean" },
    "run_graph_cleanup": { "default": true, "description": "Whether to run graph cleanup/canonicalization after this transform.", "title": "Run Graph Cleanup", "type": "boolean" },
    "run_shape_prop": { "default": false, "description": "Whether to run shape propagation after this transform.", "title": "Run Shape Prop", "type": "boolean" },
    "requires_clean_graph": { "default": true, "description": "Whether this transform requires the graph to be clean before it is applied.", "title": "Requires Clean Graph", "type": "boolean" },
    "requires_shape_prop": { "default": false, "description": "Whether this transform requires shape propagation before it is applied.", "title": "Requires Shape Prop", "type": "boolean" },
    "debug_visualize_dir": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "description": "Debug visualization directory. None to disable visualization, or a path string to specify the output directory.", "title": "Debug Visualize Dir" },
    "expect_mem_change": { "default": false, "description": "Whether this transform is expected to cause changes in CUDA memory stats.", "title": "Expect Mem Change", "type": "boolean" }
  },
  "$defs": {
    "Stages": {
      "description": "Enumerated (ordered!) stages of the transformation pipeline.\n\nThis is used to classify and pre-order transforms.",
      "enum": [ "factory", "export", "post_export", "pattern_matcher", "sharding", "weight_load", "post_load_fusion", "cache_init", "visualize", "compile" ],
      "title": "Stages",
      "type": "string"
    }
  },
  "additionalProperties": true,
  "required": [ "stage" ]
}
- Config:
extra: str = allow
- Fields:
- field debug_visualize_dir: str | None = None
Debug visualization directory. None to disable visualization, or a path string to specify the output directory.
- field enabled: bool = True
Whether to enable this transform.
- field expect_mem_change: bool = False
Whether this transform is expected to cause changes in CUDA memory stats.
- field requires_clean_graph: bool = True
Whether this transform requires the graph to be clean before it is applied.
- field requires_shape_prop: bool = False
Whether this transform requires shape propagation before it is applied.
- field run_graph_cleanup: bool = True
Whether to run graph cleanup/canonicalization after this transform.
- field run_per_gm: bool = True
Whether to run the transform per graph (sub)module or on whole module.
- field run_shape_prop: bool = False
Whether to run shape propagation after this transform.
- field skip_on_error: bool = False
Whether to skip the transform if an error occurs.
- field stage: Stages [Required]
The stage of the transformation pipeline where this transform should run.
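As a rough illustration of how these fields combine, the sketch below overlays a user-supplied mapping on the documented defaults. It uses plain dictionaries rather than the real pydantic model. `resolve_transform_config` and `my_custom_knob` are hypothetical names, and the actual validation is done by `TransformConfig` itself.

```python
from typing import Any, Dict

# Defaults copied from the field list above; "stage" has no default.
TRANSFORM_CONFIG_DEFAULTS: Dict[str, Any] = {
    "run_per_gm": True,
    "enabled": True,
    "skip_on_error": False,
    "run_graph_cleanup": True,
    "run_shape_prop": False,
    "requires_clean_graph": True,
    "requires_shape_prop": False,
    "debug_visualize_dir": None,
    "expect_mem_change": False,
}


def resolve_transform_config(user_config: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: overlay user settings on the documented defaults."""
    if "stage" not in user_config:
        raise ValueError("'stage' is required")
    # Unknown keys are kept, mirroring the documented `extra = allow` setting.
    return {**TRANSFORM_CONFIG_DEFAULTS, **user_config}


resolved = resolve_transform_config(
    {"stage": "post_export", "skip_on_error": True, "my_custom_knob": 4}
)
```

Only `stage` has to be given. Every other field falls back to its default, and extra keys pass through, which is the hook a transform-specific configuration model can build on.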
Transform Interface
The interface for all transforms.
This module defines the base classes and interfaces for all transforms.
- class tensorrt_llm._torch.auto_deploy.transform.interface.MemStats(tot: float, free: float, resv: float, alloc: float, frag: float)
Bases: object
Memory statistics snapshot for tracking CUDA memory usage.
- tot: float
- free: float
- resv: float
- alloc: float
- frag: float
- exception tensorrt_llm._torch.auto_deploy.transform.interface.TransformError
Bases: Exception
An exception raised when a transform fails.
- class tensorrt_llm._torch.auto_deploy.transform.interface.Stages(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)
Bases: Enum
Enumerated (ordered!) stages of the transformation pipeline.
This is used to classify and pre-order transforms.
- FACTORY = 'factory'
- EXPORT = 'export'
- POST_EXPORT = 'post_export'
- PATTERN_MATCHER = 'pattern_matcher'
- SHARDING = 'sharding'
- WEIGHT_LOAD = 'weight_load'
- POST_LOAD_FUSION = 'post_load_fusion'
- CACHE_INIT = 'cache_init'
- VISUALIZE = 'visualize'
- COMPILE = 'compile'
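Because the stages are ordered, a set of transforms can be pre-sorted by the position of each transform's stage in the enum. The following is a minimal sketch of that idea using a stand-in enum with the same members. The transform names and the `pre_order` helper are hypothetical, and the library's own ordering logic may differ.

```python
from enum import Enum
from typing import Dict, List


class Stages(Enum):
    """Stand-in mirroring the documented members, in the documented order."""

    FACTORY = "factory"
    EXPORT = "export"
    POST_EXPORT = "post_export"
    PATTERN_MATCHER = "pattern_matcher"
    SHARDING = "sharding"
    WEIGHT_LOAD = "weight_load"
    POST_LOAD_FUSION = "post_load_fusion"
    CACHE_INIT = "cache_init"
    VISUALIZE = "visualize"
    COMPILE = "compile"


# Enum iteration follows definition order, which gives each stage a rank.
STAGE_RANK = {stage: rank for rank, stage in enumerate(Stages)}


def pre_order(transform_stages: Dict[str, str]) -> List[str]:
    """Sort transform names by stage rank; ties keep their insertion order."""
    return sorted(
        transform_stages,
        key=lambda name: STAGE_RANK[Stages(transform_stages[name])],
    )


ordered = pre_order(
    {
        "compile_model": "compile",
        "export_to_gm": "export",
        "shard_weights": "sharding",
        "cleanup": "post_export",
    }
)
```

Looking a stage up by value, as in `Stages("export")`, also rejects unknown stage strings with a `ValueError`.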
- pydantic model tensorrt_llm._torch.auto_deploy.transform.interface.SharedConfig
Bases: BaseModel
Global config shared between multiple transforms in the inference optimizer.
JSON schema:
{
  "title": "SharedConfig",
  "description": "Global config shared between multiple transforms in the inference optimizer.",
  "type": "object",
  "properties": {
    "local_rank": { "default": 0, "title": "Local Rank", "type": "integer" },
    "world_size": { "default": 1, "title": "World Size", "type": "integer" },
    "dist_config": { "anyOf": [ { "$ref": "#/$defs/DistConfig" }, { "type": "null" } ], "default": null }
  },
  "$defs": {
    "DistConfig": {
      "additionalProperties": true,
      "description": "Distributed parallelism configuration for AutoDeploy.",
      "properties": {
        "world_size": { "default": 1, "minimum": 1, "title": "World Size", "type": "integer" },
        "rank": { "default": 0, "minimum": 0, "title": "Rank", "type": "integer" },
        "tp_size": { "default": 1, "minimum": 1, "title": "Tp Size", "type": "integer" },
        "pp_size": { "default": 1, "minimum": 1, "title": "Pp Size", "type": "integer" },
        "moe_tp_size": { "default": 1, "minimum": 1, "title": "Moe Tp Size", "type": "integer" },
        "moe_ep_size": { "default": 1, "minimum": 1, "title": "Moe Ep Size", "type": "integer" },
        "moe_cluster_size": { "default": 1, "minimum": 1, "title": "Moe Cluster Size", "type": "integer" },
        "enable_attention_dp": { "default": false, "title": "Enable Attention Dp", "type": "boolean" },
        "allreduce_strategy": { "default": "NCCL", "title": "Allreduce Strategy", "type": "string" }
      },
      "title": "DistConfig",
      "type": "object"
    }
  },
  "additionalProperties": true
}
- Config:
extra: str = allow
arbitrary_types_allowed: bool = True
- Fields:
- field dist_config: DistConfig | None = None
- field local_rank: int = 0
- field world_size: int = 1
- pydantic model tensorrt_llm._torch.auto_deploy.transform.interface.TransformConfig
Bases: BaseModel
A simple configuration class that can be extended by a transform for configurability. The JSON schema and fields are listed under Common Transform Configuration above.
- pydantic model tensorrt_llm._torch.auto_deploy.transform.interface.TransformInfo
Bases: BaseModel
Information about the result of a transform.
JSON schema:
{
  "title": "TransformInfo",
  "description": "Information about the result of a transform.",
  "type": "object",
  "properties": {
    "skipped": { "default": true, "description": "Whether the transform was skipped.", "title": "Skipped", "type": "boolean" },
    "num_matches": { "default": 0, "description": "Number of matches found.", "title": "Num Matches", "type": "integer" },
    "is_clean": { "default": false, "description": "Whether the graph is clean after the transform. This can be set by the transform to indicate that the transform does not change the graph and it preserves the is_clean flag of the last transform.", "title": "Is Clean", "type": "boolean" },
    "has_valid_shapes": { "default": false, "description": "Whether meta tensor shapes are valid after the transform. This can be set by the transform to indicate that the transform does not affect the shapes in the meta information of the graph. In other words, the transform does not change the shapes of the tensors in the graph and it preserves the has_valid_shapes flag of the last transform.", "title": "Has Valid Shapes", "type": "boolean" }
  }
}
- Config:
frozen: bool = True
- Fields:
- field has_valid_shapes: bool = False
Whether meta tensor shapes are valid after the transform. This can be set by the transform to indicate that the transform does not affect the shapes in the meta information of the graph. In other words, the transform does not change the shapes of the tensors in the graph and it preserves the has_valid_shapes flag of the last transform.
- field is_clean: bool = False
Whether the graph is clean after the transform. This can be set by the transform to indicate that the transform does not change the graph and it preserves the is_clean flag of the last transform.
- field num_matches: int = 0
Number of matches found.
- field skipped: bool = True
Whether the transform was skipped.
- classmethod from_last_info(info: TransformInfo)
Create a new TransformInfo from the last transform info.
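Since the model is frozen, a result cannot be edited in place. A transform that wants to carry `is_clean` or `has_valid_shapes` forward has to build a new object from the previous one. The sketch below shows that pattern with a frozen dataclass as a stand-in for the pydantic model. `InfoSketch` is a hypothetical name, and the snippet does not claim to reproduce what `from_last_info` does internally.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class InfoSketch:
    """Stand-in with the documented fields and defaults of TransformInfo."""

    skipped: bool = True
    num_matches: int = 0
    is_clean: bool = False
    has_valid_shapes: bool = False


last = InfoSketch(skipped=False, num_matches=3, is_clean=True, has_valid_shapes=True)

# In-place mutation is rejected on a frozen object.
try:
    last.num_matches = 5  # type: ignore[misc]
    mutated = True
except FrozenInstanceError:
    mutated = False

# A transform that ran but left the graph untouched can copy the previous
# flags and override only the fields that describe its own run.
carried = replace(last, skipped=False, num_matches=0)
```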
- tensorrt_llm._torch.auto_deploy.transform.interface.with_transform_logging(call_fn: Callable) -> Callable
Decorator to prepend transform-specific prefix to all ad_logger logs during __call__.
Temporarily patches ad_logger.log so that any logs emitted within the call automatically include the [stage=…, transform=…] prefix that _log_info would otherwise add manually. The original logger behavior is restored after the call, even if an exception occurs.
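The patch-and-restore behavior described here can be sketched with a stand-in logger. `LoggerSketch`, `with_prefix_logging`, and `DemoTransform` are hypothetical names introduced for illustration, and the attributes used to build the prefix are assumptions rather than the library's actual ones.

```python
import functools
from typing import Callable, List


class LoggerSketch:
    """Stand-in for ad_logger that records messages instead of printing."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    def log(self, msg: str) -> None:
        self.lines.append(msg)


logger = LoggerSketch()


def with_prefix_logging(call_fn: Callable) -> Callable:
    """Prefix every logger.log message emitted while call_fn runs."""

    @functools.wraps(call_fn)
    def wrapper(self, *args, **kwargs):
        original_log = logger.log
        prefix = f"[stage={self.stage}, transform={self.key}]"

        def patched_log(msg: str) -> None:
            original_log(f"{prefix} {msg}")

        logger.log = patched_log  # temporary patch for the duration of the call
        try:
            return call_fn(self, *args, **kwargs)
        finally:
            logger.log = original_log  # restored even if call_fn raises

    return wrapper


class DemoTransform:
    stage = "export"
    key = "demo"

    @with_prefix_logging
    def __call__(self, fail: bool = False) -> str:
        logger.log("running")
        if fail:
            raise RuntimeError("boom")
        return "done"


result = DemoTransform()()
logger.log("outside the call")
```

Putting the restore in a `finally` block is what guarantees the original logger comes back after an exception.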
- class tensorrt_llm._torch.auto_deploy.transform.interface.BaseTransform(config: TransformConfig)
Bases: ABC
A base class for all transforms.
- classmethod get_transform_key() -> str
Get the short name of the transform.
This is used to identify the transform in the transformation pipeline.
- classmethod get_config_class() -> Type[TransformConfig]
Get the configuration class for the transform.
This is used to validate the configuration of the transform.
- config: TransformConfig
- final classmethod from_kwargs(**kwargs)
Create a transform from kwargs.
- Parameters:
**kwargs – The configuration for the transform.
- Returns:
The transform instance.
- class tensorrt_llm._torch.auto_deploy.transform.interface.TransformRegistry
Bases: object
A registry for all transforms.
- classmethod register(name: str)
- classmethod get(name: str)
Get the transform class by name.
- classmethod get_config_class(name: str)
Get the configuration class for a transform by name.
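A name-keyed registry of this shape is commonly built as a class-level dictionary with a decorator-style `register`. The sketch below illustrates the pattern with stand-in classes. It assumes `register` is used as a class decorator, which the signature above does not confirm, and every `*Sketch` name is hypothetical.

```python
from typing import Callable, Dict, Type


class ConfigSketch:
    """Stand-in for a transform configuration class."""

    def __init__(self, **kwargs) -> None:
        self.options = dict(kwargs)


class TransformSketch:
    """Stand-in for a transform base class."""

    config_class = ConfigSketch

    def __init__(self, config: ConfigSketch) -> None:
        self.config = config


class RegistrySketch:
    """Maps short transform names to transform classes."""

    _registry: Dict[str, Type[TransformSketch]] = {}

    @classmethod
    def register(cls, name: str) -> Callable[[Type[TransformSketch]], Type[TransformSketch]]:
        def inner(transform_cls: Type[TransformSketch]) -> Type[TransformSketch]:
            cls._registry[name] = transform_cls
            return transform_cls  # return the class so the decorator is transparent

        return inner

    @classmethod
    def get(cls, name: str) -> Type[TransformSketch]:
        return cls._registry[name]

    @classmethod
    def get_config_class(cls, name: str) -> Type[ConfigSketch]:
        return cls.get(name).config_class


@RegistrySketch.register("noop")
class NoopTransform(TransformSketch):
    pass


transform_cls = RegistrySketch.get("noop")
config_cls = RegistrySketch.get_config_class("noop")
transform = transform_cls(config_cls(stage="export"))
```

Looking up the configuration class by name first lets a caller validate raw settings before any transform is constructed.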
Optimizer
High-level entrypoint to transform a model into an efficient inference model.
- class tensorrt_llm._torch.auto_deploy.transform.optimizer.InferenceOptimizer(factory: ModelFactory, config: Mapping[str, TransformConfig | Dict[str, Any]], dist_config: DistConfig | None = None)
Bases: object
Graph Module Visualizer
PyTorch GraphModule Visualization Tool
This module converts a PyTorch GraphModule into a Graphviz diagram. It supports different node styles and detailed graph annotations.
Key Features:
- Convert FX GraphModule to Graphviz diagrams
- Display tensor shape information on edges
- Adjust edge width based on tensor element count
- Intelligent port assignment for multi-input/output handling
- Color coding based on tensor identity
- Usage Example:
    import torch
    import torch.fx as fx
    from tensorrt_llm._torch.auto_deploy.transform.graph_module_visualizer import to_dot

    # Trace the model (YourModel is a placeholder for your own nn.Module)
    model = YourModel()
    traced = fx.symbolic_trace(model)

    # Generate the visualization; name and save_path are required by the to_dot signature
    dot = to_dot(traced, name="model", save_path="model", format="svg", include_shapes=True)
Requirements: pip install graphviz
- tensorrt_llm._torch.auto_deploy.transform.graph_module_visualizer.to_dot(graph_module: GraphModule, name: str, save_path: str, format: str = 'svg', include_shapes: bool = True)
Convert a PyTorch GraphModule to a Graphviz diagram.
- Parameters:
graph_module – GraphModule to visualize
name – Name of the diagram
save_path – Save path; if None, name is used
format – Output format ('png', 'pdf', 'svg', 'dot', etc.)
include_shapes – Whether to include tensor shape information
- Returns:
graphviz.Digraph object
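The module description mentions scaling edge width by tensor element count. One plausible way to do this, shown purely as a sketch, is to map the element count onto the Graphviz `penwidth` attribute with a logarithm. The log scale, the width bounds, and the helper names are assumptions, not the mapping this module implements.

```python
import math
from typing import Sequence


def numel(shape: Sequence[int]) -> int:
    """Number of elements in a tensor of the given shape (1 for a scalar)."""
    return math.prod(shape)


def edge_penwidth(shape: Sequence[int], min_width: float = 1.0, max_width: float = 8.0) -> float:
    """Assumed mapping: grow the width with log10 of the element count, capped."""
    width = min_width + math.log10(max(numel(shape), 1))
    return min(width, max_width)


def edge_line(src: str, dst: str, shape: Sequence[int]) -> str:
    """Emit one DOT edge statement labeled with the tensor shape."""
    label = "x".join(str(dim) for dim in shape)
    return f'"{src}" -> "{dst}" [label="{label}", penwidth={edge_penwidth(shape):.2f}];'


line = edge_line("linear", "relu", (2, 50))
```

A logarithmic scale keeps very large activations from dominating the drawing while still separating small tensors from large ones.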