Factory Stage
Factory transforms create or wrap the starting model object for AutoDeploy. This stage establishes the module that later graph, weight-loading, cache, and runtime transforms will optimize.
Build Model
Transform key: build_model
Source module: tensorrt_llm._torch.auto_deploy.transform.library.build_model
Configured modes: graph
- class tensorrt_llm._torch.auto_deploy.transform.library.build_model.BuildModel(
- config: TransformConfig,
Bases: BaseTransform
A simple wrapper transform to build a model via the model factory's build_model method.
This transform builds the model via the build_model method of the model factory on the meta device (or the configured device) and does not load the weights.
- classmethod get_config_class() -> Type[TransformConfig]
Get the configuration class for the transform.
This is used to validate the configuration of the transform.
YAML configuration
The fields below can be set under this transform’s entry in the AutoDeploy config YAML.
- pydantic model tensorrt_llm._torch.auto_deploy.transform.library.build_model.BuildModelConfig
Bases: TransformConfig
Configuration for the build model transform.
JSON schema:
{
  "title": "BuildModelConfig",
  "description": "Configuration for the build model transform.",
  "type": "object",
  "properties": {
    "stage": {
      "$ref": "#/$defs/Stages",
      "description": "The stage of the transformation pipeline where this transform should run."
    },
    "run_per_gm": {
      "default": true,
      "description": "Whether to run the transform per graph (sub)module or on whole module.",
      "title": "Run Per Gm",
      "type": "boolean"
    },
    "enabled": {
      "default": true,
      "description": "Whether to enable this transform.",
      "title": "Enabled",
      "type": "boolean"
    },
    "skip_on_error": {
      "default": false,
      "description": "Whether to skip the transform if an error occurs.",
      "title": "Skip On Error",
      "type": "boolean"
    },
    "run_graph_cleanup": {
      "default": true,
      "description": "Whether to run graph cleanup/canonicalization after this transform.",
      "title": "Run Graph Cleanup",
      "type": "boolean"
    },
    "run_shape_prop": {
      "default": false,
      "description": "Whether to run shape propagation after this transform.",
      "title": "Run Shape Prop",
      "type": "boolean"
    },
    "requires_clean_graph": {
      "default": true,
      "description": "Whether this transform requires the graph to be clean before it is applied.",
      "title": "Requires Clean Graph",
      "type": "boolean"
    },
    "requires_shape_prop": {
      "default": false,
      "description": "Whether this transform requires shape propagation before it is applied.",
      "title": "Requires Shape Prop",
      "type": "boolean"
    },
    "debug_visualize_dir": {
      "anyOf": [ { "type": "string" }, { "type": "null" } ],
      "default": null,
      "description": "Debug visualization directory. None to disable visualization, or a path string to specify the output directory.",
      "title": "Debug Visualize Dir"
    },
    "expect_mem_change": {
      "default": false,
      "description": "Whether this transform is expected to cause changes in CUDA memory stats.",
      "title": "Expect Mem Change",
      "type": "boolean"
    },
    "device": {
      "default": "meta",
      "description": "The device to build the model on.",
      "title": "Device",
      "type": "string"
    }
  },
  "$defs": {
    "Stages": {
      "description": "Enumerated (ordered!) stages of the transformation pipeline.\n\nThis is used to classify and pre-order transforms.",
      "enum": [
        "factory",
        "export",
        "post_export",
        "pattern_matcher",
        "sharding",
        "weight_load",
        "post_load_fusion",
        "cache_init",
        "visualize",
        "compile"
      ],
      "title": "Stages",
      "type": "string"
    }
  },
  "additionalProperties": true,
  "required": [ "stage" ]
}
- Config:
extra: str = allow
- Fields:
device (str)
- field device: str = 'meta'
The device to build the model on.
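As a rough illustration of how an entry for this transform resolves against the schema defaults, the sketch below overlays a parsed entry (shown as a Python dict) on the default values listed in the JSON schema. The helper name `resolve_build_model_entry` and the example value `"cuda"` are assumptions made for illustration; the real validation is done by the pydantic model, not by this code.

```python
from typing import Any, Dict

# Defaults copied from the BuildModelConfig JSON schema; `stage` has no default.
BUILD_MODEL_DEFAULTS: Dict[str, Any] = {
    "run_per_gm": True,
    "enabled": True,
    "skip_on_error": False,
    "run_graph_cleanup": True,
    "run_shape_prop": False,
    "requires_clean_graph": True,
    "requires_shape_prop": False,
    "debug_visualize_dir": None,
    "expect_mem_change": False,
    "device": "meta",
}


def resolve_build_model_entry(entry: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: overlay a parsed YAML entry on the schema defaults."""
    if "stage" not in entry:
        # The schema lists `stage` as the only required field.
        raise ValueError("'stage' is required by the schema")
    # Unknown keys are kept, mirroring `additionalProperties: true` / `extra = allow`.
    return {**BUILD_MODEL_DEFAULTS, **entry}


# Dict form of what a `build_model` entry might contain after YAML parsing.
# The device value "cuda" is an illustrative assumption.
resolved = resolve_build_model_entry({"stage": "factory", "device": "cuda"})
print(resolved["device"], resolved["enabled"])
```

Only the keys that differ from the defaults need to appear in the entry; everything else falls back to the values documented in the schema.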
Build And Load Factory Model
Transform key: build_and_load_factory_model
Source module: tensorrt_llm._torch.auto_deploy.transform.library.build_model
Configured modes: transformers
- class tensorrt_llm._torch.auto_deploy.transform.library.build_model.BuildAndLoadFactoryModel(
- config: TransformConfig,
Bases: BuildModel
A simple wrapper transform to build AND load a model via the factory’s build_and_load API.
Under the hood, the factory may build and load the model in a single step instead of only building it. For example, the HF factory uses the .from_pretrained API to build and load the model directly.
We also assume that the build_and_load_model method will auto-shard the model appropriately.
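The difference between building only and building plus loading can be pictured with a toy factory. This is a minimal sketch using plain Python objects: `ToyModel` and `ToyFactory` are hypothetical stand-ins, the method names follow the ones mentioned in the text, and nothing here reflects the real factory's signatures or its sharding behavior.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToyModel:
    """Stand-in for a model: parameter shapes plus optionally loaded values."""

    device: str
    param_shapes: Dict[str, Tuple[int, ...]]
    weights: Dict[str, List[float]] = field(default_factory=dict)

    @property
    def weights_loaded(self) -> bool:
        # Loaded means every declared parameter has values attached.
        return set(self.weights) == set(self.param_shapes)


class ToyFactory:
    """Hypothetical factory exposing the two entry points named in the text."""

    def __init__(self, checkpoint: Dict[str, List[float]]):
        self._checkpoint = checkpoint

    def build_model(self, device: str = "meta") -> ToyModel:
        # Structure only: shapes are known, no weight values are materialized.
        shapes = {name: (len(vals),) for name, vals in self._checkpoint.items()}
        return ToyModel(device=device, param_shapes=shapes)

    def build_and_load_model(self, device: str) -> ToyModel:
        # Build and load in one call, so the caller never sees an unloaded model.
        model = self.build_model(device)
        model.weights = {name: list(vals) for name, vals in self._checkpoint.items()}
        return model


factory = ToyFactory({"linear.weight": [0.1, 0.2], "linear.bias": [0.0]})
built = factory.build_model()                   # like build_model
loaded = factory.build_and_load_model("cpu")    # like build_and_load_factory_model
print(built.device, built.weights_loaded)
print(loaded.device, loaded.weights_loaded)
```

In the toy version, the first model carries shapes only and still needs a later weight-loading step, while the second is usable as soon as the call returns.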
YAML configuration
The fields below can be set under this transform’s entry in the AutoDeploy config YAML.
- pydantic model tensorrt_llm._torch.auto_deploy.transform.library.build_model.BuildModelConfig
Bases: TransformConfig
Configuration for the build model transform.
- Config:
extra: str = allow
- Fields:
debug_visualize_dir (Optional[str])
device (str)
enabled (bool)
expect_mem_change (bool)
requires_clean_graph (bool)
requires_shape_prop (bool)
run_graph_cleanup (bool)
run_per_gm (bool)
run_shape_prop (bool)
skip_on_error (bool)
stage (Stages)
- field debug_visualize_dir: str | None = None
Debug visualization directory. None to disable visualization, or a path string to specify the output directory.
- field device: str = 'meta'
The device to build the model on.
- field enabled: bool = True
Whether to enable this transform.
- field expect_mem_change: bool = False
Whether this transform is expected to cause changes in CUDA memory stats.
- field requires_clean_graph: bool = True
Whether this transform requires the graph to be clean before it is applied.
- field requires_shape_prop: bool = False
Whether this transform requires shape propagation before it is applied.
- field run_graph_cleanup: bool = True
Whether to run graph cleanup/canonicalization after this transform.
- field run_per_gm: bool = True
Whether to run the transform per graph (sub)module or on whole module.
- field run_shape_prop: bool = False
Whether to run shape propagation after this transform.
- field skip_on_error: bool = False
Whether to skip the transform if an error occurs.
- field stage: Stages [Required]
The stage of the transformation pipeline where this transform should run.
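Because the schema describes Stages as an ordered enumeration used to pre-order transforms, the effect of the `stage` field can be sketched as a sort by stage rank. The stage names below are taken from the schema; the transform keys other than `build_model` and the helper `order_by_stage` are hypothetical, and the real pipeline may order transforms differently within a stage.

```python
from typing import Dict, List

# Stage names in the order given by the Stages enum in the JSON schema.
STAGES: List[str] = [
    "factory",
    "export",
    "post_export",
    "pattern_matcher",
    "sharding",
    "weight_load",
    "post_load_fusion",
    "cache_init",
    "visualize",
    "compile",
]
STAGE_RANK: Dict[str, int] = {name: rank for rank, name in enumerate(STAGES)}


def order_by_stage(transforms: Dict[str, Dict[str, str]]) -> List[str]:
    """Hypothetical helper: return transform keys sorted by their stage rank.

    `sorted` is stable, so transforms in the same stage keep their listed order.
    """
    return sorted(transforms, key=lambda key: STAGE_RANK[transforms[key]["stage"]])


# `my_compile_step` and `my_export_step` are made-up keys for illustration.
configured = {
    "my_compile_step": {"stage": "compile"},
    "my_export_step": {"stage": "export"},
    "build_model": {"stage": "factory"},
}
ordered = order_by_stage(configured)
print(ordered)
```

Under this reading, a factory-stage transform such as `build_model` runs first regardless of where its entry appears in the config.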