Export Stage

Export converts the model into a graph representation that later stages can inspect and rewrite. After this point, transforms operate on graph structure rather than only on the original model object.

Export To Gm

Transform key: export_to_gm

Source module: tensorrt_llm._torch.auto_deploy.transform.library.export_to_gm

Configured modes: graph

class tensorrt_llm._torch.auto_deploy.transform.library.export_to_gm.ExportToGM(config: TransformConfig)

Bases: BaseTransform

A simple wrapper transform to export a model to a graph module.

classmethod get_config_class() → Type[TransformConfig]

Get the configuration class for the transform.

This is used to validate the configuration of the transform.

YAML configuration

The fields below can be set under this transform’s entry in the AutoDeploy config YAML.

pydantic model tensorrt_llm._torch.auto_deploy.transform.library.export_to_gm.ExportToGMConfig

Bases: TransformConfig

Configuration for the export to graph module transform.

JSON schema:

{
   "title": "ExportToGMConfig",
   "description": "Configuration for the export to graph module transform.",
   "type": "object",
   "properties": {
      "stage": {
         "$ref": "#/$defs/Stages",
         "description": "The stage of the transformation pipeline where this transform should run."
      },
      "run_per_gm": {
         "default": true,
         "description": "Whether to run the transform per graph (sub)module or on whole module.",
         "title": "Run Per Gm",
         "type": "boolean"
      },
      "enabled": {
         "default": true,
         "description": "Whether to enable this transform.",
         "title": "Enabled",
         "type": "boolean"
      },
      "skip_on_error": {
         "default": false,
         "description": "Whether to skip the transform if an error occurs.",
         "title": "Skip On Error",
         "type": "boolean"
      },
      "run_graph_cleanup": {
         "default": true,
         "description": "Whether to run graph cleanup/canonicalization after this transform.",
         "title": "Run Graph Cleanup",
         "type": "boolean"
      },
      "run_shape_prop": {
         "default": false,
         "description": "Whether to run shape propagation after this transform.",
         "title": "Run Shape Prop",
         "type": "boolean"
      },
      "requires_clean_graph": {
         "default": true,
         "description": "Whether this transform requires the graph to be clean before it is applied.",
         "title": "Requires Clean Graph",
         "type": "boolean"
      },
      "requires_shape_prop": {
         "default": false,
         "description": "Whether this transform requires shape propagation before it is applied.",
         "title": "Requires Shape Prop",
         "type": "boolean"
      },
      "debug_visualize_dir": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Debug visualization directory. None to disable visualization, or a path string to specify the output directory.",
         "title": "Debug Visualize Dir"
      },
      "expect_mem_change": {
         "default": false,
         "description": "Whether this transform is expected to cause changes in CUDA memory stats.",
         "title": "Expect Mem Change",
         "type": "boolean"
      },
      "strict": {
         "default": false,
         "description": "Whether to export in strict mode. NOTE: we generally export in non-strict mode for now, as it relaxes some assumptions around tracing. Strict mode uses torchdynamo (symbolic bytecode analysis), which can be brittle since it relies on the exact bytecode representation of the model. See also: https://pytorch.org/docs/stable/export.html#non-strict-export",
         "title": "Strict",
         "type": "boolean"
      },
      "clone_state_dict": {
         "default": false,
         "description": "Whether to clone the state_dict of the model. This is useful to avoid modifying the original state_dict of the model.",
         "title": "Clone State Dict",
         "type": "boolean"
      },
      "patch_list": {
         "anyOf": [
            {
               "items": {
                  "type": "string"
               },
               "type": "array"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "List of patch names to apply with export. Default is to apply all registered patches.",
         "title": "Patch List"
      },
      "num_moe_experts_for_export": {
         "anyOf": [
            {
               "type": "integer"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "If set, only this many MOE experts are traced during torch.export, and the graph is expanded to include all experts afterwards. This can dramatically speed up export for large MOE models (e.g. 256 experts). Recommended value: 2.",
         "title": "Num Moe Experts For Export"
      }
   },
   "$defs": {
      "Stages": {
         "description": "Enumerated (ordered!) stages of the transformation pipeline.\n\nThis is used to classify and pre-order transforms.",
         "enum": [
            "factory",
            "export",
            "post_export",
            "pattern_matcher",
            "sharding",
            "weight_load",
            "post_load_fusion",
            "cache_init",
            "visualize",
            "compile"
         ],
         "title": "Stages",
         "type": "string"
      }
   },
   "additionalProperties": true,
   "required": [
      "stage"
   ]
}
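
The schema describes the stages as an ordered enumeration used to pre-order transforms. The following is a minimal sketch of what such an ordering could look like, using only the stage names listed in the schema. `STAGES` and `sort_by_stage` are hypothetical helpers written for illustration, not part of the library, and the entries other than `export_to_gm` are made-up placeholders.

```python
# Stage names copied from the "Stages" enum in the schema, in the listed order.
STAGES = [
    "factory", "export", "post_export", "pattern_matcher", "sharding",
    "weight_load", "post_load_fusion", "cache_init", "visualize", "compile",
]


def sort_by_stage(transforms: dict) -> list:
    """Return transform keys ordered by stage (hypothetical helper).

    Python's sort is stable, so keys that share a stage keep their input order.
    """
    rank = {name: index for index, name in enumerate(STAGES)}
    return sorted(transforms, key=lambda key: rank[transforms[key]["stage"]])


# Placeholder entries; only "export_to_gm" is a transform key from this page.
example = {
    "some_compile_step": {"stage": "compile"},
    "export_to_gm": {"stage": "export"},
    "some_factory_step": {"stage": "factory"},
}
ordered = sort_by_stage(example)
print(ordered)
```

Under this sketch, `export_to_gm` sorts after any `factory` transform and before everything from `post_export` onward, which matches the position of `export` in the enum.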

Config:

extra: str = allow

Fields:

clone_state_dict (bool)
num_moe_experts_for_export (int | None)
patch_list (List[str] | None)
strict (bool)

field clone_state_dict: bool = False: Whether to clone the state_dict of the model. This is useful to avoid modifying the original state_dict of the model.

field num_moe_experts_for_export: int | None = None: If set, only this many MOE experts are traced during torch.export, and the graph is expanded to include all experts afterwards. This can dramatically speed up export for large MOE models (e.g. 256 experts). Recommended value: 2.

field patch_list: List[str] | None = None: List of patch names to apply with export. Default is to apply all registered patches.

field strict: bool = False: Whether to export in strict mode. NOTE: we generally export in non-strict mode for now, as it relaxes some assumptions around tracing. Strict mode uses torchdynamo (symbolic bytecode analysis), which can be brittle since it relies on the exact bytecode representation of the model. See also: https://pytorch.org/docs/stable/export.html#non-strict-export
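
To make the defaults and the `extra = allow` behaviour concrete, here is a rough standard-library sketch that mirrors the four fields listed above plus the required `stage`. It is an illustration only: `ExportToGMConfigSketch` and `from_mapping` are hypothetical names, the real class is a pydantic model, and the key `my_custom_flag` is invented to show an unknown key being kept rather than rejected.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List, Optional


@dataclass
class ExportToGMConfigSketch:
    """Stand-in for the documented fields; NOT the library's pydantic model."""

    stage: str  # required by the schema, so it has no default here
    strict: bool = False
    clone_state_dict: bool = False
    patch_list: Optional[List[str]] = None  # None: apply all registered patches
    num_moe_experts_for_export: Optional[int] = None
    extra: Dict[str, Any] = field(default_factory=dict)  # emulates extra = allow


def from_mapping(entry: Dict[str, Any]) -> ExportToGMConfigSketch:
    """Build the sketch config from a mapping such as a parsed YAML entry."""
    known = {f.name for f in fields(ExportToGMConfigSketch)} - {"extra"}
    kwargs = {k: v for k, v in entry.items() if k in known}
    extra = {k: v for k, v in entry.items() if k not in known}
    return ExportToGMConfigSketch(**kwargs, extra=extra)


# Hypothetical entry: trace only 2 MoE experts, keep everything else at defaults.
cfg = from_mapping(
    {"stage": "export", "num_moe_experts_for_export": 2, "my_custom_flag": True}
)
print(cfg)
```

Omitting `stage` from the mapping raises a `TypeError` in this sketch, which loosely parallels the schema listing `stage` as required.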