Plugin

class tensorrt_llm.plugin.PluginConfig

Bases: object

The config that manages plugin-related options.

There are two option categories:

  • Plugin options (typically named xxx_plugin). These options can be assigned:
    • “float16”/“bfloat16”/“float32”/“int32”, which means the plugin is enabled with the specified precision (some plugins support only a limited set of dtypes; e.g., gemm_swiglu_plugin and low_latency_gemm_swiglu_plugin currently support only fp8);

    • “auto”, which means the plugin is enabled with the precision of the dtype field (the dtype field must match the model dtype, i.e., the one in PretrainedConfig);

    • None, which means the plugin is disabled.

  • Other features. These options can be assigned a boolean:
    • True, which means the feature is enabled;

    • False, which means the feature is disabled.
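The three kinds of plugin option value can be illustrated with a small standalone sketch. It is not taken from tensorrt_llm: the helper resolve_plugin_dtype and the constant ALLOWED_PRECISIONS are hypothetical names, and the real class may validate values differently (for example, some plugins accept fp8).

```python
from typing import Optional

# Hypothetical names for illustration only; not part of tensorrt_llm.
ALLOWED_PRECISIONS = {"float16", "bfloat16", "float32", "int32"}


def resolve_plugin_dtype(option: Optional[str], model_dtype: str) -> Optional[str]:
    """Return the precision a plugin would run with, or None if it is disabled."""
    if option is None:
        # None disables the plugin.
        return None
    if option == "auto":
        # "auto" follows the model dtype.
        return model_dtype
    if option in ALLOWED_PRECISIONS:
        # An explicit precision is used as given.
        return option
    raise ValueError(f"unsupported plugin option: {option!r}")
```

Under these assumptions, resolve_plugin_dtype("auto", "bfloat16") yields the model dtype, while passing None reports the plugin as disabled.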

Note: All field names should be prefixed with “_”; PluginConfigMeta wraps each field as a property, which ensures that a field can only be assigned an allowed value.
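One way a metaclass could wrap “_”-prefixed fields as validating properties is sketched below. This is a generic illustration of the pattern, not the actual PluginConfigMeta implementation; ValidatedMeta, DemoConfig, ALLOWED, and the field names are all hypothetical.

```python
class ValidatedMeta(type):
    """Hypothetical metaclass: exposes each "_name" field as a checked property "name"."""

    def __new__(mcls, cls_name, bases, namespace):
        allowed = namespace.get("ALLOWED", {})
        for private_name, allowed_values in allowed.items():
            public_name = private_name.lstrip("_")
            namespace[public_name] = mcls._checked_property(private_name, allowed_values)
        return super().__new__(mcls, cls_name, bases, namespace)

    @staticmethod
    def _checked_property(private_name, allowed_values):
        def getter(self):
            return getattr(self, private_name)

        def setter(self, value):
            # Reject anything outside the declared set of values.
            if value not in allowed_values:
                raise ValueError(f"{value!r} is not allowed for {private_name}")
            setattr(self, private_name, value)

        return property(getter, setter)


class DemoConfig(metaclass=ValidatedMeta):
    # Maps each private field to the values it may take (illustrative only).
    ALLOWED = {
        "_demo_plugin": ("auto", "float16", "bfloat16", "float32", None),
        "_demo_feature": (True, False),
    }
    _demo_plugin = None
    _demo_feature = False
```

With this sketch, assigning cfg.demo_plugin = "auto" succeeds, whereas cfg.demo_feature = "yes" raises ValueError because the value is outside the allowed set.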

to_legacy_setting()

Legacy setting means that all plugins and features are disabled. This is needed for the legacy build.py script, which will be migrated to the centralized building script tensorrt_llm/commands/build.py.

After the migration is done, this function may or may not be deleted.
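The effect described above, disabling every plugin and feature, can be sketched on a toy config. The class DemoPluginOptions and its fields are hypothetical stand-ins and do not reproduce the real PluginConfig fields or its implementation of to_legacy_setting().

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DemoPluginOptions:
    # Hypothetical fields: one plugin option and one boolean feature.
    demo_plugin: Optional[str] = "auto"
    demo_feature: bool = True

    def to_legacy_setting(self) -> None:
        """Disable everything: plugin options become None, features become False."""
        for f in fields(self):
            current = getattr(self, f.name)
            disabled = False if isinstance(current, bool) else None
            setattr(self, f.name, disabled)
```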