Plugin

class tensorrt_llm.plugin.PluginConfig(bert_attention_plugin: str = 'float16', gpt_attention_plugin: str = 'float16', gemm_plugin: str = None, smooth_quant_gemm_plugin: str = None, identity_plugin: str = None, layernorm_quantization_plugin: str = None, rmsnorm_quantization_plugin: str = None, nccl_plugin: str = 'float16', lookup_plugin: str = None, lora_plugin: str = None, weight_only_groupwise_quant_matmul_plugin: str = None, weight_only_quant_matmul_plugin: str = None, quantize_per_token_plugin: bool = False, quantize_tensor_plugin: bool = False, moe_plugin: str = 'float16', mamba_conv1d_plugin: str = 'float16', context_fmha: bool = True, context_fmha_fp32_acc: bool = False, paged_kv_cache: bool = True, remove_input_padding: bool = True, use_custom_all_reduce: bool = True, multi_block_mode: bool = False, enable_xqa: bool = True, attention_qk_half_accumulation: bool = False, tokens_per_block: int = 128, use_paged_context_fmha: bool = False, use_fp8_context_fmha: bool = False, use_context_fmha_for_generation: bool = False, multiple_profiles: bool = False, paged_state: bool = True, streamingllm: bool = False)

Bases: object
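
A minimal construction sketch (not part of the original documentation), using only keyword arguments that appear in the signature above; how the resulting config is wired into the build flow, for example through a BuildConfig or the trtllm-build command, depends on the TensorRT-LLM version and is not shown here.

    from tensorrt_llm.plugin import PluginConfig

    # Override a few of the defaults listed in the signature above; every
    # keyword used here is taken directly from that signature.
    plugin_config = PluginConfig(
        gpt_attention_plugin="float16",   # precision of the fused GPT attention plugin
        gemm_plugin="float16",            # enable the GEMM plugin in float16
        paged_kv_cache=True,              # manage the KV cache in fixed-size blocks
        remove_input_padding=True,        # pack sequences without padding tokens
        tokens_per_block=128,             # KV-cache block size in tokens
    )

    print(plugin_config.gemm_plugin)  # -> 'float16'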

to_legacy_setting()

Legacy setting means that all of the plugins and features are disabled; this is needed for the legacy build.py script, which will be migrated to the centralized build script tensorrt_llm/commands/build.py.

Once the migration is complete, this function may be removed.
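
A brief usage sketch, assuming the method resets the instance in place (its return value is not documented here); the reset config can then be handed to the legacy build.py flow.

    from tensorrt_llm.plugin import PluginConfig

    plugin_config = PluginConfig()
    # Disable all plugins and features so the config matches what the
    # legacy build.py script expects.
    plugin_config.to_legacy_setting()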