unified_export_hf

Code that export quantized Hugging Face models for deployment.

Functions

Exports the torch model to unified checkpoint and saves to export_dir.

export_hf_checkpoint(model, dtype=None, export_dir='/tmp', save_modelopt_state=False)

Exports the torch model to unified checkpoint and saves to export_dir.

Parameters:

model (Module) – the torch model.
dtype (dtype | None) – the weights data type to export the unquantized layers or the default model data type if None.
export_dir (Path | str) – the target export path.
save_modelopt_state (bool) – whether to save the modelopt state_dict.