unified_export_megatron

Code that exports quantized Megatron Core models for deployment.

Functions

export_mcore_gpt_to_hf

Export a Megatron Core GPTModel to a unified checkpoint and save it to export_dir.

import_mcore_gpt_from_hf

Import a GPTModel state_dict from a supported HuggingFace pretrained model path.

export_mcore_gpt_to_hf(model, pretrained_model_name_or_path=None, export_extra_modules=False, dtype=torch.float16, export_dir='/tmp')

Export a Megatron Core GPTModel to a unified checkpoint and save it to export_dir.

Parameters:
  • model (Module) – The Megatron Core GPTModel instance.

  • pretrained_model_name_or_path (str | PathLike | None) – Either the model id of a pretrained model hosted in a model repo on huggingface.co, or a path to a directory containing model weights saved using PreTrainedModel.save_pretrained, e.g., ./my_model_directory/.

  • export_extra_modules (bool) – If True, export extra modules like medusa_heads, eagle_module, or mtp. Otherwise, only export the base model.

  • dtype (dtype) – The data type in which to export the weights of the unquantized layers.

  • export_dir (Path | str) – The target export path.
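A minimal usage sketch follows. The import path (modelopt.torch.export), the model id, and the build_quantized_gpt_model helper are assumptions for illustration; constructing the Megatron Core GPTModel itself is outside the scope of this module.

    import torch
    from modelopt.torch.export import export_mcore_gpt_to_hf  # assumed import path

    # Assumption: build_quantized_gpt_model() is a user-provided helper that
    # returns an initialized (and possibly quantized) Megatron Core GPTModel.
    model = build_quantized_gpt_model()

    export_mcore_gpt_to_hf(
        model,
        pretrained_model_name_or_path="meta-llama/Llama-3.1-8B",  # example HF repo id or local dir
        export_extra_modules=False,      # base model only; True also exports medusa_heads/eagle_module/mtp
        dtype=torch.float16,             # dtype for the unquantized layers
        export_dir="/tmp/unified_ckpt",  # where the unified checkpoint is written
    )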

import_mcore_gpt_from_hf(model, pretrained_model_path, workspace_dir=None, dtype=torch.float16)

Import a GPTModel state_dict from a supported HuggingFace pretrained model path.

Parameters:
  • model (Module) – The Megatron Core GPTModel instance.

  • pretrained_model_path (str) – A path to a directory containing model weights saved using PreTrainedModel.save_pretrained, e.g., ./my_model_directory/.

  • dtype (dtype) – The data type in which to import the weights.

  • workspace_dir (str | None) – Optional working directory used during the import.
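A minimal usage sketch for the reverse direction, under the same assumptions about the import path; build_gpt_model is a hypothetical helper, and the model architecture must match the checkpoint being imported.

    import torch
    from modelopt.torch.export import import_mcore_gpt_from_hf  # assumed import path

    # Assumption: build_gpt_model() is a user-provided helper returning a
    # Megatron Core GPTModel whose architecture matches the HF checkpoint.
    model = build_gpt_model()

    import_mcore_gpt_from_hf(
        model,
        pretrained_model_path="./my_model_directory/",  # output of save_pretrained()
        workspace_dir=None,   # optional working directory (default)
        dtype=torch.float16,  # dtype used when loading the weights
    )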