gpt_oss_pruned_to_mxfp4

Create a HuggingFace checkpoint with MXFP4 MoE weights from the original gpt-oss-120b model.

This script:

1. Copies non-MoE weights from the student model (trained attention, embeddings, etc.)
2. Extracts MoE expert weights from the original gpt-oss-120b in MXFP4 format
3. Deduces expert mappings by comparing weights
4. Outputs a new pruned (heterogeneous) checkpoint with PACKED MXFP4 expert weights
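The four steps can be sketched end to end. This is a minimal, runnable illustration, not the script's actual API: it assumes a checkpoint is a plain dict of tensor name to array, that fused expert tensors carry the expert index as their leading dimension, and that the surviving expert indices are already known.

```python
import numpy as np

EXPERT_KEY = ".mlp.experts."  # marker for fused MoE expert tensors (assumption)

def copy_non_moe_weights(student):
    """Step 1: carry over every tensor that is not an MoE expert tensor."""
    return {k: v for k, v in student.items() if EXPERT_KEY not in k}

def extract_expert_rows(original, keep):
    """Steps 2-3: slice the kept experts out of the original fused expert
    tensors (expert index assumed to be the leading dimension)."""
    return {k: v[keep] for k, v in original.items() if EXPERT_KEY in k}

def build_pruned_checkpoint(student, original, keep):
    """Step 4: merge both parts into one pruned checkpoint."""
    ckpt = copy_non_moe_weights(student)
    ckpt.update(extract_expert_rows(original, keep))
    return ckpt
```

The real script keeps the expert weights packed in MXFP4 rather than as plain float arrays; only the control flow is mirrored here.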

Classes

Any

Special type indicating an unconstrained type.

safe_open

Opens a safetensors file lazily and returns tensors on request.

tqdm

Decorate an iterable object, returning an iterator which acts exactly like the original iterable, but prints a dynamically updating progressbar every time a value is requested.

Functions

convert_moe_packed_tensors

Dequantize the packed MXFP4 weights, making them compatible with the GPT_OSS forward pass.
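In the MXFP4 format, each byte packs two FP4 (E2M1) codes and every 32-value block shares one E8M0 scale stored as a biased exponent. A NumPy sketch of the dequantization (the real function presumably operates on torch tensors, and the nibble order here is an assumption):

```python
import numpy as np

# The 16 FP4 E2M1 code points; the high bit of each 4-bit code is the sign.
FP4_VALUES = np.array(
    [+0.0, +0.5, +1.0, +1.5, +2.0, +3.0, +4.0, +6.0,
     -0.0, -0.5, -1.0, -1.5, -2.0, -3.0, -4.0, -6.0], dtype=np.float32)

def dequant_mxfp4(blocks: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Unpack uint8 `blocks` of shape (..., n_bytes), two FP4 codes per byte,
    and apply the per-block E8M0 scale of shape (...), a biased exponent."""
    lo = FP4_VALUES[blocks & 0x0F]   # low nibble first (assumed order)
    hi = FP4_VALUES[blocks >> 4]     # high nibble second
    out = np.empty(blocks.shape[:-1] + (blocks.shape[-1] * 2,), dtype=np.float32)
    out[..., 0::2] = lo
    out[..., 1::2] = hi
    exp = scales.astype(np.int32) - 127  # E8M0 is a pure power-of-two scale
    return out * np.exp2(exp)[..., None]
```

For example, the byte 0x21 with a scale byte of 127 (exponent 0) decodes to the pair (0.5, 1.0).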

copy_config_files

Copy configuration files from student model and update config.json.
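A sketch of the config-copy step, assuming only that the expert count lives in a `num_local_experts` field of config.json; the exact fields the script patches are an assumption:

```python
import json
import shutil
from pathlib import Path

def copy_config_files(student_dir, out_dir, num_experts):
    """Copy JSON config/tokenizer files from the student model and patch
    the expert count in config.json for the pruned checkpoint."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for f in Path(student_dir).glob("*.json"):
        shutil.copy(f, out / f.name)
    cfg_path = out / "config.json"
    cfg = json.loads(cfg_path.read_text())
    cfg["num_local_experts"] = num_experts  # illustrative field name
    cfg_path.write_text(json.dumps(cfg, indent=2))
```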

copy_non_moe_weights

Copy non-MoE weights from student model.

deduce_experts_for_layer

Deduce which original experts match the student experts by comparing weights.
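Because pruning can reorder experts, the mapping can be recovered by nearest-neighbour matching on the weights. A minimal sketch, assuming both sides are compared in dequantized (float) form with the expert index as the leading dimension:

```python
import numpy as np

def deduce_experts(student_experts: np.ndarray, original_experts: np.ndarray):
    """Return, for each student expert, the index of the original expert
    with the closest weights (smallest L2 distance)."""
    s = student_experts.reshape(len(student_experts), -1)
    o = original_experts.reshape(len(original_experts), -1)
    # Pairwise squared distances, shape [n_student, n_original].
    d2 = ((s[:, None, :] - o[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1).tolist()
```

If the student's experts are exact copies of a subset of the originals, each distance minimum is zero and the mapping is exact.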

load_layer_tensors

Load all MoE-related tensors for a layer, potentially from multiple files.

load_original_index

Load the original model's safetensors index.
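Sharded Hugging Face checkpoints ship a `model.safetensors.index.json` whose `weight_map` maps each tensor name to the shard file that holds it; grouping a layer's MoE tensors by shard lets each file be opened only once. A sketch (the tensor-name pattern is an assumption):

```python
import json
from collections import defaultdict
from pathlib import Path

def load_original_index(model_dir):
    """Read the shard index and return its tensor-name -> shard-file map."""
    path = Path(model_dir) / "model.safetensors.index.json"
    return json.loads(path.read_text())["weight_map"]

def moe_tensors_by_shard(weight_map, layer):
    """Group one layer's MoE tensor names by the shard file holding them."""
    prefix = f"model.layers.{layer}.mlp.experts."  # assumed name pattern
    by_file = defaultdict(list)
    for name, shard in weight_map.items():
        if name.startswith(prefix):
            by_file[shard].append(name)
    return dict(by_file)
```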

main

Run the full conversion pipeline described above.

process_single_layer

Process a single layer, loading its tensors from potentially multiple shard files.

save_file

Saves a dictionary of tensors into raw bytes in safetensors format.