gpt_oss_pruned_to_mxfp4

Create a HuggingFace checkpoint with MXFP4 MoE weights from the original gpt-oss-120b model.

This script:
1. Copies non-MoE weights from the student model (trained attention, embeddings, etc.)
2. Extracts MoE expert weights from the original gpt-oss-120b in MXFP4 format
3. Deduces expert mappings by comparing weights
4. Outputs a new pruned (heterogeneous) checkpoint with PACKED MXFP4 expert weights
Classes

- Any: Special type indicating an unconstrained type.
- safe_open: Opens a safetensors file lazily and returns tensors as asked.
- tqdm: Decorate an iterable object, returning an iterator which acts exactly like the original iterable, but prints a dynamically updating progressbar every time a value is requested.
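The lazy access that `safe_open` provides rests on a simple file layout: an 8-byte little-endian header size, a JSON header mapping tensor names to dtype, shape, and byte offsets, then the raw tensor data. The sketch below parses that layout directly so only the requested tensor's bytes are decoded; the class name, the trimmed dtype table, and reading from an in-memory `bytes` object are illustrative choices, not the `safetensors` library's API.

```python
import json
import struct

import numpy as np

# Minimal dtype table for the sketch; the real format defines many more codes.
_NP_DTYPES = {"F32": np.float32, "I64": np.int64, "U8": np.uint8}


class LazySafetensors:
    """Parse the safetensors header once; decode tensor bytes only on access."""

    def __init__(self, raw: bytes):
        # First 8 bytes: little-endian uint64 length of the JSON header.
        (hsize,) = struct.unpack_from("<Q", raw, 0)
        self.header = json.loads(raw[8:8 + hsize])
        self.data_start = 8 + hsize
        self.raw = raw

    def keys(self):
        # "__metadata__" is an optional free-form entry, not a tensor.
        return [k for k in self.header if k != "__metadata__"]

    def get_tensor(self, name: str) -> np.ndarray:
        meta = self.header[name]
        start, end = meta["data_offsets"]  # offsets relative to the data section
        buf = self.raw[self.data_start + start:self.data_start + end]
        return np.frombuffer(buf, dtype=_NP_DTYPES[meta["dtype"]]).reshape(meta["shape"])
```

Keeping the header parse separate from tensor decoding is what makes "returns tensors as asked" cheap: opening a multi-gigabyte shard only reads its small JSON header.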
Functions

- Convert the MXFP4 weights, dequantizing them so they are compatible with the forward pass of GPT_OSS.
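MXFP4 groups values into blocks of 32 FP4 (E2M1) elements that share one power-of-two E8M0 scale, so dequantization is a code-book lookup followed by a per-block multiply. A minimal numpy sketch follows; the function name, the low-nibble-first packing order, and the uint8 scale bias of 127 are assumptions about the checkpoint layout, not a confirmed description of it.

```python
import numpy as np

# FP4 (E2M1) code book: codes 0..7 are magnitudes [0, 0.5, 1, 1.5, 2, 3, 4, 6],
# the high bit is the sign, giving 16 representable values.
FP4_VALUES = np.array(
    [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0,
     -0.0, -0.5, -1.0, -1.5, -2.0, -3.0, -4.0, -6.0],
    dtype=np.float32,
)


def dequantize_mxfp4(blocks: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """blocks: uint8 (n_blocks, bytes_per_block), two FP4 codes per byte.
    scales: uint8 (n_blocks,), power-of-two exponents biased by 127 (assumed)."""
    lo = blocks & 0x0F  # first element of each pair (assumed nibble order)
    hi = blocks >> 4    # second element of each pair
    codes = np.stack([lo, hi], axis=-1).reshape(blocks.shape[0], -1)
    values = FP4_VALUES[codes]
    exp = scales.astype(np.int32) - 127  # shared per-block scale exponent
    return values * np.exp2(exp).astype(np.float32)[:, None]
```

For example, the byte `0x21` decodes to codes 1 and 2 (values 0.5 and 1.0), and a scale byte of 128 doubles both.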
- Copy configuration files from the student model and update config.json.
- Copy non-MoE weights from the student model.
- Deduce which original experts match the student experts by comparing weights.
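Deducing the mapping comes down to nearest-neighbor matching on the weight tensors themselves: a pruned student expert should be (numerically) identical to exactly one original expert. A sketch under that assumption, with illustrative names and tolerance:

```python
import numpy as np


def deduce_expert_mapping(student_experts, original_experts, atol=1e-6):
    """Map each student expert index to the original expert whose weights it
    (nearly) equals; raise if no original expert is close enough."""
    mapping = {}
    for s_idx, s_w in enumerate(student_experts):
        # Max absolute difference against every original expert's weights.
        dists = [float(np.abs(s_w - o_w).max()) for o_w in original_experts]
        best = int(np.argmin(dists))
        if dists[best] > atol:
            raise ValueError(f"no original expert matches student expert {s_idx}")
        mapping[s_idx] = best
    return mapping
```

Comparing against all originals and keeping the argmin (rather than the first exact match) makes the check robust to small numerical drift from dequantization round-trips.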
- Load all MoE-related tensors for a layer, potentially from multiple files.
- Load the original model's safetensors index.
- Process a single layer, loading tensors from potentially multiple files.
- Save a dictionary of tensors into raw bytes in safetensors format.
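The byte layout that saving targets is simple enough to sketch with the stdlib: concatenate the tensors' raw bytes, record each tensor's offsets in a JSON header, and prefix the header with its little-endian length. This is a simplified writer with a trimmed dtype table and an assumed little-endian, C-contiguous host, not the `safetensors` library itself.

```python
import json
import struct

import numpy as np

# Minimal numpy-dtype -> safetensors-dtype table for the sketch.
_ST_DTYPES = {np.dtype(np.float32): "F32", np.dtype(np.int64): "I64",
              np.dtype(np.uint8): "U8"}


def save_safetensors(tensors: dict) -> bytes:
    """Serialize {name: np.ndarray} into safetensors bytes:
    8-byte little-endian header size, JSON header, then raw tensor data."""
    header, chunks, offset = {}, [], 0
    for name, arr in tensors.items():
        buf = np.ascontiguousarray(arr).tobytes()  # assumes little-endian host
        header[name] = {"dtype": _ST_DTYPES[arr.dtype], "shape": list(arr.shape),
                        "data_offsets": [offset, offset + len(buf)]}
        chunks.append(buf)
        offset += len(buf)
    hjson = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(hjson)) + hjson + b"".join(chunks)
```

Because offsets are stored per tensor in the header, a reader can later seek straight to any one tensor without touching the rest of the data section.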