Model Optimizer

passage

Modules

  • modelopt.torch.puzzletron.sewing_kit.passage.core
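The module above can be loaded by its dotted path once the Model Optimizer package is installed. A minimal sketch, assuming the `nvidia-modelopt` distribution is available in the environment (the module path is taken from this page; no API beyond the path itself is assumed):

```python
import importlib

# Attempt to load the passage core module by its documented dotted path.
# Falls back to None when the package is not installed in this environment.
try:
    core = importlib.import_module(
        "modelopt.torch.puzzletron.sewing_kit.passage.core"
    )
except ImportError:
    core = None
```

Using `importlib.import_module` with the full dotted path avoids hard-coding an `import` statement that would fail at module load time when the package is absent.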


© Copyright 2023-2025, NVIDIA Corporation.
