checkpoint_manager
Checkpoint manager for activation hook scoring with periodic saves and resume support.
Classes
ScoringCheckpointManager – Manages checkpointing for activation hook scoring with periodic saves.
- class ScoringCheckpointManager
Bases: object
Manages checkpointing for activation hook scoring with periodic saves.
- __init__(checkpoint_dir, activation_hooks=None, checkpoint_interval=100)
Initialize checkpoint manager.
- Parameters:
checkpoint_dir (str) – Directory to save checkpoints
activation_hooks – Dictionary of activation hooks to manage
checkpoint_interval (int) – Save checkpoint every N batches
- finalize()
Mark scoring as completed.
- load_checkpoint()
Load existing checkpoint if available, including hook states.
- Returns:
Dict with checkpoint info or None if no checkpoint exists
- Return type:
dict[str, Any] | None
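The "dict or None" return contract can be sketched as follows. This is only an illustration: the standalone function load_progress, the file name progress.json, and the last_batch key are assumptions made for the example, not the on-disk format the class actually uses.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional


def load_progress(checkpoint_dir: str) -> Optional[Dict[str, Any]]:
    """Return the saved progress dict, or None when no checkpoint file exists."""
    # "progress.json" is a hypothetical file name chosen for this sketch.
    path = Path(checkpoint_dir) / "progress.json"
    if not path.exists():
        return None
    with path.open() as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    first = load_progress(tmp)  # nothing has been saved yet
    (Path(tmp) / "progress.json").write_text(json.dumps({"last_batch": 199}))
    second = load_progress(tmp)  # a checkpoint file now exists
```

Callers should check for None before reading any keys, since a fresh run has no checkpoint to resume from.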
- load_hook_states(activation_hooks)
Load hook states from checkpoint files.
- Parameters:
activation_hooks – Hook objects to load states into
- Returns:
True if hook states were successfully loaded, False otherwise
- Return type:
bool
- save_checkpoint()
Save current checkpoint to disk (progress info only). Hook states are saved separately in update_progress.
- should_skip_batch(batch_idx)
Check if we should skip this batch (already processed in previous run).
- Parameters:
batch_idx (int) – Index of the batch to check
- Return type:
bool
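One plausible way to decide whether a batch was already processed is to compare its index with the last checkpointed index. The sketch below assumes that rule and an inclusive comparison; the free function and its last_completed_batch argument are hypothetical, and the real method may track progress differently.

```python
def should_skip_batch(batch_idx: int, last_completed_batch: int) -> bool:
    """Skip batches whose index is at or below the last checkpointed one."""
    # Assumption: the checkpoint stores the index of the last fully processed batch.
    return batch_idx <= last_completed_batch


# A previous run checkpointed through batch 2; only batches 3..5 remain.
remaining = [i for i in range(6) if not should_skip_batch(i, last_completed_batch=2)]
```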
- update_progress(batch_idx, total_batches)
Update progress and potentially save checkpoint.
- Parameters:
batch_idx (int) – Current batch index
total_batches (int) – Total number of batches
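The methods above suggest a resume loop of the following shape. The sketch uses a stand-in class, ToyCheckpointManager, that only mirrors the documented method names; its JSON file, state keys, and save rule (every checkpoint_interval batches and on the last batch) are assumptions, and hook-state saving is left out entirely.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List, Optional


class ToyCheckpointManager:
    """Stand-in mirroring the documented method names; not the real class."""

    def __init__(self, checkpoint_dir: str, checkpoint_interval: int = 100):
        self.path = Path(checkpoint_dir) / "progress.json"  # hypothetical file name
        self.checkpoint_interval = checkpoint_interval
        self.state: Dict[str, Any] = {"last_batch": -1, "completed": False}

    def load_checkpoint(self) -> Optional[Dict[str, Any]]:
        if not self.path.exists():
            return None
        self.state = json.loads(self.path.read_text())
        return self.state

    def should_skip_batch(self, batch_idx: int) -> bool:
        return batch_idx <= self.state["last_batch"]

    def update_progress(self, batch_idx: int, total_batches: int) -> None:
        # Assumed rule: save every `checkpoint_interval` batches and on the last batch.
        done = batch_idx + 1
        if done % self.checkpoint_interval == 0 or done == total_batches:
            self.state["last_batch"] = batch_idx
            self.save_checkpoint()

    def save_checkpoint(self) -> None:
        self.path.write_text(json.dumps(self.state))

    def finalize(self) -> None:
        self.state["completed"] = True
        self.save_checkpoint()


def run(checkpoint_dir: str, total: int, stop_after: Optional[int] = None) -> List[int]:
    """Process batches, optionally simulating an interruption after `stop_after`."""
    manager = ToyCheckpointManager(checkpoint_dir, checkpoint_interval=2)
    manager.load_checkpoint()
    processed: List[int] = []
    for batch_idx in range(total):
        if manager.should_skip_batch(batch_idx):
            continue
        if stop_after is not None and batch_idx > stop_after:
            return processed  # simulated crash: finalize() is never reached
        processed.append(batch_idx)  # placeholder for the real scoring step
        manager.update_progress(batch_idx, total)
    manager.finalize()
    return processed


with tempfile.TemporaryDirectory() as tmp:
    first_run = run(tmp, total=6, stop_after=2)
    second_run = run(tmp, total=6)
    final_state = json.loads((Path(tmp) / "progress.json").read_text())
```

In this toy run the first pass is interrupted after batch 2, but the last save happened at batch 1, so the second pass repeats batch 2. Work done since the last checkpoint is redone on resume, which is the usual trade-off when choosing checkpoint_interval.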