loss_mask
Model-specific recovery of the assistant loss mask.
The standard way to build an answer-only loss mask is
apply_chat_template(..., return_assistant_tokens_mask=True), which maps the
{% generation %} template span to tokens via char_to_token – and that is
only available on “fast” tokenizers. Some models ship only a slow/Python tokenizer
and cannot use this path.
This module is a small registry of per-model fallbacks that recover the mask
directly from token ids, keyed by a detect predicate. Data paths consult
get_loss_mask_recovery() and stay free of any single model’s chat-format
details.
Classes
LossMaskRecovery | A model-specific fallback for building the assistant loss mask. |
Functions
get_loss_mask_recovery | Return the first registered recovery whose detect matches tokenizer. |
register_loss_mask_recovery | Register a model-specific loss-mask recovery. |
- class LossMaskRecovery
Bases: object
A model-specific fallback for building the assistant loss mask.
- Parameters:
name – Identifier for the target model family (for logging/debugging).
detect – Returns True if this recovery applies to the given tokenizer.
compute – Maps (tokenizer, input_ids) to a (seq_len,) LongTensor mask aligned to input_ids (1 on tokens that should contribute to the loss, 0 otherwise).
- __init__(name, detect, compute)
- Parameters:
name (str)
detect (Callable[[object], bool])
compute (Callable[[object, Tensor], Tensor])
- Return type:
None
- compute: Callable[[object, Tensor], Tensor]
- detect: Callable[[object], bool]
- name: str
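To illustrate what a compute callable might do, here is a minimal sketch that recovers an assistant mask directly from token ids by scanning for turn markers. The function name mask_from_markers, the marker ids 100 and 101, and the choice to include the end-of-turn token in the loss are all assumptions made for illustration; they do not describe any particular model's chat format. The sketch uses plain lists so it runs without extra dependencies, whereas a real compute would accept (tokenizer, input_ids) and return a (seq_len,) LongTensor, for example via torch.tensor(mask, dtype=torch.long).

```python
from typing import List


def mask_from_markers(input_ids: List[int], start_id: int, end_id: int) -> List[int]:
    """Return 1 for tokens inside an assistant turn, 0 elsewhere.

    start_id opens an assistant turn and is itself masked out;
    end_id closes the turn and is included in the loss (an assumption).
    """
    mask = [0] * len(input_ids)
    in_assistant = False
    for i, tok in enumerate(input_ids):
        if in_assistant:
            mask[i] = 1
            if tok == end_id:
                in_assistant = False  # turn closed; later tokens are masked out
        elif tok == start_id:
            in_assistant = True  # the marker itself keeps mask 0
    return mask


# Hypothetical marker ids; a real recovery would look them up on the tokenizer.
ASSISTANT_START, END_OF_TURN = 100, 101

# A user turn (ending in 101), one assistant turn, then another user turn.
ids = [1, 7, 8, 101, 100, 20, 21, 101, 7, 9]
mask = mask_from_markers(ids, ASSISTANT_START, END_OF_TURN)
```

The mask has the same length as the input ids, which is what the documented alignment requirement asks for.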
- get_loss_mask_recovery(tokenizer)
Return the first registered recovery whose detect matches tokenizer.
- Return type:
LossMaskRecovery | None
- register_loss_mask_recovery(recovery)
Register a model-specific loss-mask recovery.
- Parameters:
recovery (LossMaskRecovery)
- Return type:
None
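The registry behaviour described above (register a recovery, then return the first one whose detect matches) can be sketched as follows. This is a self-contained stand-in, not the module's source: Recovery, register, get, the list-based storage, and the two tokenizer classes are hypothetical names that mirror the documented fields and the first-match rule.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Recovery:
    """Hypothetical stand-in mirroring the documented LossMaskRecovery fields."""
    name: str
    detect: Callable[[object], bool]
    compute: Callable[[object, List[int]], List[int]]


# Assumed storage: a plain list kept in registration order.
_REGISTRY: List[Recovery] = []


def register(recovery: Recovery) -> None:
    """Append a recovery; earlier registrations take priority on lookup."""
    _REGISTRY.append(recovery)


def get(tokenizer: object) -> Optional[Recovery]:
    """Return the first registered recovery whose detect matches, else None."""
    for recovery in _REGISTRY:
        if recovery.detect(tokenizer):
            return recovery
    return None


class SlowTokenizerA:  # hypothetical slow tokenizer with a registered fallback
    pass


class SlowTokenizerB:  # hypothetical tokenizer with no registered fallback
    pass


register(Recovery("model-a",
                  lambda t: isinstance(t, SlowTokenizerA),
                  lambda t, ids: [1] * len(ids)))
register(Recovery("model-a-duplicate",
                  lambda t: isinstance(t, SlowTokenizerA),
                  lambda t, ids: [0] * len(ids)))

match = get(SlowTokenizerA())  # both predicates match; the first registered wins
miss = get(SlowTokenizerB())   # nothing matches, so the caller gets None
```

Because lookup returns None when nothing matches, a data path using this pattern would need to handle that case explicitly, for instance by falling back to the standard template-based mask.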