loss_mask

Model-specific recovery of the assistant loss mask.

The standard way to build an answer-only loss mask is apply_chat_template(..., return_assistant_tokens_mask=True), which maps each {% generation %} span in the chat template to tokens via char_to_token. That method is available only on “fast” tokenizers, so models that ship only a slow (pure-Python) tokenizer cannot use this path.

This module is a small registry of per-model fallbacks that recover the mask directly from token ids. Each fallback is selected by its detect predicate. Data paths call get_loss_mask_recovery() and so stay free of any single model’s chat-format details.
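The registry pattern described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual implementation: Recovery, register, get_recovery, _REGISTRY and FakeSlowTokenizer are hypothetical stand-ins, the storage is assumed to be a plain list in registration order, and plain Python lists replace tensors to keep the sketch dependency-free.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Recovery:
    """Hypothetical stand-in for LossMaskRecovery (lists instead of tensors)."""
    name: str
    detect: Callable[[object], bool]
    compute: Callable[[object, List[int]], List[int]]


# Assumed storage: a module-level list kept in registration order.
_REGISTRY: List[Recovery] = []


def register(recovery: Recovery) -> None:
    """Append a recovery so later lookups can find it."""
    _REGISTRY.append(recovery)


def get_recovery(tokenizer: object) -> Optional[Recovery]:
    """Return the first recovery whose detect predicate accepts the tokenizer."""
    for recovery in _REGISTRY:
        if recovery.detect(tokenizer):
            return recovery
    return None


class FakeSlowTokenizer:
    """Toy tokenizer used only for this illustration."""
    is_fast = False
    name_or_path = "example/slow-model"


register(Recovery(
    name="example-slow",
    detect=lambda tok: "slow-model" in getattr(tok, "name_or_path", ""),
    # Placeholder compute: marks every token as contributing to the loss.
    compute=lambda tok, ids: [1] * len(ids),
))

found = get_recovery(FakeSlowTokenizer())  # matches the registered entry
missing = get_recovery(object())           # no predicate matches
```

In a lookup like this the first match wins, so the order of registration matters when two detect predicates could accept the same tokenizer. A caller that receives None would be expected to use the standard template-based path instead.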

Classes

LossMaskRecovery

A model-specific fallback for building the assistant loss mask.

Functions

`get_loss_mask_recovery`	Return the first registered recovery whose `detect` matches `tokenizer`.
`register_loss_mask_recovery`	Register a model-specific loss-mask recovery.

class LossMaskRecovery

Bases: object

A model-specific fallback for building the assistant loss mask.

Parameters:

name – Identifier for the target model family (for logging/debugging).
detect – Returns True if this recovery applies to the given tokenizer.
compute – Maps (tokenizer, input_ids) to a (seq_len,) LongTensor mask aligned to input_ids (1 on tokens that should contribute to the loss, 0 otherwise).
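As an illustration of what a compute callable might do, the sketch below recovers the mask by scanning token ids for turn markers. Everything specific here is an assumption: ASSISTANT_START_ID and TURN_END_ID are made-up ids, compute_mask is a hypothetical name, and the choice to train on the closing marker is one possible convention. Plain lists are used to keep the example dependency-free; a real compute would return a (seq_len,) LongTensor.

```python
from typing import List

# Hypothetical special-token ids; a real model family defines its own.
ASSISTANT_START_ID = 101
TURN_END_ID = 102


def compute_mask(tokenizer: object, input_ids: List[int]) -> List[int]:
    """Return 1 for tokens inside assistant turns and 0 elsewhere.

    The tokenizer argument is unused in this toy version; a real recovery
    would typically use it to look up the marker ids.
    """
    mask = [0] * len(input_ids)
    in_assistant = False
    for i, tok in enumerate(input_ids):
        if tok == ASSISTANT_START_ID:
            # The opening marker itself is prompt, so it stays masked out.
            in_assistant = True
        elif in_assistant:
            # Answer tokens contribute to the loss, including the closing
            # marker, so the model learns where to stop.
            mask[i] = 1
            if tok == TURN_END_ID:
                in_assistant = False
    return mask


ids = [7, 8, 102, 101, 20, 21, 102, 9]
mask = compute_mask(None, ids)
# mask == [0, 0, 0, 0, 1, 1, 1, 0]
```

The returned mask has the same length as input_ids, which is the alignment requirement stated for compute.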

__init__(name, detect, compute)

Parameters:

name (str)
detect (Callable[[object], bool])
compute (Callable[[object, Tensor], Tensor])

Return type:

None

compute: Callable[[object, Tensor], Tensor]

detect: Callable[[object], bool]

name: str

get_loss_mask_recovery(tokenizer)

Return the first registered recovery whose detect matches tokenizer.

Return type:

LossMaskRecovery | None

register_loss_mask_recovery(recovery)

Register a model-specific loss-mask recovery.

Parameters:

recovery (LossMaskRecovery)

Return type:

None