loss_mask

Model-specific recovery of the assistant loss mask.

The standard way to build an answer-only loss mask is apply_chat_template(..., return_assistant_tokens_mask=True), which maps each {% generation %} span in the chat template to tokens via char_to_token. Because char_to_token is only available on “fast” tokenizers, models that ship only a slow (pure-Python) tokenizer cannot use this path.

This module is a small registry of per-model fallbacks that recover the mask directly from token ids. Each fallback is selected by its detect predicate. Data paths call get_loss_mask_recovery() and stay free of any single model’s chat-format details.
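The registry pattern described above can be sketched in a few lines. This is a minimal illustration, not the module's actual implementation: RecoverySketch, register_sketch, get_sketch, and FakeSlowTokenizer are hypothetical names, and plain Python lists stand in for tensors so the example runs with the standard library alone.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class RecoverySketch:
    """Stand-in for LossMaskRecovery; the fields mirror the documented ones."""
    name: str
    detect: Callable[[object], bool]
    compute: Callable[[object, List[int]], List[int]]


# Hypothetical module-level registry; registration order is lookup order.
_REGISTRY: List[RecoverySketch] = []


def register_sketch(recovery: RecoverySketch) -> None:
    _REGISTRY.append(recovery)


def get_sketch(tokenizer: object) -> Optional[RecoverySketch]:
    # Return the first entry whose predicate accepts the tokenizer, else None.
    for recovery in _REGISTRY:
        if recovery.detect(tokenizer):
            return recovery
    return None


class FakeSlowTokenizer:
    """Hypothetical tokenizer used only to exercise the predicate."""
    name_or_path = "example-org/example-model"


register_sketch(
    RecoverySketch(
        name="example-model",
        detect=lambda tok: "example-model" in getattr(tok, "name_or_path", ""),
        compute=lambda tok, ids: [1] * len(ids),  # placeholder: keeps every token
    )
)

found = get_sketch(FakeSlowTokenizer())   # matches the registered predicate
missing = get_sketch(object())            # nothing matches, so None
```

Because lookup returns the first match, a narrow predicate would need to be registered before a broader one that also accepts the same tokenizer.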

Classes

LossMaskRecovery

A model-specific fallback for building the assistant loss mask.

Functions

get_loss_mask_recovery

Return the first registered recovery whose detect matches tokenizer.

register_loss_mask_recovery

Register a model-specific loss-mask recovery.

class LossMaskRecovery

Bases: object

A model-specific fallback for building the assistant loss mask.

Parameters:
  • name – Identifier for the target model family (for logging/debugging).

  • detect – Returns True if this recovery applies to the given tokenizer.

  • compute – Maps (tokenizer, input_ids) to a (seq_len,) LongTensor mask aligned to input_ids (1 on tokens that should contribute to the loss, 0 otherwise).
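As an illustration of what a compute callable might do, the sketch below recovers a mask by scanning token ids for turn markers. The marker ids, the function name, and the choice to include the end-of-turn token in the loss are all assumptions made for the example; a real recovery would look up its model's marker ids from the tokenizer and return a LongTensor rather than a list.

```python
from typing import List, Sequence

# Hypothetical marker ids; a real recovery would obtain them from the tokenizer.
ASSISTANT_START_ID = 101
TURN_END_ID = 102


def compute_mask_sketch(input_ids: Sequence[int]) -> List[int]:
    """Return a 0/1 mask with the same length as input_ids.

    Tokens after an assistant-start marker, up to and including the next
    turn-end marker, are marked 1; everything else is 0.
    """
    mask = [0] * len(input_ids)
    in_assistant = False
    for i, tok in enumerate(input_ids):
        if tok == ASSISTANT_START_ID:
            in_assistant = True       # the marker itself is prompt, not answer
        elif in_assistant:
            mask[i] = 1
            if tok == TURN_END_ID:
                in_assistant = False  # end marker is trained on, then the span closes
    return mask


# user turn, assistant turn, then another user turn
ids = [1, 7, 8, 102, 101, 20, 21, 102, 1, 9, 102]
mask = compute_mask_sketch(ids)
```

To match the documented (seq_len,) LongTensor contract, the list could be wrapped with torch.tensor(mask, dtype=torch.long) before it is returned.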

__init__(name, detect, compute)
Parameters:
  • name (str)

  • detect (Callable[[object], bool])

  • compute (Callable[[object, Tensor], Tensor])

Return type:

None

compute: Callable[[object, Tensor], Tensor]

detect: Callable[[object], bool]

name: str

get_loss_mask_recovery(tokenizer)

Return the first registered recovery whose detect predicate returns True for tokenizer. Returns None if no registered recovery matches.

Return type:

LossMaskRecovery | None
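Since the return value may be None, a caller has to handle both outcomes. The sketch below shows one way a data path might do that. It assumes the recovery is consulted first and the template-based path is used otherwise; the source does not specify this order. build_mask_sketch, the injected lookup and standard_path callables, and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace
from typing import Any, Callable, List, Optional, Sequence

Mask = List[int]


def build_mask_sketch(
    tokenizer: object,
    input_ids: Sequence[int],
    lookup: Callable[[object], Optional[Any]],
    standard_path: Callable[[object, Sequence[int]], Mask],
) -> Mask:
    recovery = lookup(tokenizer)
    if recovery is None:
        # No model-specific fallback registered: use the template-based path.
        return standard_path(tokenizer, input_ids)
    mask = list(recovery.compute(tokenizer, input_ids))
    if len(mask) != len(input_ids):
        # The mask must stay aligned with input_ids, one entry per token.
        raise ValueError(
            f"{recovery.name}: mask length {len(mask)} != {len(input_ids)}"
        )
    return mask


# Hypothetical stand-ins used only to exercise both branches.
fake_recovery = SimpleNamespace(
    name="example-model",
    compute=lambda tok, ids: [0] + [1] * (len(ids) - 1),
)
template_path = lambda tok, ids: [1] * len(ids)

via_recovery = build_mask_sketch(
    object(), [4, 5, 6], lambda tok: fake_recovery, template_path
)
via_template = build_mask_sketch(
    object(), [4, 5, 6], lambda tok: None, template_path
)
```

The length check is a defensive addition for the example; it guards the alignment between mask and input_ids that the compute parameter description requires.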

register_loss_mask_recovery(recovery)

Register a model-specific loss-mask recovery.

Parameters:

recovery (LossMaskRecovery)

Return type:

None