mod telemetry#

Cache performance telemetry types for the ACG system.

These types normalize provider-specific cache metrics (Anthropic cache_read_input_tokens and cache_creation_input_tokens, OpenAI cached_tokens) into a common schema for uniform measurement. They are populated by the provider-specific normalization logic in Phase 9.
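The normalization described above can be sketched roughly as follows. Only the three provider field names come from the text; the struct names, the function names, and the choice to default absent counters to zero are assumptions made for illustration and may not match the Phase 9 logic.

```rust
/// Hypothetical stand-in for the cache fields of an Anthropic usage payload.
pub struct AnthropicUsage {
    pub cache_read_input_tokens: Option<u64>,
    pub cache_creation_input_tokens: Option<u64>,
}

/// Hypothetical stand-in for the cache field of an OpenAI usage payload.
pub struct OpenAiUsage {
    pub cached_tokens: Option<u64>,
}

/// Hypothetical common schema; the real normalized type may differ.
#[derive(Debug, PartialEq, Eq)]
pub struct NormalizedCacheUsage {
    pub cache_read_tokens: u64,
    pub cache_creation_tokens: u64,
}

/// Maps the two Anthropic counters onto the common schema.
pub fn normalize_anthropic(usage: &AnthropicUsage) -> NormalizedCacheUsage {
    NormalizedCacheUsage {
        // Assumption: an absent counter is treated as zero.
        cache_read_tokens: usage.cache_read_input_tokens.unwrap_or(0),
        cache_creation_tokens: usage.cache_creation_input_tokens.unwrap_or(0),
    }
}

/// Maps the single OpenAI counter onto the common schema.
pub fn normalize_openai(usage: &OpenAiUsage) -> NormalizedCacheUsage {
    NormalizedCacheUsage {
        cache_read_tokens: usage.cached_tokens.unwrap_or(0),
        // Assumption: no creation counter is available here, so record zero.
        cache_creation_tokens: 0,
    }
}
```

Both functions return the same type, so downstream measurement code never needs to branch on the provider.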

Enums

enum CacheMissEvidence#

Typed evidence for a cache miss diagnosis.

PrefixMismatch#

Stable prefix diverged from the retained exemplar.

first_mismatch_span_id: String#

Span ID of the first mismatching stable block.

sequence_index: u32#

Zero-based sequence index of the mismatching block.

expected_hash_prefix: String#

Expected short SHA-256 hash prefix.

actual_hash_prefix: String#

Actual short SHA-256 hash prefix.

BelowMinimumThreshold#

Stable prefix is too short for provider cache reuse.

observed_prefix_tokens: u32#

Observed stable prefix tokens.

required_min_tokens: u32#

Required minimum tokens for cache reuse.

estimation_source: String#

Source of the token estimate.

RetentionExpired#

Stable prefix likely aged out of the provider retention window.

observed_gap_secs: f64#

Observed gap between requests with the same stable prefix.

retention_window_secs: f64#

Provider retention window in seconds.

provider_semantics: String#

Human-readable provider semantics summary.

Unknown#

Diagnosis could not be justified from the available facts.

missing_facts: Vec<String>#

List of facts that were unavailable at classification time.
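As a rough illustration, the variants and fields listed above could be declared and consumed as shown below. The variant and field names are taken from this page; the derives and the describe helper are hypothetical and only demonstrate how a caller might match on the evidence.

```rust
/// Sketch of the enum as documented; the derives are an assumption.
#[derive(Debug, Clone, PartialEq)]
pub enum CacheMissEvidence {
    PrefixMismatch {
        first_mismatch_span_id: String,
        sequence_index: u32,
        expected_hash_prefix: String,
        actual_hash_prefix: String,
    },
    BelowMinimumThreshold {
        observed_prefix_tokens: u32,
        required_min_tokens: u32,
        estimation_source: String,
    },
    RetentionExpired {
        observed_gap_secs: f64,
        retention_window_secs: f64,
        provider_semantics: String,
    },
    Unknown {
        missing_facts: Vec<String>,
    },
}

/// Hypothetical helper: renders one line of text per evidence variant.
pub fn describe(evidence: &CacheMissEvidence) -> String {
    match evidence {
        CacheMissEvidence::PrefixMismatch {
            first_mismatch_span_id,
            sequence_index,
            expected_hash_prefix,
            actual_hash_prefix,
        } => format!(
            "block {sequence_index} ({first_mismatch_span_id}): expected {expected_hash_prefix}, got {actual_hash_prefix}"
        ),
        CacheMissEvidence::BelowMinimumThreshold {
            observed_prefix_tokens,
            required_min_tokens,
            estimation_source,
        } => format!(
            "{observed_prefix_tokens} tokens observed, {required_min_tokens} required (estimate from {estimation_source})"
        ),
        CacheMissEvidence::RetentionExpired {
            observed_gap_secs,
            retention_window_secs,
            provider_semantics,
        } => format!(
            "gap of {observed_gap_secs}s exceeded the {retention_window_secs}s window ({provider_semantics})"
        ),
        CacheMissEvidence::Unknown { missing_facts } => {
            format!("undetermined; missing facts: {}", missing_facts.join(", "))
        }
    }
}
```

Because every variant carries its own typed fields, the match is exhaustive and the compiler flags any variant that is added later but not handled.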

enum CacheMissReason#

Reason why a cache miss occurred.

Covers eight named reasons (including the Unknown fallback) plus an extensible Other variant. Uses an internally tagged JSON representation keyed on the "reason" field: each variant serializes as an object whose "reason" value is the variant name in snake_case, for example {"reason": "cold_start"}. The Other variant additionally carries a description field.

PrefixMismatch#

Prompt prefix did not match the cached prefix.

BelowMinimumThreshold#

Stable prefix shorter than provider minimum for caching.

RetentionExpired#

Cached prefix retention window elapsed.

RoutingMismatch#

Request routed to different worker/pool.

Evicted#

Cache evicted due to capacity pressure.

UnsupportedFeature#

Backend or model does not support caching.

ColdStart#

First request for this prefix (no prior cache entry).

Unknown#

Reason could not be determined from provider response.

Other#

Extensible escape hatch for reasons not yet in the enum.

description: String#

Human-readable description of the miss reason.
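The serialized shape described for this enum can be sketched without any dependencies by writing the JSON by hand. With serde, the same shape is what the attributes `#[serde(tag = "reason", rename_all = "snake_case")]` produce; that the real type uses exactly those attributes is an assumption. The tag, to_json, and escape_json helpers below are hypothetical, and the escaping is deliberately simplified.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CacheMissReason {
    PrefixMismatch,
    BelowMinimumThreshold,
    RetentionExpired,
    RoutingMismatch,
    Evicted,
    UnsupportedFeature,
    ColdStart,
    Unknown,
    Other { description: String },
}

/// Simplified escaping for illustration: handles quotes, backslashes, and
/// newlines only, not the full set of JSON control characters.
fn escape_json(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            other => out.push(other),
        }
    }
    out
}

impl CacheMissReason {
    /// Hypothetical helper: the snake_case value stored under "reason".
    pub fn tag(&self) -> &'static str {
        match self {
            CacheMissReason::PrefixMismatch => "prefix_mismatch",
            CacheMissReason::BelowMinimumThreshold => "below_minimum_threshold",
            CacheMissReason::RetentionExpired => "retention_expired",
            CacheMissReason::RoutingMismatch => "routing_mismatch",
            CacheMissReason::Evicted => "evicted",
            CacheMissReason::UnsupportedFeature => "unsupported_feature",
            CacheMissReason::ColdStart => "cold_start",
            CacheMissReason::Unknown => "unknown",
            CacheMissReason::Other { .. } => "other",
        }
    }

    /// Hypothetical helper: compact JSON in the internally tagged shape.
    pub fn to_json(&self) -> String {
        match self {
            CacheMissReason::Other { description } => format!(
                r#"{{"reason":"other","description":"{}"}}"#,
                escape_json(description)
            ),
            _ => format!(r#"{{"reason":"{}"}}"#, self.tag()),
        }
    }
}
```

The tag lives inside the object rather than wrapping it, which is why Other can add its description field alongside "reason" without changing the shape of the other variants.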

enum CacheTelemetryProvider#

Cache telemetry provider identity for canonical Usage normalization.

Anthropic#

Anthropic Messages cache telemetry semantics.

OpenAI#

OpenAI Chat/Responses cache telemetry semantics.

Implementations

impl CacheTelemetryProvider#

Functions

fn as_str(self) -> &'static str#

Returns the canonical provider string for serialized cache telemetry.
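A minimal sketch of as_str, assuming the canonical strings are the lowercase names "anthropic" and "openai" that appear as examples elsewhere on this page. The Copy derive is also an assumption, inferred from the method taking self by value.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CacheTelemetryProvider {
    Anthropic,
    OpenAI,
}

impl CacheTelemetryProvider {
    /// Returns a static string, so callers never allocate to label an event.
    pub fn as_str(self) -> &'static str {
        match self {
            // Assumption: the canonical strings are the lowercase names.
            CacheTelemetryProvider::Anthropic => "anthropic",
            CacheTelemetryProvider::OpenAI => "openai",
        }
    }
}
```

A caller that needs an owned value for a String field can write `CacheTelemetryProvider::OpenAI.as_str().to_string()`.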

Structs and Unions

struct CacheHitRate#

Aggregated cache hit rate over a time window.

Used for dashboard metrics and trend analysis.

hit_rate: f64#

Hit rate in the range [0.0, 1.0].

sample_count: u32#

Number of requests in the measurement window.

window_duration_secs: f64#

Duration of the measurement window in seconds.
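One way such an aggregate could be produced is sketched below. This page does not say how per-request values are combined, so the token-weighted average used here is an assumption (an unweighted mean of per-request rates is an equally plausible reading), and the aggregate function is hypothetical.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct CacheHitRate {
    pub hit_rate: f64,
    pub sample_count: u32,
    pub window_duration_secs: f64,
}

/// Hypothetical aggregator. Each sample is a pair of
/// (cache_read_tokens, total_prompt_tokens) for one request in the window.
pub fn aggregate(samples: &[(u64, u64)], window_duration_secs: f64) -> CacheHitRate {
    let read: u64 = samples.iter().map(|sample| sample.0).sum();
    let total: u64 = samples.iter().map(|sample| sample.1).sum();

    // Assumption: token-weighted rate, with an empty window reported as 0.0.
    let hit_rate = if total == 0 {
        0.0
    } else {
        (read as f64 / total as f64).clamp(0.0, 1.0)
    };

    CacheHitRate {
        hit_rate,
        sample_count: samples.len() as u32,
        window_duration_secs,
    }
}
```

Weighting by tokens keeps a handful of tiny requests from dominating the dashboard figure; whether that matches the intended metric is for the module authors to confirm.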

struct CacheMissDiagnosis#

Structured diagnosis for a cache miss.

summary: String#

Single-line bounded explanation of the miss.

recommendation: String#

Single actionable follow-up for the caller.

evidence: CacheMissEvidence#

Evidence supporting the diagnosis.

struct CacheRequestFacts#

Request-time facts used to classify a cache miss without leaking prompt text.

provider: String#

Canonical provider string associated with the request facts.

stable_prefix_length: usize#

Number of stable prefix blocks observed in the request.

stable_prefix_tokens: Option<u32>#

Token count for the stable prefix when it can be measured safely.

required_min_tokens: Option<u32>#

Minimum provider threshold required for cache reuse.

first_mismatch_span_id: Option<String>#

Span ID of the first stable block that mismatched the retained exemplar.

first_mismatch_sequence_index: Option<u32>#

Sequence index of the first mismatching stable block.

expected_hash_prefix: Option<String>#

Expected short SHA-256 hash prefix for the first mismatching block.

actual_hash_prefix: Option<String>#

Actual short SHA-256 hash prefix for the first mismatching block.

retention_window_secs: Option<f64>#

Active cache retention window in seconds when provider semantics expose one.

observed_gap_secs: Option<f64>#

Observed elapsed time since the same stable prefix was last seen.

missing_facts: Vec<String>#

Facts that were unavailable when the runtime attempted diagnosis.
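The fields above line up closely with the CacheMissEvidence variants, which suggests a classification step along the following lines. This is only a sketch: the classify function, the order of the checks, and the placeholder strings for estimation_source and provider_semantics are all assumptions, not the runtime's actual logic.

```rust
#[derive(Debug, Clone, Default)]
pub struct CacheRequestFacts {
    pub provider: String,
    pub stable_prefix_length: usize,
    pub stable_prefix_tokens: Option<u32>,
    pub required_min_tokens: Option<u32>,
    pub first_mismatch_span_id: Option<String>,
    pub first_mismatch_sequence_index: Option<u32>,
    pub expected_hash_prefix: Option<String>,
    pub actual_hash_prefix: Option<String>,
    pub retention_window_secs: Option<f64>,
    pub observed_gap_secs: Option<f64>,
    pub missing_facts: Vec<String>,
}

#[derive(Debug, Clone, PartialEq)]
pub enum CacheMissEvidence {
    PrefixMismatch {
        first_mismatch_span_id: String,
        sequence_index: u32,
        expected_hash_prefix: String,
        actual_hash_prefix: String,
    },
    BelowMinimumThreshold {
        observed_prefix_tokens: u32,
        required_min_tokens: u32,
        estimation_source: String,
    },
    RetentionExpired {
        observed_gap_secs: f64,
        retention_window_secs: f64,
        provider_semantics: String,
    },
    Unknown { missing_facts: Vec<String> },
}

/// Hypothetical classifier; the order of the checks is an assumption.
pub fn classify(facts: &CacheRequestFacts) -> CacheMissEvidence {
    // 1. A recorded mismatch is the most specific evidence available.
    if let (Some(span), Some(index), Some(expected), Some(actual)) = (
        &facts.first_mismatch_span_id,
        facts.first_mismatch_sequence_index,
        &facts.expected_hash_prefix,
        &facts.actual_hash_prefix,
    ) {
        return CacheMissEvidence::PrefixMismatch {
            first_mismatch_span_id: span.clone(),
            sequence_index: index,
            expected_hash_prefix: expected.clone(),
            actual_hash_prefix: actual.clone(),
        };
    }
    // 2. A prefix below the provider minimum cannot be reused.
    if let (Some(observed), Some(required)) =
        (facts.stable_prefix_tokens, facts.required_min_tokens)
    {
        if observed < required {
            return CacheMissEvidence::BelowMinimumThreshold {
                observed_prefix_tokens: observed,
                required_min_tokens: required,
                estimation_source: "request_facts".to_string(), // placeholder
            };
        }
    }
    // 3. A gap longer than the retention window suggests expiry.
    if let (Some(gap), Some(window)) = (facts.observed_gap_secs, facts.retention_window_secs) {
        if gap > window {
            return CacheMissEvidence::RetentionExpired {
                observed_gap_secs: gap,
                retention_window_secs: window,
                provider_semantics: format!("{} retention window", facts.provider), // placeholder
            };
        }
    }
    // 4. Otherwise report what was missing instead of guessing.
    CacheMissEvidence::Unknown { missing_facts: facts.missing_facts.clone() }
}
```

Nothing in the facts struct holds prompt text: the classifier works only from span IDs, short hash prefixes, counts, and durations.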

struct CacheTelemetryEvent#

Per-call cache telemetry event.

Captures provider-agnostic cache metrics for a single LLM request. The agent_identity field cross-references the Phase 3 AgentIdentity type for per-agent grouping.

request_id: Uuid#

Request ID this telemetry pertains to.

agent_identity: AgentIdentity#

Identity of the agent that issued the request.

cache_read_tokens: u64#

Number of tokens served from cache.

cache_creation_tokens: u64#

Number of tokens written to cache.

total_prompt_tokens: u64#

Total prompt tokens (for hit rate calculation).

hit_rate: f64#

Computed cache hit rate in the range [0.0, 1.0].

miss_reason: Option<CacheMissReason>#

Reason for cache miss, if applicable.

miss_diagnosis: Option<CacheMissDiagnosis>#

Structured miss diagnosis, when the miss can be justified safely.

provider: String#

Provider name (e.g., "anthropic", "openai").

timestamp: DateTime<Utc>#

When this telemetry was recorded.

Implementations

impl CacheTelemetryEvent#

Functions

fn compute_hit_rate(cache_read_tokens: u64, total_prompt_tokens: u64) -> f64#

Computes hit rate from token counts. Returns 0.0 if total_prompt_tokens is zero to avoid division by zero.
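A minimal sketch of this function, written as a free function for brevity. The zero guard is documented above; treating the rate as cache_read_tokens divided by total_prompt_tokens and clamping it to [0.0, 1.0] are assumptions inferred from the field descriptions.

```rust
/// Sketch of the documented behavior: ratio of cached to total prompt tokens.
pub fn compute_hit_rate(cache_read_tokens: u64, total_prompt_tokens: u64) -> f64 {
    // Documented guard: no prompt tokens means no meaningful rate.
    if total_prompt_tokens == 0 {
        return 0.0;
    }
    // Assumption: clamp so inconsistent provider counts stay within range.
    (cache_read_tokens as f64 / total_prompt_tokens as f64).clamp(0.0, 1.0)
}
```

For example, 250 cached tokens out of 1000 prompt tokens gives a hit rate of 0.25.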

fn from_usage(request_id: Uuid, agent_identity: AgentIdentity, provider: CacheTelemetryProvider, usage: &Usage, timestamp: DateTime<Utc>, request_facts: Option<&CacheRequestFacts>) -> Option<Self>#

Builds a canonical cache telemetry event from normalized usage fields.

Returns None when the normalized usage payload does not contain prompt_tokens, because Phase 10 does not invent missing totals.
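The None behavior can be illustrated with a reduced version of the constructor. To stay dependency-free, this sketch drops request_id, agent_identity, timestamp, and request_facts, takes the provider as a plain string, and uses a hypothetical Usage struct with optional fields. SimplifiedEvent is likewise a stand-in for CacheTelemetryEvent, and the real Usage type may look different.

```rust
/// Hypothetical normalized usage payload; every field may be absent.
#[derive(Debug, Default)]
pub struct Usage {
    pub prompt_tokens: Option<u64>,
    pub cache_read_tokens: Option<u64>,
    pub cache_creation_tokens: Option<u64>,
}

/// Stand-in for CacheTelemetryEvent with only the token-derived fields.
#[derive(Debug, PartialEq)]
pub struct SimplifiedEvent {
    pub cache_read_tokens: u64,
    pub cache_creation_tokens: u64,
    pub total_prompt_tokens: u64,
    pub hit_rate: f64,
    pub provider: String,
}

/// Reduced sketch of from_usage: no event is built without prompt_tokens.
pub fn from_usage(provider: &str, usage: &Usage) -> Option<SimplifiedEvent> {
    // Bail out early rather than inventing a total.
    let total_prompt_tokens = usage.prompt_tokens?;

    // Assumption: missing cache counters are treated as zero.
    let cache_read_tokens = usage.cache_read_tokens.unwrap_or(0);
    let cache_creation_tokens = usage.cache_creation_tokens.unwrap_or(0);

    let hit_rate = if total_prompt_tokens == 0 {
        0.0
    } else {
        cache_read_tokens as f64 / total_prompt_tokens as f64
    };

    Some(SimplifiedEvent {
        cache_read_tokens,
        cache_creation_tokens,
        total_prompt_tokens,
        hit_rate,
        provider: provider.to_string(),
    })
}
```

Returning Option pushes the decision about incomplete payloads to the caller, which can skip the event or log the gap without a fabricated total skewing hit-rate metrics.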