module openai_responses#

Built-in codec for the OpenAI Responses API.

Implements LlmCodec (request decode/encode) and LlmResponseCodec (response decode) for the OpenAI Responses API format.

The Responses API differs significantly from Chat Completions:

  • Response: Heterogeneous output array (message, function_call, reasoning) instead of choices[0].message.

  • Finish reason: Derived from status + incomplete_details.reason instead of a finish_reason field.

  • Request: Uses input (string or array) instead of messages, and a top-level instructions field instead of a system message.

  • Max tokens: max_output_tokens instead of max_tokens.
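As a rough illustration of the finish-reason point above, the sketch below derives a normalized reason from status and incomplete_details.reason. The FinishReason enum, the derive_finish_reason function, and the exact mapping are assumptions made for this example; the codec's real types and mapping are not shown on this page.

```rust
/// Hypothetical normalized finish reason; the crate's real type may differ.
#[derive(Debug, PartialEq, Eq)]
pub enum FinishReason {
    Stop,
    Length,
    ContentFilter,
    Error,
    Other(String),
}

/// Derives a finish reason from the Responses API `status` field and the
/// optional `incomplete_details.reason` field (assumed mapping).
pub fn derive_finish_reason(status: &str, incomplete_reason: Option<&str>) -> FinishReason {
    match (status, incomplete_reason) {
        // A completed response ended normally.
        ("completed", _) => FinishReason::Stop,
        // An incomplete response says why it was cut short.
        ("incomplete", Some("max_output_tokens")) => FinishReason::Length,
        ("incomplete", Some("content_filter")) => FinishReason::ContentFilter,
        ("incomplete", Some(other)) => FinishReason::Other(other.to_string()),
        ("incomplete", None) => FinishReason::Other("incomplete".to_string()),
        ("failed", _) => FinishReason::Error,
        // Any other status is passed through unchanged.
        (other, _) => FinishReason::Other(other.to_string()),
    }
}
```

Under this assumed mapping, a response with status "incomplete" and reason "max_output_tokens" plays the role that finish_reason "length" plays in Chat Completions.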

Structs and Unions

struct OpenAIResponsesCodec#

Built-in codec for the OpenAI Responses API.

Traits implemented

impl LlmResponseCodec for OpenAIResponsesCodec#
impl LlmCodec for OpenAIResponsesCodec#

struct OpenAIResponsesStreamingCodec#

Streaming counterpart to OpenAIResponsesCodec.

Replays the OpenAI Responses SSE event sequence into the same JSON shape the API returns for a non-streaming request ({id, model, status, output, usage, incomplete_details, ...}). Once finalized, the assembled JSON can be fed back through OpenAIResponsesCodec::decode_response to produce the canonical AnnotatedLlmResponse.

Strategy

The Responses API is a relatively forgiving streaming target because the events that matter carry either the full response snapshot (response.created, response.in_progress, response.completed, response.failed, response.incomplete) or the final-state output item (response.output_item.done). We:

  1. Track the latest response snapshot — terminal events (completed/failed/incomplete) typically carry the complete state including output, so we prefer those when present.

  2. Track output items by output_index: output_item.done events deliver the final per-item state, used as a fallback when the terminal response.output is missing or empty.

  3. Ignore per-token output_text.delta and function_call_arguments.delta events, because their content is redelivered in the matching output_item.done event. Skipping deltas keeps the codec resilient to schema additions and avoids double-accumulation.
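The three steps above can be sketched with a small accumulator. This is a simplified illustration, not the codec's actual implementation: the Event enum and Accumulator struct are hypothetical, and plain strings stand in for the JSON payloads the real codec handles.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified view of a streamed event.
pub enum Event {
    /// response.created / in_progress / completed / failed / incomplete
    Snapshot { output: Vec<String> },
    /// response.output_item.done
    OutputItemDone { output_index: usize, item: String },
    /// output_text.delta, function_call_arguments.delta, and anything else
    Other,
}

#[derive(Default)]
pub struct Accumulator {
    /// Output array from the most recent response snapshot, if any.
    latest_output: Option<Vec<String>>,
    /// Final per-item state keyed by output_index (kept in index order).
    items: BTreeMap<usize, String>,
}

impl Accumulator {
    pub fn collect(&mut self, event: Event) {
        match event {
            // Step 1: remember the latest snapshot.
            Event::Snapshot { output } => self.latest_output = Some(output),
            // Step 2: remember each finished item by its index.
            Event::OutputItemDone { output_index, item } => {
                self.items.insert(output_index, item);
            }
            // Step 3: deltas and unknown events are skipped.
            Event::Other => {}
        }
    }

    /// Prefers the snapshot's output; falls back to the tracked items
    /// when the snapshot output is missing or empty.
    pub fn finalize(self) -> Vec<String> {
        match self.latest_output {
            Some(output) if !output.is_empty() => output,
            _ => self.items.into_values().collect(),
        }
    }
}
```

Keying the fallback items in a BTreeMap means they come back in output_index order even if the done events arrive out of order.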

Internal state lives behind Arc<Mutex<...>> so the &self-produced collector and finalizer closures share access. Each instance is single-use because LlmFinalizerFn consumes the finalize step.
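The shared-state arrangement described above might look roughly like the following. The SketchCodec struct and the Collector and Finalizer type aliases are hypothetical stand-ins; the real collector type and LlmFinalizerFn signature are not shown on this page, and modelling the finalizer as FnOnce is an assumption based on the single-use remark.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical closure types for this sketch only.
pub type Collector = Box<dyn FnMut(&str) + Send>;
pub type Finalizer = Box<dyn FnOnce() -> String + Send>;

#[derive(Default)]
pub struct SketchCodec {
    /// Accumulated chunks, shared between the closures below.
    state: Arc<Mutex<Vec<String>>>,
}

impl SketchCodec {
    /// Built from `&self`: the closure owns its own handle to the state.
    pub fn collector(&self) -> Collector {
        let state = Arc::clone(&self.state);
        Box::new(move |chunk: &str| {
            state.lock().unwrap().push(chunk.to_string());
        })
    }

    /// Also built from `&self`. Because the closure is FnOnce, calling it
    /// consumes it, so the finalize step can run only once.
    pub fn finalizer(&self) -> Finalizer {
        let state = Arc::clone(&self.state);
        Box::new(move || {
            let mut guard = state.lock().unwrap();
            // Take the accumulated chunks out, leaving an empty Vec behind.
            let parts = std::mem::take(&mut *guard);
            parts.join("")
        })
    }
}
```

Both closures can be created up front from a shared reference, the collector can be called for each chunk, and the finalizer is called once at the end to read what the collector stored.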

Implementations

impl OpenAIResponsesStreamingCodec#

Functions

fn new() -> Self#

Creates a fresh streaming codec with empty accumulator state.

Traits implemented

impl Default for OpenAIResponsesStreamingCodec#
impl super::streaming::StreamingCodec for OpenAIResponsesStreamingCodec#