Metrics#

Otel and TimeMeasure Metrics#

The codebase uses OpenTelemetry for tracing and metrics. The following environment variables can be set to enable metrics:

export VIA_CTX_RAG_ENABLE_OTEL=true
export VIA_CTX_RAG_EXPORTER=otlp # or console
export VIA_CTX_RAG_OTEL_ENDPOINT=http://otel_collector:4318 # only used if VIA_CTX_RAG_EXPORTER is otlp

Traces capture TimeMeasure metrics which are used to monitor the execution time of the different components.

Example Span#

{
  "name": "GraphRetrieval/Neo4jRetriever",
  "context": {
    "trace_id": "0x0ddaa0e6800dd0f4172746f53a3fc12b",
    "span_id": "0xbf4c3dc3c9050e0e",
    "trace_state": "[]"
  },
  "kind": "SpanKind.INTERNAL",
  "parent_id": null,
  "start_time": "2025-04-09T05:37:28.633505Z",
  "end_time": "2025-04-09T05:37:28.752445Z",
  "status": {
    "status_code": "UNSET"
  },
  "attributes": {
    "span name": "GraphRetrieval/Neo4jRetriever",
    "execution_time_ms": 119.0345287322998
  },
  "events": [],
  "links": [],
  "resource": {
    "attributes": {
      "service.name": "vss-ctx-rag-default"
    },
    "schema_url": ""
  }
}

Important TimeMeasure Metrics#

Context Manager#

  • context_manager/reset: Time taken to reset the Context Manager process and clear pending requests.

  • context_manager/configure: Time taken to apply a new configuration to the Context Manager.

  • context_manager/add_doc: Time taken to enqueue a document into the Context Manager.

  • context_manager/aprocess_doc/total: Time taken to process a document across all registered functions.

  • context_manager/aprocess_doc/{func.name}: Time taken to process a document within a specific function.

  • context_manager/call-manager: Time taken to orchestrate a call request inside the worker process.

  • context_manager/call/pending_add_doc: Time taken to wait for in‑flight add_doc requests before a call.

  • context_manager/call: Time taken to execute all registered functions for a given state payload.

Vector Storage (Milvus / Elasticsearch)#

  • milvusdb/add caption: Time taken to add a single caption document to Milvus.

  • Milvus/AddSummries: Time taken to bulk‑ingest summary documents into Milvus after splitting.

  • elasticsearch/add caption: Time taken to add a single caption document to Elasticsearch.

  • Elasticsearch/AddSummaries: Time taken to bulk‑ingest summary documents into Elasticsearch after splitting.

Graph RAG — Extraction (Base)#

  • GraphRAG/aprocess-doc:: Time taken to prepare and batch documents for graph creation.

  • GraphRAG/aprocess-doc/graph-create:: Time taken to create graph structures for a specific batch index.

  • GraphRAG/Base/acreate_graph: Time taken to end‑to‑end graph extraction for a batch.

  • GraphRAG/Base/add_graph_documents_to_db: Time taken to persist extracted GraphDocuments via the DB tool.

  • GraphRAG/Base/create_relation_between_chunks: Time taken to create FIRST_CHUNK and NEXT_CHUNK relations and summary links.

  • GraphRAG/Base/update_embedding_chunks: Time taken to compute embeddings for chunk nodes and persist them.

  • GraphRAG/Base/FetchEntEmbd: Time taken to fetch entities requiring embeddings.

  • GraphRAG/Base/UpdateEmbdingBatch: Time taken to compute embeddings for a batch of entities.

  • GraphRAG/Base/FetchSummaryEmbd: Time taken to fetch summaries requiring embeddings.

  • GraphRAG/Base/merge_chunk_entity_relationships: Time taken to link chunks to entities (HAS_ENTITY) in bulk.

  • GraphRAG/Base/apost_process: Time taken to post‑process after all batches: create doc node, link chunks, embeddings, KNN, dedup.

  • GraphRAG/Acall/graph-extraction/postprocessing: Time taken to run post‑processing from the GraphRAG function wrapper after batching completes.

Graph RAG — Backend: Neo4j#

  • GraphRAG/Neo4j/add_graph_documents: Time taken to bulk add GraphDocuments to Neo4j.

  • GraphRAG/Neo4j/persist_chunk_data: Time taken to create chunk nodes, PART_OF links, FIRST_CHUNK/NEXT_CHUNK edges.

  • GraphRAG/Neo4j/persist_summary_relations: Time taken to create IN_SUMMARY/SUMMARY_OF edges for summaries.

  • GraphRAG/Neo4j/persist_chunk_embeddings: Time taken to persist vector embeddings on Chunk nodes.

  • GraphRAG/Neo4j/merge_chunk_entity_rels: Time taken to create HAS_ENTITY relations between chunks and entities.

  • GraphRAG/Neo4j/UpdateKNN: Time taken to build/update KNN relations between chunks from embeddings.

  • GraphRAG/Neo4j/fetch_summaries_for_embedding: Time taken to read Summary nodes missing embeddings.

  • GraphRAG/Neo4j/merge_duplicate_nodes: Time taken to merge duplicate entity nodes above a similarity threshold.

  • GraphRAG/Neo4j/persist_entity_embeddings: Time taken to persist vector embeddings on Entity nodes.

  • GraphRAG/Neo4j/persist_summary_embeddings: Time taken to persist vector embeddings on Summary nodes.

  • GraphExtraction/VectorIndex: Time taken to create the Chunk vector index (drop/create lifecycle).

  • GraphExtraction/FetchEntEmbd: Time taken to fetch nodes to embed during Neo4j ingestion.

  • GraphExtraction/UpdatEmbding: Time taken to compute embeddings for fetched nodes and prepare persistence.

Graph RAG — Backend: Arango / NetworkX#

  • NXGraphExtraction/VectorIndex: Time taken to create vector indexes for both Chunk and Entity collections.

  • NXGraphRAG/add_graph_documents: Time taken to add GraphDocuments into in‑memory NetworkX graph.

  • NXGraphRAG/persist_chunk_data: Time taken to persist Chunk nodes and PART_OF/FIRST_CHUNK/NEXT_CHUNK edges in NetworkX.

  • NXGraphRAG/persist_summary_relations: Time taken to create IN_SUMMARY and SUMMARY_OF edges in NetworkX.

  • NXGraphRAG/persist_chunk_embeddings: Time taken to persist embeddings on Chunk nodes in NetworkX.

  • NXGraphRAG/persist_chunk_entity_rels: Time taken to create HAS_ENTITY edges in NetworkX.

  • NXGraphRAG/UpdateKNN: Time taken to compute KNN over Chunk embeddings and add SIMILAR edges in NetworkX.

  • NXGraphRAG/fetch_entities_for_embedding: Time taken to read Entity nodes missing embeddings from NetworkX.

  • NXGraphRAG/persist_entity_embeddings: Time taken to persist embeddings on Entity nodes in NetworkX.

  • NXGraphRAG/fetch_summaries_for_embedding: Time taken to read Summary nodes missing embeddings from NetworkX.

  • NXGraphRAG/persist_summary_embeddings: Time taken to persist embeddings on Summary nodes in NetworkX.

  • GraphRAG/ArangoDB/merge_duplicate_nodes: Time taken to merge duplicate entity groups inside Arango collections.

Graph Retrieval#

  • GraphRetrieval/RetrieveDocuments: Time taken to run vector+graph retrieval and format results (Arango backend).

  • GraphRetrieval/GetResponse: Time taken to ask the LLM to answer using formatted docs (optionally with images).

  • GraphRetrieval/SummarizeChat: Time taken to summarize chat history and store a concise summary.

Planner & Advanced Retrieval#

  • Planner/call: Time taken to run the iterative planner agent across tool calls to answer a question.

  • AdvGraphRetrieval/retrieve_context: Time taken to analyze the question and retrieve relevant context iteratively.

  • AdvImgGraphRAG/call: Time taken to advanced Graph RAG call with optional image reasoning.

Vector RAG#

  • VectorRAG/aprocess-doc/metrics_dump: Time taken to dump VectorRAG metrics to JSON (when VIA_LOG_DIR is set).

  • VectorRAG/retrieval: Time taken to end‑to‑end vector‑only retrieval and response generation.

Summarization — Online (BatchSummarization)#

  • summ/aprocess_doc: Time taken to ingest a caption, assign batch info, and trigger per‑batch summarization when full.

  • summ/acall/batch-aggregation-summary: Time taken to aggregate batch summaries into a final summary.

  • OffBatSumm/CombindAgg: Time taken to combine partial batch summaries and re‑summarize (token‑safe path).

  • OffBatchSumm/Acall: Time taken to fetch batch texts from storage and aggregate across the requested range.

Summarization — Offline (OfflineBatchSummarization)#

  • OfflineBatchSumm/aprocess_doc: Time taken to accumulate docs for later batch processing.

  • OfflineBatchSumm/ProcessAccumulatedBatches: Time taken to process all full batches gathered so far.

  • OfflineBatchSumm/ProcessBatch_{batch.get_batch_index()}: Time taken to summarize a single batch and persist.

  • OfflineBatchSumm/acall/batch-aggregation-summary: Time taken to aggregate stored batch summaries into a final summary.

  • OfflineBatchSumm/Acall: Time taken to orchestrate offline summarization (fetch, aggregate, and output).

Summary Retriever#

  • summary_retriever/acall: Time taken to filter chunks by time/camera, then summarize with an LLM prompt.

Notifications#

  • notifier/llm_call: Time taken to run the LLM classifier over configured events for a document.

  • notifier/notify_call: Time taken to send notifications for detected events with metadata.

VLM Retrieval#

  • VLMRetrieval/retrieval: Time taken to retrieve captions, extract images, and answer using a vision‑capable LLM.

Foundation RAG#

  • FoundationRAG/aprocess-doc/metrics_dump: Time taken to dump FoundationRAG metrics to JSON (when VIA_LOG_DIR is set).

  • FoundationRAG/retrieval: Time taken to retrieve with NVIDIA RAG service (hybrid search + optional reranker) and respond.