Evaluate on your data

Retrieval and ingestion performance depend on your documents, hardware, and pipeline settings. Use the following when measuring quality and throughput on your datasets.

Benchmarking and baselines

Use this page as the baseline for methodology and expectations. Use the pages listed under Operational tuning below when you observe production-like runs.
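As one way to establish a quality baseline on your own data, the sketch below computes recall@k from a set of labeled queries. It is an illustration, not part of any product API: the function name `recall_at_k`, the dictionary layout, and the sample IDs are all assumptions made for this example.

```python
from typing import Dict, List, Set


def recall_at_k(
    retrieved: Dict[str, List[str]],
    relevant: Dict[str, Set[str]],
    k: int,
) -> float:
    """Mean recall@k over queries that have at least one relevant document.

    retrieved: query ID -> ranked list of returned document IDs (best first).
    relevant:  query ID -> set of document IDs labeled as relevant.
    Both structures are hypothetical; adapt them to your own labels.
    """
    scores = []
    for query_id, gold in relevant.items():
        if not gold:
            continue  # skip queries without labels; they cannot be scored
        top_k = set(retrieved.get(query_id, [])[:k])
        scores.append(len(top_k & gold) / len(gold))
    return sum(scores) / len(scores) if scores else 0.0


# Illustrative data only.
retrieved = {"q1": ["a", "b", "c"], "q2": ["x", "y"]}
relevant = {"q1": {"a", "d"}, "q2": {"y"}}
print(recall_at_k(retrieved, relevant, k=2))
```

Running the same labeled queries before and after a pipeline change gives a like-for-like comparison, provided the corpus and the value of k stay fixed.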

Throughput and dataset effects

Read "Throughput is dataset-dependent" to learn why raw numbers from generic benchmarks may not match your corpus. Layout complexity, file types, and image density all affect throughput.
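To see how much your corpus mix matters, it can help to break throughput down by file type instead of reporting a single number. The sketch below assumes you have already timed each ingestion run yourself; the function name `pages_per_second_by_type`, the tuple layout, and the sample figures are hypothetical and are not measured results.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def pages_per_second_by_type(
    runs: Iterable[Tuple[str, int, float]],
) -> Dict[str, float]:
    """Aggregate pages/second per file type.

    runs: (file_type, pages_processed, elapsed_seconds) for each timed run.
    File types with zero total elapsed time are omitted.
    """
    totals = defaultdict(lambda: [0, 0.0])  # file_type -> [pages, seconds]
    for file_type, pages, seconds in runs:
        totals[file_type][0] += pages
        totals[file_type][1] += seconds
    return {
        file_type: pages / seconds
        for file_type, (pages, seconds) in totals.items()
        if seconds > 0
    }


# Made-up timings for illustration only.
sample_runs = [("pdf", 100, 20.0), ("pdf", 50, 10.0), ("pptx", 30, 15.0)]
print(pages_per_second_by_type(sample_runs))
```

Summing pages and seconds before dividing weights each file type by the volume processed, so a few small, fast runs do not skew the rate.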

Operational tuning

Ray and distributed ingest
Pre-Requisites & Support Matrix for supported configurations
Troubleshoot when results or performance diverge from expectations