The only vector database that combines 8 search signals in a single API call — with confidence scoring that tells your LLM when to say "I don't know."
Every FluxVector query runs 8 independent search signals in parallel, then combines them with proprietary fusion scoring. The result: higher relevance than any single-signal approach.
| Pipeline Stage | Time | What It Does |
|---|---|---|
| Cache check | <1ms | Return cached result if repeat query |
| Multi-model embedding | ~45ms | Parallel inference across multiple models |
| 8-signal retrieval | ~25ms | All 8 signals queried in parallel |
| Score fusion | <1ms | Proprietary score-aware fusion |
| Re-ranking | ~3ms | Neural reranking on top candidates |
| Confidence scoring | <1ms | Anti-hallucination confidence (0.0 – 1.0) |
Total warm: ~76ms. Batched (10+ concurrent): ~10ms/query. Cached: <1ms.
FluxVector's ~76ms is end-to-end: embedding, 8-signal search, reranking, and confidence scoring in a single call. The competitors below don't embed text (add ~100–200ms for a separate OpenAI embedding call) and combine only 1–2 signals.
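As a sanity check on the latency budget, summing the stage timings from the pipeline table (counting each "<1ms" stage as a 1ms upper bound) recovers the quoted ~76ms warm total:

```python
# Warm-path latency budget from the pipeline table (ms).
# Stages marked "<1ms" are counted as 1ms upper bounds.
stages = {
    "cache_check": 1,   # <1ms
    "embedding": 45,    # ~45ms multi-model embedding
    "retrieval": 25,    # ~25ms 8-signal retrieval
    "fusion": 1,        # <1ms score fusion
    "rerank": 3,        # ~3ms neural reranking
    "confidence": 1,    # <1ms confidence scoring
}

total_ms = sum(stages.values())
print(total_ms)  # 76
```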
| Concurrent Users | Total Time | Per-User Latency |
|---|---|---|
| 1 | ~76ms | ~76ms |
| 10 | ~100ms | ~10ms |
| 100 | ~120ms | ~1.2ms |
Batch embedding is roughly 5x more efficient, so per-user latency falls as concurrency rises: the busier the system, the faster each individual query completes.
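The amortization behind the table above is simple: concurrent queries share one batched embedding and retrieval pass, so wall-clock time grows slowly while the per-user cost is the total divided by the batch size.

```python
# Per-user latency under batching: concurrent queries share one
# batched pass, so per-user cost = wall-clock total / batch size.
def per_user_latency_ms(total_ms: float, concurrent_users: int) -> float:
    return total_ms / concurrent_users

# Figures from the concurrency table:
print(per_user_latency_ms(76, 1))     # 76.0
print(per_user_latency_ms(100, 10))   # 10.0
print(per_user_latency_ms(120, 100))  # 1.2
```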
| | Signals | Built-in Embed | Reranker | Fine-grained | Fusion | Confidence | Anti-Hallucination |
|---|---|---|---|---|---|---|---|
| FluxVector | 8 | Yes | Yes (2.9ms) | Yes | Yes | Yes | Yes |
| Pinecone | 1 | No | No | No | No | No | No |
| Qdrant | 1–2 | No | No | No | No | No | No |
| Weaviate | 1–2 | Partial | No | No | No | No | No |
Every HyperSearch response includes a confidence score (0.0 – 1.0). When confidence drops below 0.3, the response includes "warning": "low_relevance".
| Confidence | Meaning | RAG Action |
|---|---|---|
| 0.8 – 1.0 | Strong match across multiple signals | Use results with high trust |
| 0.3 – 0.8 | Moderate match, some signals weak | Use results, note uncertainty |
| 0.0 – 0.3 | Poor match, warning: "low_relevance" | Tell user "I don't have reliable info" |
No other vector DB tells you when your retrieval failed. This is the missing piece for reliable RAG.
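The confidence tiers map naturally onto a guard in your RAG pipeline. A minimal sketch (the response shape and field names here are assumptions for illustration, not the documented client API):

```python
def answer_from_retrieval(response: dict) -> str:
    """Gate a RAG answer on the retrieval confidence score (0.0 - 1.0)."""
    confidence = response.get("confidence", 0.0)
    if confidence < 0.3:
        # The low_relevance tier: refuse rather than hallucinate.
        return "I don't have reliable information on that."
    results = response.get("results", [])
    context = "\n".join(r["text"] for r in results)
    # 0.3 - 0.8: usable but flag the uncertainty to the user.
    prefix = "" if confidence >= 0.8 else "(Note: retrieval confidence was moderate.)\n"
    return prefix + f"Answering from {len(results)} retrieved passage(s):\n{context}"

# Low-confidence example: the guard declines instead of guessing.
print(answer_from_retrieval({"confidence": 0.2, "warning": "low_relevance"}))
```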
| | FluxVector | Pinecone | Qdrant Cloud |
|---|---|---|---|
| Vector DB | $29/mo | $70/mo | $65/mo |
| Embedding API (OpenAI) | $0 | ~$15/mo | ~$15/mo |
| Reranker API (Cohere) | $0 (built-in) | ~$10/mo | ~$10/mo |
| Total | $29/mo | ~$95/mo | ~$90/mo |
FluxVector embeds, reranks, and scores confidence on its own hardware. No per-token charges.
| Benchmark Setup | Details |
|---|---|
| Server | Hetzner Dedicated (AMD dedicated CPU, 64GB RAM, NVMe SSD) |
| Dataset | 100 documents, varied topics |
| Models | Multiple proprietary models, optimized for quality and speed |
| Search mode | Fusion (8-signal HyperSearch, score-aware fusion) |
| Cache | Redis (60s TTL). Cold = first query. Cached = repeat within 60s. |
| Location | Client in Europe → Server in Germany |
| Competitor data | Published docs + community benchmarks (single-signal query only) |
| Date | April 2026 |
Run your own benchmarks: `pip install fluxvector`
Free tier: 10K vectors, no credit card required. 8 signals on every query.
Get API Key