BudgetVec is a multi-tenant vector search platform built on Cloudflare Workers. SPTAG indexing, RaBitQ quantization, and the Pufferfish 3-tier cache deliver sub-10ms median (p50) query latency at 32× compression.
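To make the 32× figure concrete, here is a rough back-of-the-envelope sketch. It assumes the ratio comes from replacing 32-bit floats with 1 bit per dimension, and it uses the R2 price shown in the architecture diagram ($0.015/GB, read here as per month with 1 GB = 10^9 bytes). The function `estimateFootprint` and its field names are illustrative and not part of any BudgetVec API; the estimate ignores index structures and any per-vector metadata.

```typescript
// Assumption: "32x compression" = float32 (32 bits/dim) reduced to 1 bit/dim.
// Assumption: R2 price is per GB-month, with 1 GB = 1e9 bytes.
const R2_USD_PER_GB_MONTH = 0.015;

interface Footprint {
  rawBytes: number;        // uncompressed float32 vectors
  quantizedBytes: number;  // 1 bit per dimension, padded to whole bytes
  ratio: number;           // rawBytes / quantizedBytes
  r2UsdPerMonth: number;   // storage cost of the quantized codes only
}

// Hypothetical helper: estimates vector storage before and after quantization.
function estimateFootprint(numVectors: number, dims: number): Footprint {
  const rawBytes = numVectors * dims * 4;
  const quantizedBytes = numVectors * Math.ceil(dims / 8);
  const ratio = rawBytes / quantizedBytes;
  const r2UsdPerMonth = (quantizedBytes / 1e9) * R2_USD_PER_GB_MONTH;
  return { rawBytes, quantizedBytes, ratio, r2UsdPerMonth };
}

// 1 million vectors at 1,024 dimensions: about 4.1 GB raw, 128 MB quantized.
const example = estimateFootprint(1_000_000, 1024);
console.log(example.ratio); // 32
```

When the dimension count is not a multiple of 8, byte padding pushes the ratio slightly below 32.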
┌──────────┐     ┌──────────────────┐     ┌──────────────┐
│  Client  │────▶│ CF Worker (WASM) │────▶│  Pufferfish  │
└──────────┘     │   RaBitQ + PQ    │     │ 3-Tier Cache │
                 │    BM25 + ANN    │     └──────┬───────┘
                 └──────────────────┘            │
                                                 ▼
                                          ┌──────────────┐
                                          │  R2 Storage  │
                                          │  $0.015/GB   │
                                          └──────────────┘
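The diagram shows reads passing through a tiered cache before reaching R2. The sketch below illustrates one common way such a path can work: check tiers from fastest to slowest and copy a hit into the faster tiers. It is not BudgetVec's implementation. The `CacheTier` interface, the `MapTier` class, the `readThrough` function, and the tier names are all hypothetical, and in-memory maps stand in for real storage.

```typescript
// Hypothetical tier interface; real tiers would wrap actual storage backends.
interface CacheTier {
  readonly name: string;
  get(key: string): Promise<string | undefined>;
  put(key: string, value: string): Promise<void>;
}

// In-memory stand-in used for every tier in this sketch.
class MapTier implements CacheTier {
  private readonly store = new Map<string, string>();
  constructor(public readonly name: string) {}
  async get(key: string): Promise<string | undefined> {
    return this.store.get(key);
  }
  async put(key: string, value: string): Promise<void> {
    this.store.set(key, value);
  }
}

// Checks tiers in order (fastest first). On a hit, copies the value into
// every faster tier that missed, then reports which tier served it.
async function readThrough(
  tiers: CacheTier[],
  key: string,
): Promise<{ value: string; servedBy: string } | undefined> {
  const missed: CacheTier[] = [];
  for (const tier of tiers) {
    const value = await tier.get(key);
    if (value !== undefined) {
      await Promise.all(missed.map((t) => t.put(key, value)));
      return { value, servedBy: tier.name };
    }
    missed.push(tier);
  }
  return undefined; // not present in any tier
}

// Usage with assumed tier names: the first read is served by the slowest
// tier, and later reads of the same key are served by the fastest one.
const demoTiers = [new MapTier("memory"), new MapTier("edge"), new MapTier("r2")];
demoTiers[2].put("ns1/segment-0", "quantized-codes")
  .then(() => readThrough(demoTiers, "ns1/segment-0"))
  .then((hit) => console.log(hit?.servedBy));
```

Promoting on hit keeps frequently read keys in the fastest tier, at the cost of extra writes on the first read.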
Trusted by developers building the next generation of AI applications
See how much you'd pay with BudgetVec compared to traditional vector databases.
~$28.32/mo on a comparable managed service
“BudgetVec gave us sub-10ms vector search at 1/10th the cost of Pinecone. The Cloudflare Workers architecture means zero cold starts.”
“The hybrid BM25 + vector search is exactly what we needed. One API call, perfectly ranked results combining semantic and keyword matching.”
“Multi-tenant isolation out of the box was a game-changer. Each of our customers gets their own namespace with complete data separation.”
Observed performance from real workloads. Limits are enforced per tenant plan.
| Metric | Observed | Limit |
|---|---|---|
| Max namespaces per tenant | 1,000+ | Plan-dependent |
| Max vectors per namespace | 1B+ | Plan-dependent |
| Max vector dimensions | 4,096 | 4,096 |
| Max batch size (upsert) | 1,000 rows | 1,000 rows |
| Max query top_k | 1,000 | 10,000 |
| Max attribute size | 64 KB | 100 KB |
| p50 query latency | <10ms | - |
| p99 query latency | <50ms | - |
| Write throughput | 10K rows/s | - |
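A client can respect these limits before sending anything. The sketch below splits an upsert into batches of at most 1,000 rows, rejects vectors above 4,096 dimensions, and caps `top_k` at 10,000, using the values from the Limit column. The `Row` shape and the helper names `chunkForUpsert` and `clampTopK` are assumptions for illustration and do not describe an actual BudgetVec client.

```typescript
// Limits copied from the table above; plan-dependent limits are not modeled.
const LIMITS = {
  maxDimensions: 4096,
  maxUpsertRows: 1000,
  maxTopK: 10000,
} as const;

// Hypothetical row shape for this sketch.
interface Row {
  id: string;
  vector: number[];
}

// Validates dimensions, then splits rows into batches of at most 1,000.
function chunkForUpsert(rows: Row[]): Row[][] {
  for (const row of rows) {
    if (row.vector.length > LIMITS.maxDimensions) {
      throw new RangeError(
        `row ${row.id} has ${row.vector.length} dimensions, limit is ${LIMITS.maxDimensions}`,
      );
    }
  }
  const batches: Row[][] = [];
  for (let i = 0; i < rows.length; i += LIMITS.maxUpsertRows) {
    batches.push(rows.slice(i, i + LIMITS.maxUpsertRows));
  }
  return batches;
}

// Keeps a requested top_k within [1, 10,000]; assumes a finite number.
function clampTopK(requested: number): number {
  return Math.min(Math.max(1, Math.floor(requested)), LIMITS.maxTopK);
}
```

For example, 2,500 rows become three batches of 1,000, 1,000, and 500 rows, and a requested `top_k` of 50,000 is reduced to 10,000.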