Vector Search
BudgetVec supports approximate nearest neighbor (ANN) search. It uses SPTAG indexing together with product quantization (PQ) and asymmetric distance computation (ADC).
How It Works
1. Ingestion: Vectors are PQ-encoded and written to WAL entries in R2
2. Index Building: A background process builds SPTAG navigation graphs from WAL data
3. Query: The query vector is compared against PQ codebooks using ADC for fast approximate distances
4. Ranking: Top-K results are returned sorted by distance
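To make the PQ and ADC steps above more concrete, here is a minimal toy sketch in TypeScript. It is not BudgetVec's implementation: the codebooks are hard-coded rather than trained, the SPTAG graph is skipped entirely, and every name (`pqEncode`, `buildAdcTable`, `adcDistance`) is hypothetical. It only illustrates the general idea that stored vectors are reduced to centroid indices while the query stays uncompressed, so each approximate distance becomes a sum of table lookups.

```typescript
// Toy product quantization (PQ) with asymmetric distance computation (ADC).
// All names are illustrative assumptions, not BudgetVec internals.

type Codebook = number[][]; // centroids for one subspace: [numCentroids][subDim]

function squaredL2(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Ingestion: for each subspace, keep only the index of the nearest centroid.
function pqEncode(vector: number[], codebooks: Codebook[]): number[] {
  const subDim = vector.length / codebooks.length;
  return codebooks.map((centroids, m) => {
    const sub = vector.slice(m * subDim, (m + 1) * subDim);
    let best = 0;
    for (let k = 1; k < centroids.length; k++) {
      if (squaredL2(sub, centroids[k]) < squaredL2(sub, centroids[best])) best = k;
    }
    return best;
  });
}

// Query, part 1: precompute distances from the raw (uncompressed) query
// sub-vectors to every centroid. "Asymmetric" refers to the query not being encoded.
function buildAdcTable(query: number[], codebooks: Codebook[]): number[][] {
  const subDim = query.length / codebooks.length;
  return codebooks.map((centroids, m) => {
    const sub = query.slice(m * subDim, (m + 1) * subDim);
    return centroids.map((c) => squaredL2(sub, c));
  });
}

// Query, part 2: the approximate distance to an encoded vector is a sum of lookups.
function adcDistance(table: number[][], code: number[]): number {
  return code.reduce((sum, centroidIdx, m) => sum + table[m][centroidIdx], 0);
}

// Two subspaces of two dimensions each, two centroids per subspace.
const codebooks: Codebook[] = [
  [[0, 0], [1, 1]],
  [[0, 0], [2, 2]],
];
const stored = [
  [0.9, 1.1, 0.1, -0.1],
  [0.1, 0.0, 1.9, 2.2],
];
const codes = stored.map((v) => pqEncode(v, codebooks));
const table = buildAdcTable([1, 1, 0, 0], codebooks);

// Ranking: sort candidates by approximate distance, smallest first.
const ranked = codes
  .map((code, id) => ({ id, dist: adcDistance(table, code) }))
  .sort((a, b) => a.dist - b.dist);
```

In this toy setup the first stored vector ranks ahead of the second, because its codes point at the centroids closest to the query. A real index would learn the codebooks from data and use the navigation graph to limit which codes are scored.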
Example
const ns = client.namespace("embeddings");

// Generate an embedding with your model (e.g., OpenAI or Cohere).
// generateEmbedding is your own helper, not part of the client.
const queryVector = await generateEmbedding("What is machine learning?");

const results = await ns.query({
  rank_by: ["vector", "ANN", queryVector],
  top_k: 10,
  include_attributes: true,
});

// Each row carries its distance to the query and the requested attributes.
for (const row of results.rows) {
  console.log(`Distance: ${row.dist}, Title: ${row.attributes.title}`);
}
Distance Metrics
- Cosine (default) — measures angle between vectors
- Euclidean — measures straight-line distance
Set the metric when creating the namespace or on the first upsert via the distance_metric field.
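For intuition about how the two metrics differ, the sketch below computes both on plain arrays. It assumes the common convention that cosine distance is 1 minus cosine similarity. Whether BudgetVec reports exactly this value in `dist` is not stated here, and the function names are illustrative only.

```typescript
// Reference implementations of the two metrics on plain number arrays.
// Illustrative only; not taken from BudgetVec.

function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Cosine distance (assumed here to be 1 - cosine similarity): depends only
// on the angle between the vectors, so their magnitudes do not matter.
function cosineDistance(a: number[], b: number[]): number {
  return 1 - dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// Euclidean distance: straight-line distance, sensitive to magnitude.
function euclideanDistance(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Same direction, different length: cosine sees no difference, Euclidean does.
const sameDirectionCosine = cosineDistance([1, 0], [2, 0]);
const sameDirectionEuclidean = euclideanDistance([1, 0], [2, 0]);
```

If your embeddings are normalized to unit length, the two metrics produce the same ordering of neighbors, so the choice matters most for unnormalized vectors.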
Performance Tips
- Use 768-dimensional vectors for a good balance of search quality and speed
- Batch upserts (up to 1,000 rows) for best write throughput
- Pre-filter with attribute filters to reduce the search space
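The batching tip can be sketched as a small helper that splits rows into groups of at most 1,000 and writes them one group at a time. The helper names and the `upsertBatch` callback are hypothetical. The callback stands in for whatever upsert call your client exposes, since its exact signature is not shown here.

```typescript
// Split rows into batches no larger than `size`.
function chunk<T>(rows: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < rows.length; i += size) {
    batches.push(rows.slice(i, i + size));
  }
  return batches;
}

// Write batches sequentially through a caller-supplied function.
// `upsertBatch` is a placeholder for the real upsert call (an assumption).
async function upsertInBatches<T>(
  rows: T[],
  upsertBatch: (batch: T[]) => Promise<void>,
  batchSize = 1000,
): Promise<number> {
  const batches = chunk(rows, batchSize);
  for (const batch of batches) {
    await upsertBatch(batch);
  }
  return batches.length; // number of write requests issued
}

// 2,500 rows become three batches: 1,000, 1,000, and 500.
const exampleRows = Array.from({ length: 2500 }, (_, i) => ({ id: i }));
const batchSizes = chunk(exampleRows, 1000).map((b) => b.length);
```

Sequential writes keep the sketch simple. Sending several batches concurrently may raise throughput, but that depends on limits not described here.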