Vector Search

BudgetVec supports approximate nearest neighbor (ANN) search using SPTAG indexing with product quantization (PQ) and asymmetric distance computation (ADC).

How It Works

1. Ingestion: Vectors are PQ-encoded and written to WAL entries in R2

2. Index Building: A background process builds SPTAG navigation graphs from WAL data

3. Query: The unquantized query vector is compared against the PQ codebooks using ADC, which yields fast approximate distances to the encoded vectors

4. Ranking: Top-K results are returned sorted by distance
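
The query and ranking steps above can be sketched in a few lines. This is a minimal illustration of generic PQ encoding and ADC, not BudgetVec's actual implementation: the Codebook type, the function names, the tiny hand-written codebook, and the brute-force scan (in place of SPTAG graph navigation) are all assumptions made for the example.

```typescript
// Illustrative only: a toy PQ + ADC pipeline with hypothetical names.
type Codebook = number[][][]; // [subspace][centroid][component]

// Squared Euclidean distance between two equal-length vectors.
function squaredDistance(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// PQ-encode a vector: per subspace, keep the index of the nearest centroid.
function pqEncode(vector: number[], codebook: Codebook): number[] {
  const subDim = vector.length / codebook.length;
  return codebook.map((centroids, m) => {
    const sub = vector.slice(m * subDim, (m + 1) * subDim);
    let best = 0;
    for (let k = 1; k < centroids.length; k++) {
      if (squaredDistance(sub, centroids[k]) < squaredDistance(sub, centroids[best])) {
        best = k;
      }
    }
    return best;
  });
}

// Per-query lookup table: distance from each query sub-vector to every centroid.
// The query itself is never quantized, which is what makes the scheme "asymmetric".
function buildTable(query: number[], codebook: Codebook): number[][] {
  const subDim = query.length / codebook.length;
  return codebook.map((centroids, m) => {
    const sub = query.slice(m * subDim, (m + 1) * subDim);
    return centroids.map((c) => squaredDistance(sub, c));
  });
}

// ADC: the approximate distance is the sum of one table lookup per subspace.
function adcDistance(table: number[][], code: number[]): number {
  return code.reduce((sum, k, m) => sum + table[m][k], 0);
}

// Rank all encoded vectors by approximate distance and keep the closest k.
function topK(query: number[], codes: number[][], codebook: Codebook, k: number) {
  const table = buildTable(query, codebook);
  return codes
    .map((code, id) => ({ id, dist: adcDistance(table, code) }))
    .sort((a, b) => a.dist - b.dist)
    .slice(0, k);
}

// Toy data: 4-dimensional vectors, 2 subspaces, 2 centroids per subspace.
const demoCodebook: Codebook = [
  [[0, 0], [1, 1]],
  [[0, 0], [1, 1]],
];
const demoCodes = [
  [0.1, 0, 0.9, 1],
  [1, 1, 1, 0.9],
  [0, 0.1, 0, 0],
].map((v) => pqEncode(v, demoCodebook));

console.log(topK([0, 0, 1, 1], demoCodes, demoCodebook, 2));
```

The table is built once per query, so scoring each stored vector costs only one lookup per subspace rather than a full-dimensional distance computation.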

Example

const ns = client.namespace("embeddings");

// Generate embedding from your model (e.g., OpenAI, Cohere, etc.)
const queryVector = await generateEmbedding("What is machine learning?");

const results = await ns.query({
  rank_by: ["vector", "ANN", queryVector],
  top_k: 10,
  include_attributes: true,
});

for (const row of results.rows) {
  console.log(`Distance: ${row.dist}, Title: ${row.attributes.title}`);
}

Distance Metrics

Cosine (default): measures the angle between vectors, ignoring their magnitude
Euclidean: measures the straight-line distance between vectors

Set the metric when creating the namespace or on the first upsert via the distance_metric field.
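
To make the difference between the two metrics concrete, here is a small sketch using the textbook definitions. The function names are hypothetical, and whether BudgetVec reports cosine distance as 1 minus cosine similarity, or Euclidean distance squared or unsquared, is an assumption not confirmed by this page.

```typescript
// Textbook definitions; BudgetVec's exact scaling of row.dist may differ.

// Cosine distance: 1 - cos(angle). Depends only on direction, not length.
// Returns NaN if either vector is all zeros.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Euclidean distance: straight-line distance, sensitive to vector length.
function euclideanDistance(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Same direction, different length: cosine sees them as identical,
// Euclidean does not.
console.log(cosineDistance([1, 0], [2, 0]), euclideanDistance([1, 0], [2, 0]));
```

If your embedding model produces unit-length vectors, the two metrics rank results in the same order, so the choice matters mainly for unnormalized embeddings.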

Performance Tips

Use 768 dimensions for a good balance of quality and speed
Batch upserts (up to 1,000 rows) for best write throughput
Pre-filter with attribute filters to reduce the search space
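
The batching tip can be sketched as a small helper that splits rows into groups of at most 1,000 before writing. The helper names are hypothetical, and the write call is passed in as a callback because the upsert signature is not shown on this page; substitute whatever your client exposes.

```typescript
// Split an array into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Write rows in sequential batches. `upsertBatch` is a placeholder for the
// real client call (assumption: it accepts an array of rows).
// Returns the number of batches written.
async function upsertInBatches<T>(
  rows: T[],
  upsertBatch: (batch: T[]) => Promise<void>,
  batchSize = 1000,
): Promise<number> {
  let calls = 0;
  for (const batch of chunk(rows, batchSize)) {
    await upsertBatch(batch);
    calls++;
  }
  return calls;
}
```

Batches are sent one at a time here to keep the sketch simple; whether concurrent batches improve throughput depends on service limits not covered on this page.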