Hybrid Search
Hybrid search combines vector similarity (ANN) ranking with full-text (BM25) ranking, so one request can match on both meaning and exact keywords.
Why Hybrid?
- Vector search excels at semantic similarity ("What is ML?" matches "machine learning tutorial")
- BM25 excels at exact keyword matching ("error code 404" matches documents containing that exact phrase)
- Hybrid search uses both signals, which typically gives better recall and precision than either one alone
Example
// Multi-query: array of query objects
const results = await ns.query([
  {
    rank_by: ["vector", "ANN", queryVector],
    top_k: 20,
  },
  {
    rank_by: ["text", "BM25", "machine learning basics"],
    top_k: 20,
  },
]);
// results is an array of two result sets
const [vectorResults, textResults] = results;
The API returns separate result sets for each sub-query. Client-side fusion (e.g., Reciprocal Rank Fusion) can combine them into a single ranked list.
Reciprocal Rank Fusion (RRF)
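A simple way to merge the result sets: each document scores the sum of 1 / (k + rank) over every list it appears in, with rank starting at 1, so only positions matter and the raw ANN and BM25 scores never need to be comparable: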
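// Ids are typed loosely here in case they are strings rather than numbers.
type Id = string | number;

function rrf(resultSets: { id: Id }[][], k = 60) {
  const scores = new Map<Id, number>();
  for (const results of resultSets) {
    results.forEach((row, rank) => {
      // rank is zero-based, so the top hit contributes 1 / (k + 1).
      const prev = scores.get(row.id) ?? 0;
      scores.set(row.id, prev + 1 / (k + rank + 1));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}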
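To make the arithmetic concrete, here is a small worked example. It is a sketch under assumptions: the two input lists are made-up stand-ins for vectorResults and textResults, the Row type and the name fuseRrf are hypothetical, and the function only restates the scoring rule from rrf above so the snippet runs by itself.

```typescript
// Hypothetical row shape: fusion only needs an id.
type Row = { id: string | number };

// Restates the rule used by rrf above: each list contributes
// 1 / (k + rank + 1) for a zero-based rank.
function fuseRrf(resultSets: Row[][], k = 60): { id: string | number; score: number }[] {
  const scores = new Map<string | number, number>();
  for (const results of resultSets) {
    results.forEach((row, rank) => {
      scores.set(row.id, (scores.get(row.id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Made-up result sets (assumption), already ordered best-first.
const vectorHits: Row[] = [{ id: "a" }, { id: "b" }, { id: "c" }];
const textHits: Row[] = [{ id: "b" }, { id: "d" }, { id: "a" }];

const fused = fuseRrf([vectorHits, textHits]);

// b: 1/62 + 1/61 ≈ 0.03252
// a: 1/61 + 1/63 ≈ 0.03227
// d: 1/62        ≈ 0.01613
// c: 1/63        ≈ 0.01587
console.log(fused.map((r) => r.id)); // [ 'b', 'a', 'd', 'c' ]
```

Documents that appear in both lists ("a" and "b") outrank those found by only one sub-query. With k = 60, a difference of a position or two changes the score only slightly, which is why "b" beats "a" by a small margin.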