Pure keyword search misses synonyms. Pure semantic search misses exact matches. The fix is to run both and fuse the results. This approach is called hybrid search, and it consistently outperforms either method alone.
Here’s the full pipeline in one shot: BM25 for keywords, sentence-transformers for semantics, FAISS for fast vector lookup, and Reciprocal Rank Fusion (RRF) to merge the rankings.
That’s the full index. BM25 tokenizes and scores with term frequency. FAISS stores the dense vectors for nearest-neighbor lookup. Both are ready to query.
Querying Both Systems
Each search system returns ranked results. BM25 scores documents by term overlap. The vector index scores by embedding similarity. You need both because they catch different things.
Try a query where the two systems disagree. Search for “database text search”: BM25 will favor documents with those exact words, while semantic search will also surface conceptually related results about Elasticsearch and vector search that don’t contain “database” or “text search” literally.
BM25 will rank the Postgres full-text search and Elasticsearch documents highest because they contain “search” and related terms. Semantic search might also surface the FAISS similarity search doc because it understands the conceptual relationship, even without keyword overlap.
Reciprocal Rank Fusion
RRF is the simplest and most effective way to merge two ranked lists. The idea: each document gets a score based on its rank position in each list, and you sum the scores. Documents that rank well in both lists float to the top.
The formula is: RRF_score = sum(1 / (k + rank)), where rank is a document's 1-based position in each ranked list, the sum runs over the lists where the document appears, and k is a constant (typically 60) that prevents high-ranked documents from dominating too aggressively.
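A direct translation of the formula into Python. Each input list holds (doc_id, score) pairs in rank order; only the 1-based rank position is used, the raw scores are ignored:

```python
def rrf_fuse(result_lists, k=60):
    """Merge ranked result lists with Reciprocal Rank Fusion.

    result_lists: sequences of (doc_id, score) pairs, best first.
    Returns (doc_id, rrf_score) pairs sorted by fused score, best first.
    """
    fused = {}
    for results in result_lists:
        for rank, (doc_id, _score) in enumerate(results, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```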
Now wire it all together into one function.
Documents that appear in both the BM25 and semantic top-10 get boosted. Documents that only one system finds still appear, but ranked lower. This is exactly the behavior you want.
Where Hybrid Beats Single-Method Search
Hybrid search wins in three specific scenarios.
Exact term matching with context. Query: “Postgres EXPLAIN ANALYZE”. BM25 nails this because those are exact terms in a document. Semantic search might rank it lower because the embedding space doesn’t distinguish between database-specific tool names as sharply. Hybrid gets the right answer because BM25 pulls it to the top and semantic doesn’t push it down.
Synonym and concept matching. Query: “neural network embeddings for text”. Pure BM25 struggles because none of the documents contain this exact phrase. Semantic search finds the word2vec and BERT documents because it understands the conceptual overlap. Hybrid surfaces these correctly.
Ambiguous queries. Query: “fast search”. BM25 matches any document with “search” in it with roughly equal scores. Semantic search understands the intent is about performance/speed and ranks FAISS and approximate nearest neighbor documents higher. Hybrid combines both signals for a better ranking.
Tuning the Pipeline
The k parameter in RRF controls how much rank position matters. Lower k (like 10) makes the top-ranked result from each system dominate. Higher k (like 60, the standard) smooths things out and gives more weight to documents appearing in multiple lists.
You can also weight the systems differently by scaling the RRF contributions:
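A weighted variant might look like this; the weights list is a per-system scale factor, and the values in the usage comment are placeholders rather than tuned numbers:

```python
def weighted_rrf_fuse(result_lists, weights, k=60):
    """RRF where each source's contribution is scaled by its weight."""
    fused = {}
    for results, weight in zip(result_lists, weights):
        for rank, (doc_id, _score) in enumerate(results, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

# e.g. trust BM25 somewhat more than the semantic side:
# weighted_rrf_fuse([bm25_hits, semantic_hits], weights=[1.5, 1.0])
```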
For most use cases, equal weights with k=60 is the right starting point. Only tune weights if you have evaluation data showing one system consistently outperforms the other on your specific queries.
Common Errors and Fixes
ValueError: shapes (1,384) and (768,) not aligned when searching FAISS
You mixed embedding models. The index was built with one model dimension and you’re querying with another. Always use the same SentenceTransformer model for both indexing and querying. Check the dimension with index.d and your query embedding shape.
IndexError: index out of range from BM25 with empty queries
BM25Okapi returns zero scores for all documents when the query has no matching tokens. The bm25_search function above handles this by filtering out zero-score results, but if you skip that check you’ll get empty arrays that break downstream indexing.
RuntimeError: expected a non-empty list of Tensors from sentence-transformers
You passed an empty list to model.encode(). Guard against empty inputs before encoding:
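One possible guard: return an empty array of the right width rather than calling encode on nothing (get_sentence_embedding_dimension() is the sentence-transformers method that reports the model's output width):

```python
import numpy as np

def safe_encode(model, texts):
    """Encode texts, returning an empty (0, dim) array for empty input."""
    if not texts:
        dim = model.get_sentence_embedding_dimension()
        return np.zeros((0, dim), dtype="float32")
    return model.encode(texts, normalize_embeddings=True)
```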
BM25 returns identical scores for all documents
This usually means your tokenization is too aggressive or too lax. The basic .split() tokenizer works for English but doesn’t handle punctuation well. Use a proper tokenizer for production:
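A minimal regex tokenizer is already a big improvement over .split(); for heavier needs you could swap in a spaCy or NLTK tokenizer instead:

```python
import re

def tokenize(text):
    """Lowercase and split on alphanumeric runs, stripping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())

# Rebuild the BM25 index with the improved tokenizer:
# bm25 = BM25Okapi([tokenize(d) for d in docs])
```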
FAISS search returns negative cosine similarity scores
You forgot to normalize your embeddings. IndexFlatIP computes inner product, which only equals cosine similarity when vectors are unit-length. Always pass normalize_embeddings=True to model.encode(), or normalize manually with faiss.normalize_L2(embeddings).
Related Guides
- How to Build a Sentiment-Aware Search Pipeline with Embeddings
- How to Build a Semantic Search Engine with Embeddings
- How to Build a Multilingual NLP Pipeline with Sentence Transformers
- How to Build a RAG Pipeline with Hugging Face Transformers v5
- How to Build a Hybrid RAG Pipeline with Qwen3 Embeddings and Qdrant in 2026
- How to Build a Text Embedding Pipeline with Sentence Transformers and FAISS
- How to Build a Text Summarization Pipeline with Sumy and Transformers
- How to Build an Abstractive Summarization Pipeline with PEGASUS
- How to Build an Emotion Detection Pipeline with GoEmotions and Transformers
- How to Build an Aspect-Based Sentiment Analysis Pipeline