ML//RAG//HyDE

2026-03-06

Hypothetical Document Embeddings: instead of embedding the raw query, ask the LLM to generate a hypothetical answer, then embed that.

The generated document may well be factually wrong, but that matters little: it is only used for retrieval, and its embedding tends to sit closer in vector space to the real answer than the embedding of the query alone.
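A minimal sketch of the flow described above, using only the standard library. Everything named here is a hypothetical stand-in: `toy_embed` is a bag-of-words substitute for a real embedding model, `stub_llm` returns a canned answer in place of a real LLM call, and `hyde_search`, `QUERY`, and `CORPUS` are illustrative names. Because the toy embedding is purely lexical, it exaggerates the gap between the raw query and the hypothetical answer; a dense embedding model would likely show a smaller difference.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Stand-in for a real embedding model: L2-normalised bag of words."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {tok: c / norm for tok, c in counts.items()}


def cosine(a: Vector, b: Vector) -> float:
    # Both vectors are unit length, so the dot product is the cosine.
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


def stub_llm(query: str) -> str:
    """Stand-in for an LLM call: a canned, possibly imprecise, answer."""
    return ("The sky looks blue because air molecules in the atmosphere "
            "scatter short blue wavelengths of sunlight more than long "
            "red wavelengths.")


def hyde_search(
    query: str,
    corpus: List[str],
    embed: Callable[[str], Vector],
    generate: Callable[[str], str],
    k: int = 1,
) -> List[Tuple[float, str]]:
    """Embed a generated hypothetical answer instead of the raw query."""
    hypothetical = generate(query)   # step 1: LLM writes a guess
    q_vec = embed(hypothetical)      # step 2: embed the guess
    scored = [(cosine(q_vec, embed(doc)), doc) for doc in corpus]
    return sorted(scored, reverse=True)[:k]  # step 3: nearest documents


QUERY = "why is sky blue"
CORPUS = [
    "Rayleigh scattering: molecules in the atmosphere scatter short "
    "wavelengths of sunlight far more strongly than long wavelengths.",
    "Why is my sourdough starter not rising? Common causes and fixes.",
    "Blue whales are the largest animals known to have existed.",
]

if __name__ == "__main__":
    # Passing the identity function as `generate` gives plain query search.
    print("raw :", hyde_search(QUERY, CORPUS, toy_embed, lambda q: q))
    print("HyDE:", hyde_search(QUERY, CORPUS, toy_embed, stub_llm))
```

Both `embed` and `generate` are plain callables, so a real embedding model or LLM client could be dropped in without touching `hyde_search`.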

Improves recall, especially for short or vague queries: the LLM "expands" the query into the semantic neighborhood of the documents that answer it.

Simple technique, potentially big impact. No training required: it works with any off-the-shelf embedding model and any LLM.