Advanced RAG Patterns

Standard RAG (Retrieve → Generate) works for simple cases. Complex scenarios call for more advanced patterns: HyDE, parent-child chunking, agentic RAG, graph RAG, and multi-index strategies.

HyDE (Hypothetical Document Embeddings)

The problem: A query such as "What are best practices for API security?" is short and phrased as a question, so its embedding can differ noticeably from that of a document that describes those best practices.

Solution

Have the LLM generate a hypothetical answer, then search with that answer's embedding instead of the query's:

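def hyde_retrieve(question: str, retriever, llm):
    """Retrieve documents using a hypothetical answer instead of the raw question."""
    # 1. Generate a hypothetical answer (it may be factually wrong; only its wording matters)
    hypothetical = llm.invoke(
        f"Write a detailed answer to: {question}"
    )
    # 2. Search with the hypothetical answer; the retriever embeds this text itself.
    #    .content assumes a chat model that returns a message object.
    results = retriever.invoke(hypothetical.content)
    return results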
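
To see the effect without any external service, the sketch below swaps the embedding model for a toy bag-of-words vector and the LLM for a canned string. Everything in it (embed, cosine, fake_llm, the sample documents) is a hypothetical stand-in, not part of any library. It only illustrates that a hypothetical answer tends to share more vocabulary with the target document than the bare question does.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy "embedding": lowercase word counts. A real system would use a dense model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fake_llm(question: str) -> str:
    # Stand-in for llm.invoke(...).content; a real LLM would write this text.
    return ("Secure an API by requiring authentication tokens, applying rate limiting, "
            "validating input, and encrypting traffic with TLS.")


docs = [
    "Use OAuth tokens for authentication, enforce rate limiting, and require TLS on every endpoint.",
    "Our office is closed on public holidays and weekends.",
]


def hyde_search(question: str) -> str:
    # Embed the hypothetical answer, not the question, and return the closest document.
    query_vec = embed(fake_llm(question))
    return max(docs, key=lambda d: cosine(query_vec, embed(d)))


question = "What are best practices for API security?"
direct_score = cosine(embed(question), embed(docs[0]))
hyde_score = cosine(embed(fake_llm(question)), embed(docs[0]))
```

With these sample strings, hyde_score comes out higher than direct_score for the security document. Real dense embeddings behave differently from word counts, so treat this as an intuition aid rather than a benchmark.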

When to Use HyDE?

Scenario	HyDE helpful?
Factual questions	No — direct search suffices
Conceptual questions	Yes — better semantic matching
Questions with technical terms	Yes — hypothetical answer contains related terms
Short, vague questions	Yes — HyDE expands query context

Parent-Child Chunking

The dilemma: Small chunks deliver precise retrieval results but too little context. Large chunks deliver context but imprecise results.

Solution: Two-Level Chunking

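import uuid
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Parent chunks: large sections. chunk_size counts characters by default, not tokens;
# use RecursiveCharacterTextSplitter.from_tiktoken_encoder(...) to size chunks in tokens.
parent_splitter = RecursiveCharacterTextSplitter(chunk_size=2000)
parent_chunks = parent_splitter.split_documents(docs)

# Child chunks: small sections (400 characters with the default length function)
child_splitter = RecursiveCharacterTextSplitter(chunk_size=400)
child_chunks = []
for parent in parent_chunks:
    # Assumption: parents carry no unique ID yet, so assign one here
    parent_id = str(uuid.uuid4())
    parent.metadata["id"] = parent_id
    children = child_splitter.split_documents([parent])
    for child in children:
        # Each child points back to the parent it was cut from
        child.metadata["parent_id"] = parent_id
    child_chunks.extend(children)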

# Retrieval: Search over child chunks
# Context: Deliver parent chunk to LLM

Search over:  [Child 1] [Child 2] [Child 3]  ← Precise matches
                  ↓
Deliver to LLM:  [────── Parent Chunk ──────]  ← Full context
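
The retrieval step in the diagram can be sketched in plain Python. This is a minimal illustration under stated assumptions: score is a hypothetical word-overlap function standing in for vector similarity, and the parents dictionary and children list stand in for a document store and a vector store.

```python
import re

# Parent chunks keyed by ID (stand-in for a document store)
parents = {
    "p1": "Rate limiting protects an API from abuse. Configure limits per client key. "
          "Return HTTP 429 when a client exceeds its quota.",
    "p2": "Backups run nightly. Restore procedures are tested every quarter.",
}

# Child chunks with a pointer to their parent (stand-in for a vector store)
children = [
    {"text": "Rate limiting protects an API from abuse.", "parent_id": "p1"},
    {"text": "Return HTTP 429 when a client exceeds its quota.", "parent_id": "p1"},
    {"text": "Backups run nightly.", "parent_id": "p2"},
]


def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(query, text):
    # Hypothetical similarity: number of shared words
    return len(tokens(query) & tokens(text))


def retrieve_parents(query, k=2):
    # 1. Search over the small child chunks
    ranked = sorted(children, key=lambda c: score(query, c["text"]), reverse=True)[:k]
    # 2. Map hits to their parents, keeping each parent only once
    seen, result = set(), []
    for child in ranked:
        parent_id = child["parent_id"]
        if parent_id not in seen:
            seen.add(parent_id)
            result.append(parents[parent_id])
    return result


context = retrieve_parents("Which HTTP status is returned when the quota is exceeded?")
```

Deduplicating by parent ID matters: several children of the same parent often match the same query, and the LLM should receive that parent only once.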

Agentic RAG

An agent dynamically decides on the retrieval strategy:

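def agentic_rag(question: str):
    # agent, the retrievers, and the helper functions are placeholders
    # for components defined elsewhere in your application.
    # Agent decides: Which data source? How many hops? Filters?
    plan = agent.plan(question)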

    if plan.needs_structured_data:
        results = sql_retriever.invoke(plan.sql_query)
    elif plan.needs_multiple_sources:
        results = multi_source_retrieve(plan.sources, question)
    else:
        results = vector_retriever.invoke(question)

    if plan.needs_verification:
        results = verify_and_filter(results, question)

    return generate_answer(question, results)

Agentic RAG Workflow

Question → Agent (Planner)
              │
              ├── "Simple fact question" → Vector Retrieval → Answer
              ├── "SQL needed" → Text-to-SQL → DB Query → Answer
              ├── "Multiple sources" → Multi-Source Retrieval → Merge → Answer
              └── "Not enough info" → Web Search → Answer
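
The workflow above can be sketched end to end with stubs. Everything here is an assumption made for illustration: plan is a keyword-based stand-in for an LLM planner, and the retriever functions return canned strings instead of querying real systems. The sketch also includes the "Not enough info" fallback from the diagram.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    route: str  # "vector", "sql", or "multi"


def plan(question: str) -> Plan:
    # Hypothetical keyword planner; a real agent would ask an LLM to pick the route.
    q = question.lower()
    if any(word in q for word in ("how many", "average", "total")):
        return Plan("sql")
    if " and " in q or "compare" in q:
        return Plan("multi")
    return Plan("vector")


KNOWLEDGE = {"refund policy": "Refunds are issued within 14 days."}


def vector_search(question: str) -> list:
    # Stub: returns a passage only if its key phrase appears in the question
    return [text for key, text in KNOWLEDGE.items() if key in question.lower()]


def sql_search(question: str) -> list:
    # Stub: a real implementation would run text-to-SQL and query a database
    return ["(canned row from a SQL query)"]


def multi_search(question: str) -> list:
    return vector_search(question) + sql_search(question)


def web_search(question: str) -> list:
    return [f"web result for: {question}"]


ROUTES = {"vector": vector_search, "sql": sql_search, "multi": multi_search}


def agentic_retrieve(question: str):
    route = plan(question).route
    results = ROUTES[route](question)
    if not results:
        # "Not enough info" branch: fall back to web search
        route, results = "web", web_search(question)
    return route, results
```

Returning the chosen route alongside the results makes the agent's decision easy to log and evaluate, which helps when the planner is later replaced by an LLM.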

Graph RAG

Graph RAG combines vector search with a knowledge graph, so that relationships between entities are stored explicitly and can be queried:

Documents → Entity Extraction → Knowledge Graph
                                      ↕
                               Vector Database

Query → Graph Traversal + Vector Search → Merged Context → LLM
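
A toy version of this query flow, assuming a hand-written list of triples in place of an entity-extraction step and word overlap in place of vector search. All names (TRIPLES, CHUNKS, traverse, keyword_search, graph_rag_context) are hypothetical and chosen for this sketch only.

```python
import re

# Knowledge graph as (subject, relation, object) triples
TRIPLES = [
    ("Alice", "manages", "Payments Team"),
    ("Payments Team", "owns", "Billing API"),
    ("Billing API", "depends_on", "Auth Service"),
]

# Text chunks that would normally live in the vector database
CHUNKS = [
    "The Billing API issues invoices and handles refunds.",
    "The cafeteria menu changes weekly.",
]


def words(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def traverse(start, hops=2):
    """Follow outgoing edges from start for up to `hops` steps; return the facts found."""
    facts, frontier, seen = [], [start], {start}
    for _ in range(hops):
        next_frontier = []
        for subject, relation, obj in TRIPLES:
            if subject in frontier:
                facts.append(f"{subject} {relation} {obj}")
                if obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return facts


def keyword_search(query, k=1):
    # Stand-in for vector search: rank chunks by shared words, drop non-matches
    scored = [(len(words(query) & words(chunk)), chunk) for chunk in CHUNKS]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def graph_rag_context(query, hops=2):
    # 1. Find graph entities mentioned in the query
    entities = []
    for subject, _, obj in TRIPLES:
        for entity in (subject, obj):
            if entity.lower() in query.lower() and entity not in entities:
                entities.append(entity)
    # 2. Collect facts by graph traversal, then merge with text search hits
    facts = [fact for entity in entities for fact in traverse(entity, hops)]
    return facts + keyword_search(query)


context = graph_rag_context("Which API does Alice's team own?")
```

The question needs two hops (Alice → Payments Team → Billing API). No single chunk states that link, but the traversal surfaces both facts so the LLM can combine them.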

Advantages of Graph RAG

Aspect	Standard RAG	Graph RAG
Relationships	Implicit in chunks	Explicit in graph
Multi-hop	Difficult	Natural (graph traversal)
Aggregation	LLM must summarize	Graph queries aggregate
Transparency	Chunks as source	Entities and relations as source

Multi-Index Strategies

Store different data types in separate indices and route each question only to the relevant ones:

indices = {
    "docs": vectorstore_docs,       # Documentation
    "code": vectorstore_code,       # Source code
    "tickets": vectorstore_tickets, # Support tickets
    "faq": vectorstore_faq          # FAQ
}

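def smart_retrieve(question: str):
    # route_to_indices and rerank are placeholders for your own router and reranker.
    # Router decides which indices are relevant
    relevant_indices = route_to_indices(question)
    results = []
    for index_name in relevant_indices:
        # Fetch the top 3 hits from each selected index
        results.extend(indices[index_name].similarity_search(question, k=3))
    # Rerank the merged hits, since scores from separate indices may not be comparable
    return rerank(results, question)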
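
One way to fill in the two undefined helpers, shown as a self-contained sketch. The keyword router, the word-overlap reranker, and the ToyIndex class are all assumptions made for illustration; in practice the router is often an LLM or a classifier, and the reranker a cross-encoder model.

```python
import re


def overlap(query, text):
    # Hypothetical relevance score: number of shared words
    tokenize = lambda s: set(re.findall(r"[a-z0-9]+", s.lower()))
    return len(tokenize(query) & tokenize(text))


class ToyIndex:
    """Stand-in for a vector store that exposes similarity_search."""

    def __init__(self, texts):
        self.texts = texts

    def similarity_search(self, query, k=3):
        return sorted(self.texts, key=lambda t: overlap(query, t), reverse=True)[:k]


indices = {
    "docs": ToyIndex(["The service uses a layered architecture."]),
    "code": ToyIndex(["def parse(raw): raises ValueError on malformed input"]),
    "tickets": ToyIndex(["Customer reported an outage on the login page."]),
    "faq": ToyIndex([
        "To reset your password, open Settings and choose Reset password.",
        "Invoices are emailed monthly.",
    ]),
}

ROUTING_KEYWORDS = {
    "code": ("function", "class", "exception", "stack trace"),
    "tickets": ("ticket", "customer", "outage"),
    "faq": ("how do i", "can i"),
}


def route_to_indices(question):
    q = question.lower()
    hits = [name for name, keywords in ROUTING_KEYWORDS.items()
            if any(keyword in q for keyword in keywords)]
    return hits or ["docs"]  # fall back to documentation when nothing matches


def rerank(results, question, top_n=3):
    unique = list(dict.fromkeys(results))  # drop duplicates, keep order
    return sorted(unique, key=lambda t: overlap(question, t), reverse=True)[:top_n]


def smart_retrieve(question):
    results = []
    for index_name in route_to_indices(question):
        results.extend(indices[index_name].similarity_search(question, k=3))
    return rerank(results, question)


top = smart_retrieve("How do I reset my password?")
```

Falling back to a default index keeps the pipeline from returning nothing when the router finds no match. Whether "docs" is the right default depends on your data.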

Practical tip: Only implement advanced patterns when standard RAG demonstrably isn't sufficient. Measure retrieval quality with metrics (see lesson 5) before introducing HyDE, Graph RAG, or agentic RAG. Each pattern increases complexity and cost.
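
As a rough illustration of such a measurement, the sketch below computes hit rate at k and mean reciprocal rank (MRR) over a small hand-labeled set. These two metrics are common choices and are an assumption here; the metrics covered in lesson 5 may differ. The document IDs and labels are made up for the example.

```python
def hit_rate_at_k(retrieved, relevant, k):
    """Fraction of queries with at least one relevant document in the top k."""
    hits = sum(
        1 for ranked, rel in zip(retrieved, relevant)
        if any(doc_id in rel for doc_id in ranked[:k])
    )
    return hits / len(retrieved)


def mean_reciprocal_rank(retrieved, relevant):
    """Average of 1/rank of the first relevant document (0 if none is retrieved)."""
    total = 0.0
    for ranked, rel in zip(retrieved, relevant):
        for rank, doc_id in enumerate(ranked, start=1):
            if doc_id in rel:
                total += 1.0 / rank
                break
    return total / len(retrieved)


# One ranked result list per test query (illustrative IDs)
retrieved = [
    ["d3", "d1", "d7"],
    ["d2", "d9", "d4"],
    ["d8", "d6", "d5"],
]
# The documents a human labeled as relevant for each query
relevant = [{"d1"}, {"d2"}, {"d0"}]

baseline_hit_rate = hit_rate_at_k(retrieved, relevant, k=3)
baseline_mrr = mean_reciprocal_rank(retrieved, relevant)
```

Run the same evaluation set against standard RAG and against the candidate pattern; adopt the pattern only if the numbers improve enough to justify the added complexity and cost.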