Lesson 4 of 6 · 11 min read

Advanced RAG Patterns

Standard RAG (Retrieve → Generate) works for simple cases. More complex scenarios call for advanced patterns: HyDE, parent-child chunking, agentic RAG, graph RAG, and multi-index strategies.

HyDE (Hypothetical Document Embeddings)

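The problem: A short query such as "What are best practices for API security?" is phrased as a question, while the documents that answer it are phrased as statements. The two can end up far apart in embedding space even though they cover the same topic.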

Solution

Generate a hypothetical answer and use its embedding for search:

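def hyde_retrieve(question: str, retriever, llm):
    """Retrieve documents using a hypothetical answer instead of the raw question."""
    # 1. Generate a hypothetical answer (it may be factually wrong; only its wording matters)
    hypothetical = llm.invoke(
        f"Write a detailed answer to: {question}"
    )
    # Chat models return a message object with .content; plain LLMs return a string
    text = getattr(hypothetical, "content", hypothetical)
    # 2. The retriever embeds the hypothetical answer and searches with that vector
    results = retriever.invoke(text)
    return results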
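To make the effect concrete, here is a minimal, self-contained sketch. It assumes a toy bag-of-words "embedding" and a canned `fake_llm` in place of a real embedding model and LLM; `embed`, `cosine`, `DOCS`, `fake_llm` and `top_doc` are illustrative names, not part of any library. The raw question matches a document that merely repeats its wording, while the hypothetical answer matches the document that actually covers the topic.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

DOCS = [
    "Rotate tokens regularly, enforce TLS, validate input and apply rate limiting to every endpoint.",
    "What are the best practices for naming API routes? Use nouns and plural forms.",
]

def fake_llm(question: str) -> str:
    """Hypothetical LLM: returns a canned draft answer instead of calling a model."""
    return "Secure an API with TLS, short-lived tokens, input validation and rate limiting."

def top_doc(text: str) -> str:
    """Return the document most similar to the given text."""
    query_vec = embed(text)
    return max(DOCS, key=lambda d: cosine(query_vec, embed(d)))

question = "What are best practices for API security?"
direct_hit = top_doc(question)          # search with the raw question
hyde_hit = top_doc(fake_llm(question))  # search with the hypothetical answer

print("Direct:", direct_hit)
print("HyDE:  ", hyde_hit)
```

Real embedding models are far less sensitive to exact wording than this toy, so the gap is usually smaller in practice; the sketch only illustrates the mechanism.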

When to Use HyDE?

| Scenario | HyDE helpful? |
|---|---|
| Factual questions | No: direct search suffices |
| Conceptual questions | Yes: better semantic matching |
| Questions with technical terms | Yes: hypothetical answer contains related terms |
| Short, vague questions | Yes: HyDE expands query context |

Parent-Child Chunking

The dilemma: Small chunks deliver precise retrieval results but too little context. Large chunks deliver context but imprecise results.

Solution: Two-Level Chunking

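import uuid
from langchain_text_splitters import RecursiveCharacterTextSplitter

# docs: list of Document objects loaded earlier
# Parent chunks: large sections (chunk_size counts characters by default, not tokens)
parent_splitter = RecursiveCharacterTextSplitter(chunk_size=2000)
parent_chunks = parent_splitter.split_documents(docs)

# Child chunks: small sections cut from each parent
child_splitter = RecursiveCharacterTextSplitter(chunk_size=400)
child_chunks = []
for parent in parent_chunks:
    # Give each parent an id if the loader did not set one
    parent_id = parent.metadata.setdefault("id", str(uuid.uuid4()))
    children = child_splitter.split_documents([parent])
    for child in children:
        child.metadata["parent_id"] = parent_id
    child_chunks.extend(children)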

# Retrieval: Search over child chunks
# Context: Deliver parent chunk to LLM
Search over:  [Child 1] [Child 2] [Child 3]  ← Precise matches
                  ↓
Deliver to LLM:  [────── Parent Chunk ──────]  ← Full context
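
The retrieval step in the diagram can be sketched as follows. This is a simplified illustration under stated assumptions: `Chunk`, `score` and `retrieve_parents` are hypothetical names, and a word-overlap count stands in for vector similarity. The point it shows is the lookup from child hits back to de-duplicated parents.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Chunk:
    id: str
    text: str
    parent_id: Optional[str] = None

parents = {
    "p1": Chunk("p1", "Authentication. Tokens expire after one hour. Refresh tokens last thirty days."),
    "p2": Chunk("p2", "Billing. Invoices are sent monthly. Refunds take five days."),
}
children = [
    Chunk("c1", "Tokens expire after one hour.", parent_id="p1"),
    Chunk("c2", "Refresh tokens last thirty days.", parent_id="p1"),
    Chunk("c3", "Refunds take five days.", parent_id="p2"),
]

def score(query: str, text: str) -> int:
    """Toy relevance score: shared lowercase words (stand-in for vector similarity)."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def retrieve_parents(query: str, k: int = 2) -> list:
    # 1. Search over the small child chunks
    hits = sorted(children, key=lambda c: score(query, c.text), reverse=True)[:k]
    # 2. Map hits to their parents, dropping duplicates but keeping rank order
    seen, result = set(), []
    for hit in hits:
        if hit.parent_id not in seen:
            seen.add(hit.parent_id)
            result.append(parents[hit.parent_id])
    return result

context = retrieve_parents("when do tokens expire")
print([chunk.id for chunk in context])
```

De-duplicating matters because several child hits often share one parent; without it the same large section would be sent to the LLM more than once.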

Agentic RAG

An agent dynamically decides on the retrieval strategy:

def agentic_rag(question: str):
    """Pseudocode: agent, the retrievers and the helper functions are placeholders."""
    # Agent decides: Which data source? How many hops? Filters?
    plan = agent.plan(question)

    # Pick one retrieval path based on the plan
    if plan.needs_structured_data:
        results = sql_retriever.invoke(plan.sql_query)
    elif plan.needs_multiple_sources:
        results = multi_source_retrieve(plan.sources, question)
    else:
        results = vector_retriever.invoke(question)

    # Optional second pass: drop results that do not support an answer
    if plan.needs_verification:
        results = verify_and_filter(results, question)

    return generate_answer(question, results)

Agentic RAG Workflow

Question → Agent (Planner)
              │
              ├── "Simple fact question" → Vector Retrieval → Answer
              ├── "SQL needed" → Text-to-SQL → DB Query → Answer
              ├── "Multiple sources" → Multi-Source Retrieval → Merge → Answer
              └── "Not enough info" → Web Search → Answer
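
The workflow above, including the "Not enough info" fallback, could look roughly like this. It is a sketch under stated assumptions: keyword rules stand in for the LLM planner, and all retrievers are stubs that return fixed strings. `Plan`, `plan`, `ROUTES` and `retrieve` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class Plan:
    route: str  # "vector", "sql", "multi" or "web"

def plan(question: str) -> Plan:
    """Hypothetical planner: keyword rules stand in for an LLM call."""
    q = question.lower()
    if any(w in q for w in ("how many", "average", "total")):
        return Plan("sql")
    if " and " in q or "compare" in q:
        return Plan("multi")
    return Plan("vector")

# Stub retrievers; each returns a list of context strings
def vector_search(q: str) -> list:
    return ["doc: refunds take five days"] if "refund" in q.lower() else []

def sql_search(q: str) -> list:
    return ["row: open_tickets=42"]

def multi_search(q: str) -> list:
    return vector_search(q) + sql_search(q)

def web_search(q: str) -> list:
    return ["web: placeholder result"]

ROUTES = {"vector": vector_search, "sql": sql_search, "multi": multi_search, "web": web_search}

def retrieve(question: str) -> tuple:
    """Run the planned route; fall back to web search when it returns nothing."""
    route = plan(question).route
    results = ROUTES[route](question)
    if not results:  # "Not enough info" branch
        route, results = "web", web_search(question)
    return route, results

for q in ("How many tickets are open?", "How long do refunds take?", "What is the weather today?"):
    print(q, "->", retrieve(q))
```

Keeping the routes in a dictionary means a new data source only needs one more entry plus a planner rule, which keeps the control flow flat as the agent grows.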

Graph RAG

Graph RAG combines vector search with a knowledge graph, so retrieval can follow explicit relationships between entities in addition to semantic similarity:

Documents → Entity Extraction → Knowledge Graph
                                      ↕
                               Vector Database

Query → Graph Traversal + Vector Search → Merged Context → LLM
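
As a rough illustration of the query path, the sketch below merges facts from a breadth-first graph traversal with a stubbed vector search. The graph contents, the naive substring-based entity linking, and the names `GRAPH`, `traverse` and `graph_rag_context` are all assumptions made for this example; a real system would use a graph database and an entity extraction step.

```python
from collections import deque

# Toy knowledge graph: entity -> list of (relation, entity) edges
GRAPH = {
    "Alice": [("manages", "Payments Team")],
    "Payments Team": [("owns", "Billing API")],
    "Billing API": [("depends_on", "Auth Service")],
    "Auth Service": [],
}

def traverse(start: str, max_hops: int = 2) -> list:
    """Breadth-first walk that returns the facts reachable within max_hops."""
    facts, seen = [], {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue
        for relation, target in GRAPH.get(node, []):
            facts.append(f"{node} {relation} {target}")
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return facts

def vector_search(question: str) -> list:
    """Stub for a vector store lookup; returns a fixed chunk."""
    return ["Billing API handles invoices and refunds."]

def graph_rag_context(question: str) -> list:
    # Naive entity linking: any graph node named in the question
    entities = [e for e in GRAPH if e.lower() in question.lower()]
    facts = [fact for e in entities for fact in traverse(e)]
    # Merged context: graph facts first, then vector hits
    return facts + vector_search(question)

context = graph_rag_context("Which API does Alice's team own?")
for line in context:
    print(line)
```

The question never mentions "Billing API", yet the two-hop traversal from "Alice" surfaces it, which is the multi-hop behaviour that plain chunk retrieval struggles with.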

Advantages of Graph RAG

| Aspect | Standard RAG | Graph RAG |
|---|---|---|
| Relationships | Implicit in chunks | Explicit in graph |
| Multi-hop | Difficult | Natural (graph traversal) |
| Aggregation | LLM must summarize | Graph queries aggregate |
| Transparency | Chunks as source | Entities and relations as source |

Multi-Index Strategies

Store different data types in separate indices and route each question to the relevant ones:

indices = {
    "docs": vectorstore_docs,       # Documentation
    "code": vectorstore_code,       # Source code
    "tickets": vectorstore_tickets, # Support tickets
    "faq": vectorstore_faq          # FAQ
}

def smart_retrieve(question: str):
    # Router decides which indices are relevant
    relevant_indices = route_to_indices(question)
    results = []
    for index_name in relevant_indices:
        results.extend(indices[index_name].similarity_search(question, k=3))
    return rerank(results, question)
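
The snippet above leaves `route_to_indices` and `rerank` undefined. One possible minimal version is sketched below, assuming keyword-based routing and word-overlap scoring in place of an LLM router and a cross-encoder reranker; the in-memory `INDICES`, the `ROUTING_KEYWORDS` table and the helper names are hypothetical.

```python
INDICES = {
    "docs": ["Configure the API key in settings.yaml.", "Rate limits reset every minute."],
    "code": ["def load_api_key(path): ...", "class RateLimiter: ..."],
    "faq": ["Where do I find my API key? In the account page."],
}

# Hypothetical routing rules: keyword -> index name
ROUTING_KEYWORDS = {"function": "code", "class": "code", "where": "faq", "how": "docs"}

def words(text: str) -> set:
    """Lowercase word set with basic punctuation removed."""
    return set(text.lower().replace("?", " ").replace(".", " ").split())

def route_to_indices(question: str) -> list:
    """Pick indices whose keywords appear in the question; default to all."""
    hits = [idx for kw, idx in ROUTING_KEYWORDS.items() if kw in words(question)]
    return list(dict.fromkeys(hits)) or list(INDICES)

def overlap(question: str, text: str) -> int:
    return len(words(question) & words(text))

def rerank(results: list, question: str) -> list:
    """Order merged results by word overlap (stand-in for a cross-encoder reranker)."""
    return sorted(results, key=lambda r: overlap(question, r), reverse=True)

def smart_retrieve(question: str, k: int = 1) -> list:
    merged = []
    for name in route_to_indices(question):
        # Take the top-k entries from each selected index
        best = sorted(INDICES[name], key=lambda t: overlap(question, t), reverse=True)[:k]
        merged.extend(best)
    return rerank(merged, question)

print(route_to_indices("Where is my API key?"))
print(smart_retrieve("Where is my API key?"))
```

Falling back to all indices when no rule matches trades cost for recall; whether that default is right depends on how expensive each index is to query.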

Practical tip: Only implement advanced patterns when standard RAG demonstrably isn't sufficient. Measure retrieval quality with metrics (see lesson 5) before introducing HyDE, Graph RAG, or agentic RAG. Each pattern increases complexity and cost.