RAG (Retrieval Augmented Generation)
Combine retrieval with generation for more accurate, context-aware responses.
Topics
- Overview - Introduction to RAG
- Chunking and Embedding - Prepare documents
- Vector Databases - Store and search embeddings
- Retrieval - Retrieve relevant context
- GraphRAG - Knowledge graph-enhanced retrieval
RAG Overview
RAG combines large language models with retrieval from external knowledge bases, grounding generated answers in retrieved source material.
How RAG Works
- Index - Documents are chunked and embedded
- Retrieve - Relevant chunks are found for a query
- Generate - LLM generates answer with retrieved context
Why RAG?
- Access current information
- Ground responses in facts
- Reduce hallucinations
- Cite sources
Basic Usage
import { RAG } from '@mastra/rag';
const rag = new RAG({
  vectorStore: myVectorStore, // where chunk embeddings are stored and searched
  embedder: openAIEmbedder, // embeds both documents and queries
});
// Index: chunk the documents and store their embeddings
await rag.indexDocuments({
  documents: myDocuments,
  chunkSize: 500,
});
// Query: embed the question, retrieve relevant chunks, generate an answer
const result = await rag.query({
  question: 'What is Mastra?',
});
console.log(result.answer);
console.log(result.sources); // the retrieved chunks the answer was grounded in
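Note that the same embedder must be used at index time and query time: similarity search is only meaningful when documents and queries are mapped into the same embedding space.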
Chunking and Embedding
Prepare documents for RAG by chunking and embedding.
Chunking Strategies
Fixed Size
const chunks = await chunkText({
  text: document.content,
  chunkSize: 500,
  overlap: 50,
});
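The overlap carries the tail of each chunk into the start of the next, so text that straddles a chunk boundary still appears intact in at least one chunk.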
Semantic Chunking
const chunks = await chunkBySentence({
  text: document.content,
  maxSentencesPerChunk: 5,
  overlap: 1,
});
Document Structure
const chunks = await chunkByStructure({
  content: document.content,
  headings: document.headings, // Preserve structure
});
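If it helps to see what fixed-size chunking does, here is a minimal sliding-window sketch; the function is illustrative, not part of the @mastra/rag API:
// Illustrative sliding-window chunker, not the library implementation
function chunkFixedSize(text: string, chunkSize = 500, overlap = 50): string[] {
  const chunks: string[] = [];
  const step = chunkSize - overlap; // advance by less than chunkSize so chunks overlap
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
  }
  return chunks;
}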
Embedding
import { embed } from '@mastra/rag';
const embeddings = await embed({
  texts: chunks,
  embedder: openAIEmbedder,
  model: 'text-embedding-3-small',
});
Indexing
await vectorStore.insert({
  texts: chunks,
  embeddings,
  metadata: chunks.map((_, i) => ({ chunkId: i })),
});
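The metadata stored with each chunk is what makes filtered retrieval possible later; in practice you would store more than a chunk id, such as the source document, section heading, and category.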
Vector Databases
Store and search embeddings efficiently.
Supported Databases
- Pinecone - Managed vector database
- Qdrant - Open-source vector search
- Weaviate - Vector search engine
- Chroma - Embeddings database
- pgvector - Vector similarity search in PostgreSQL
Configuration
Pinecone
import { PineconeVectorStore } from '@mastra/rag';
const store = new PineconeVectorStore({
  apiKey: process.env.PINECONE_API_KEY,
  index: 'my-index',
});
Qdrant
import { QdrantVectorStore } from '@mastra/rag';
const store = new QdrantVectorStore({
  url: 'http://localhost:6333',
  collection: 'my-collection',
});
Querying
const results = await store.query({
  embedding: queryEmbedding,
  topK: 10,
  filter: { category: 'docs' },
});
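Conceptually, a vector store query is a nearest-neighbor search: score every stored embedding against the query embedding and keep the topK. A brute-force sketch of that idea in TypeScript (production stores use approximate indexes such as HNSW instead of a full scan; the types and names here are illustrative):
type StoredChunk = { text: string; embedding: number[]; metadata: Record<string, string> };
// Cosine similarity between two equal-length vectors
function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
function bruteForceQuery(rows: StoredChunk[], queryEmbedding: number[], topK: number): StoredChunk[] {
  return rows
    .map(row => ({ row, score: cosine(row.embedding, queryEmbedding) }))
    .sort((a, b) => b.score - a.score) // highest similarity first
    .slice(0, topK)
    .map(scored => scored.row);
}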
Hybrid Search
const results = await store.query({
  embedding: queryEmbedding,
  text: queryText, // BM25 keyword search
  topK: 10,
  hybrid: true,
});
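Hybrid search runs both a dense vector search and a BM25 keyword search, then merges the two rankings, which helps when queries contain exact terms (ids, product names) that embeddings blur. How a given store performs the fusion is store-specific; one widely used method, shown here purely as an illustration, is reciprocal rank fusion (RRF):
// Illustrative reciprocal rank fusion: each list ranks result ids best-first;
// an id's fused score is the sum of 1/(k + rank) across the lists.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}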
Retrieval
Retrieve relevant context for queries.
Retrieval Process
- Embed the query
- Search vector store
- Filter results
- Return top matches
Basic Retrieval
const results = await rag.retrieve({
  query: 'How do I deploy Mastra?',
  topK: 5,
});
Retrieval Options
Similarity Threshold
const results = await rag.retrieve({
  query,
  topK: 10,
  minSimilarity: 0.7,
});
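A similarity threshold trades recall for precision: matches scoring below minSimilarity are dropped even if that leaves fewer than topK results.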
Metadata Filtering
const results = await rag.retrieve({
  query,
  topK: 10,
  filter: {
    category: 'deployment',
    version: 'v1',
  },
});
Reranking
const results = await rag.retrieve({
  query,
  topK: 20,
  rerank: {
    model: 'cross-encoder',
    topK: 5,
  },
});
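This is a deliberate two-stage pattern: the vector search over-fetches 20 candidates cheaply, then the cross-encoder, which scores each query-chunk pair jointly and is much slower per pair, reorders them and keeps the best 5.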
Context Assembly
const context = results.map(r => r.text).join('\n\n');
const answer = await agent.generate(
  `Context: ${context}\n\nQuestion: ${query}`
);
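Naively joining every retrieved chunk can overflow the model's context window. One simple guard, sketched below with an assumed character budget, is to keep chunks in rank order until the budget runs out:
// Illustrative context budgeting: chunks arrive ranked best-first
function assembleContext(chunks: string[], maxChars = 6000): string {
  const kept: string[] = [];
  let used = 0;
  for (const chunk of chunks) {
    if (used + chunk.length > maxChars) break; // budget exhausted
    kept.push(chunk);
    used += chunk.length;
  }
  return kept.join('\n\n');
}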
GraphRAG
Knowledge graph-enhanced retrieval for better context.
What is GraphRAG?
GraphRAG uses knowledge graphs to improve retrieval by understanding relationships between entities.
When to Use GraphRAG
- Complex, interconnected data
- Questions about relationships
- Multi-hop reasoning
- Structured domain knowledge
Creating a Knowledge Graph
import { GraphRAG } from '@mastra/rag';
const graphRag = new GraphRAG({
  vectorStore,
  kgStore, // Knowledge graph store
  embedder,
});
await graphRag.indexDocuments({
  documents: docs,
  extractEntities: true,
  extractRelationships: true,
});
Querying GraphRAG
const results = await graphRag.query({
  query: 'What companies are in the AI space?',
  mode: 'global', // or 'local'
});
Graph Query Modes
Local Query
Best for specific entity questions:
const results = await graphRag.query({
  query: 'Tell me about OpenAI',
  mode: 'local',
});
Global Query
Best for summarizing topics:
const results = await graphRag.query({
  query: 'What are the main themes in this corpus?',
  mode: 'global',
});
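As a rule of thumb in GraphRAG systems generally, a local query starts from the entities matched in the question and expands along their immediate graph neighborhood, while a global query aggregates over higher-level summaries of the whole graph, which is why it suits corpus-wide questions.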