
RAG (Retrieval Augmented Generation)

Combine retrieval with generation for more accurate, context-aware responses.

RAG Overview

RAG combines the power of large language models with retrieval from external knowledge bases.

How RAG Works

  1. Index - Documents are chunked and embedded
  2. Retrieve - Relevant chunks are found for a query
  3. Generate - LLM generates answer with retrieved context
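
The three steps above can be sketched in plain TypeScript. This is a toy illustration, not the Mastra API: it uses a bag-of-words embedding and cosine similarity in place of a real embedding model, and all names here are invented for the example.

```typescript
// Toy embedding: term-frequency vector over a fixed vocabulary.
const vocab = ["mastra", "agent", "deploy", "vector", "rag"];

function embed(text: string): number[] {
  const words = text.toLowerCase().split(/\W+/);
  return vocab.map(term => words.filter(w => w === term).length);
}

function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b) || 1);
}

// 1. Index: chunk documents and embed each chunk.
const chunks = ["Mastra is an agent framework", "Deploy with vector search"];
const index = chunks.map(text => ({ text, embedding: embed(text) }));

// 2. Retrieve: rank chunks by similarity to the query embedding.
const query = embed("what is mastra?");
const top = [...index].sort(
  (x, y) => cosine(query, y.embedding) - cosine(query, x.embedding)
)[0];

// 3. Generate: pass the retrieved context into the LLM prompt.
const prompt = `Context: ${top.text}\n\nQuestion: what is mastra?`;
```

A production system swaps the toy embedder for a model like `text-embedding-3-small` and the array scan for a vector database, but the control flow is the same.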

Why RAG?

  • Access current information
  • Ground responses in facts
  • Reduce hallucinations
  • Cite sources

Basic Usage

import { RAG } from '@mastra/rag';

const rag = new RAG({
  vectorStore: myVectorStore,
  embedder: openAIEmbedder,
});

await rag.indexDocuments({
  documents: myDocuments,
  chunkSize: 500,
});

const result = await rag.query({
  question: 'What is Mastra?',
});

console.log(result.answer);
console.log(result.sources);

Chunking and Embedding

Prepare documents for RAG by chunking and embedding.

Chunking Strategies

Fixed Size

const chunks = await chunkText({
  text: document.content,
  chunkSize: 500,
  overlap: 50,
});
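
Under the hood, fixed-size chunking is a sliding window. A minimal sketch (illustrative only, not the `@mastra/rag` implementation):

```typescript
// Fixed-size chunking: slide a window of `chunkSize` characters across the
// text, stepping by (chunkSize - overlap) so adjacent chunks share context.
function chunkFixed(text: string, chunkSize: number, overlap: number): string[] {
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached the end
  }
  return chunks;
}

const chunks = chunkFixed("a".repeat(1200), 500, 50);
// Three chunks covering characters 0-500, 450-950, and 900-1200.
```

The overlap means a sentence split by a chunk boundary still appears whole in at least one chunk, at the cost of some duplicated storage.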

Semantic Chunking

const chunks = await chunkBySentence({
  text: document.content,
  maxSentencesPerChunk: 5,
  overlap: 1,
});

Document Structure

const chunks = await chunkByStructure({
  content: document.content,
  headings: document.headings, // Preserve structure
});

Embedding

import { embed } from '@mastra/rag';

const embeddings = await embed({
  texts: chunks,
  embedder: openAIEmbedder,
  model: 'text-embedding-3-small',
});

Indexing

await vectorStore.insert({
  texts: chunks,
  embeddings,
  metadata: chunks.map((_, i) => ({ chunkId: i })),
});

Vector Databases

Store and search embeddings efficiently.

Supported Databases

  • Pinecone - Managed vector database
  • Qdrant - Open-source vector search
  • Weaviate - Vector search engine
  • Chroma - Embeddings database
  • pgvector - Vector similarity search extension for PostgreSQL


Configuration

Pinecone

import { PineconeVectorStore } from '@mastra/rag';

const store = new PineconeVectorStore({
  apiKey: process.env.PINECONE_API_KEY,
  index: 'my-index',
});

Qdrant

import { QdrantVectorStore } from '@mastra/rag';

const store = new QdrantVectorStore({
  url: 'http://localhost:6333',
  collection: 'my-collection',
});

Querying

const results = await store.query({
  embedding: queryEmbedding,
  topK: 10,
  filter: { category: 'docs' },
});

Hybrid Search

const results = await store.query({
  embedding: queryEmbedding,
  text: queryText, // BM25 keyword search
  topK: 10,
  hybrid: true,
});
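
One common way to fuse vector and keyword rankings is reciprocal rank fusion (RRF). The sketch below is illustrative; it is not necessarily how `@mastra/rag` combines the two result lists.

```typescript
// Reciprocal rank fusion: score each document 1 / (k + rank) in every
// result list it appears in, then sum. The constant k dampens the
// advantage of a single top rank.
function rrf(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}

const vectorHits = ["doc-a", "doc-b", "doc-c"];   // semantic ranking
const keywordHits = ["doc-b", "doc-d", "doc-a"];  // BM25 ranking
const fused = rrf([vectorHits, keywordHits]);
// doc-a and doc-b appear in both lists, so they rank above doc-c and doc-d.
```

RRF needs only ranks, not raw scores, which makes it robust when the vector and BM25 scores live on incompatible scales.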

Retrieval

Retrieve relevant context for queries.

Retrieval Process

  1. Embed the query
  2. Search vector store
  3. Filter results
  4. Return top matches
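
Steps 2-4 reduce to scoring, thresholding, and truncation. A minimal sketch with invented data (the `Hit` type and scores are illustrative, not Mastra types):

```typescript
type Hit = { text: string; score: number };

// Steps 2-4: keep hits above the similarity floor, sort by score
// descending, and return at most k results.
function topK(scored: Hit[], k: number, minSimilarity = 0): Hit[] {
  return scored
    .filter(hit => hit.score >= minSimilarity)
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

const scored: Hit[] = [
  { text: "deploy guide", score: 0.91 },
  { text: "changelog", score: 0.42 },
  { text: "deploy faq", score: 0.78 },
];
const results = topK(scored, 2, 0.5);
// "deploy guide" and "deploy faq" survive; "changelog" falls below 0.5.
```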

Basic Retrieval

const results = await rag.retrieve({
  query: 'How do I deploy Mastra?',
  topK: 5,
});

Retrieval Options

Similarity Threshold

const results = await rag.retrieve({
  query,
  topK: 10,
  minSimilarity: 0.7,
});

Metadata Filtering

const results = await rag.retrieve({
  query,
  topK: 10,
  filter: {
    category: 'deployment',
    version: 'v1',
  },
});

Reranking

const results = await rag.retrieve({
  query,
  topK: 20,
  rerank: {
    model: 'cross-encoder',
    topK: 5,
  },
});

Context Assembly

const context = results.map(r => r.text).join('\n\n');
const answer = await agent.generate(
  `Context: ${context}\n\nQuestion: ${query}`
);

GraphRAG

Knowledge graph-enhanced retrieval for better context.

What is GraphRAG?

GraphRAG uses knowledge graphs to improve retrieval by understanding relationships between entities.

When to Use GraphRAG

  • Complex, interconnected data
  • Questions about relationships
  • Multi-hop reasoning
  • Structured domain knowledge
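
Multi-hop reasoning can be pictured as breadth-first expansion from a matched entity: retrieval pulls in context about related entities, not just the direct match. A toy illustration, separate from the GraphRAG API (the entities and edges are invented):

```typescript
// A tiny knowledge graph as an adjacency list: entity -> related entities.
const graph = new Map<string, string[]>([
  ["OpenAI", ["GPT-4", "Sam Altman"]],
  ["GPT-4", ["Transformers"]],
  ["Sam Altman", []],
  ["Transformers", []],
]);

// Collect every entity reachable within `hops` edges of the start node.
function expand(start: string, hops: number): Set<string> {
  const seen = new Set([start]);
  let frontier = [start];
  for (let i = 0; i < hops; i++) {
    frontier = frontier
      .flatMap(node => graph.get(node) ?? [])
      .filter(node => !seen.has(node));
    frontier.forEach(node => seen.add(node));
  }
  return seen;
}

const context = expand("OpenAI", 2);
// Two hops from OpenAI reach GPT-4, Sam Altman, and Transformers.
```

A question like "what architecture underlies OpenAI's models?" needs the second hop (OpenAI to GPT-4 to Transformers), which plain vector retrieval over isolated chunks can miss.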

Creating a Knowledge Graph

import { GraphRAG } from '@mastra/rag';

const graphRag = new GraphRAG({
  vectorStore,
  kgStore, // Knowledge graph store
  embedder,
});

await graphRag.indexDocuments({
  documents: docs,
  extractEntities: true,
  extractRelationships: true,
});

Querying GraphRAG

const results = await graphRag.query({
  query: 'What companies are in the AI space?',
  mode: 'global', // or 'local'
});

Graph Query Modes

Local Query

Best for specific entity questions:

const results = await graphRag.query({
  query: 'Tell me about OpenAI',
  mode: 'local',
});

Global Query

Best for summarizing topics:

const results = await graphRag.query({
  query: 'What are the main themes in this corpus?',
  mode: 'global',
});