# Models

Configure and use LLM providers including OpenAI, Anthropic, Google, and Mistral.
## Topics

- Overview - Introduction to models
- Embeddings - Text embedding models
- Gateways - Model routing and load balancing
- Providers - LLM provider configuration
## Models Overview

Mastra provides a unified interface for working with LLMs across multiple providers.

### Supported Providers

- OpenAI (GPT-4, GPT-3.5)
- Anthropic (Claude)
- Google (Gemini)
- Mistral
- Azure OpenAI
- And more...
### Basic Configuration

```ts
import { Mastra } from '@mastra/core';
import { openai, anthropic } from '@mastra/models';

export const mastra = new Mastra({
  models: {
    gpt4: openai('gpt-4'),
    claude: anthropic('claude-3-opus'),
  },
});
```
### Using Models

```ts
const agent = mastra.getAgent('myAgent', {
  model: mastra.models.gpt4,
});
```
### Model Selection

Choose models based on:

- Task complexity - Simple tasks can use smaller models
- Latency requirements - Smaller models are faster
- Cost - Balance performance and cost
- Capabilities - Some models excel at specific tasks
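As a sketch of how these criteria might translate into code, the helper below maps a task tier to a model name. The tiers and the specific model choices are illustrative assumptions, not a Mastra API:

```ts
// Hypothetical helper: map a task tier to a model name.
// The tiers and model assignments are illustrative, not part of Mastra.
type TaskTier = 'simple' | 'standard' | 'complex';

function selectModel(tier: TaskTier): string {
  switch (tier) {
    case 'simple':
      return 'gpt-3.5-turbo'; // fast and cheap: classification, extraction
    case 'standard':
      return 'gpt-4-turbo';   // balanced cost and capability
    case 'complex':
      return 'gpt-4';         // reasoning-heavy tasks
  }
}

console.log(selectModel('simple')); // gpt-3.5-turbo
```

In practice the mapping would be driven by your own latency and cost measurements rather than fixed names.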
## Embeddings

Text embedding models for RAG and similarity search.

### Supported Embedders

- OpenAI (text-embedding-3-small, text-embedding-3-large)
- Cohere
- Hugging Face
- Azure OpenAI
### Configuration

```ts
import { embed } from '@mastra/models';

const embedder = embed.openai({
  model: 'text-embedding-3-small',
  dimensions: 1536,
});
```
### Generating Embeddings

```ts
const embeddings = await embedder.embed({
  texts: ['Hello world', 'How are you?'],
});

console.log(embeddings[0]); // [0.123, -0.456, ...]
console.log(embeddings[1]); // [0.789, -0.012, ...]
```
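Embedding vectors are typically compared with cosine similarity for search and ranking. A minimal, self-contained implementation in plain TypeScript (independent of any Mastra API):

```ts
// Cosine similarity between two equal-length embedding vectors.
// Returns 1 for identical directions, 0 for orthogonal vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

Vector stores usually compute this for you; the function is shown only to make the comparison step concrete.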
### Batch Processing

```ts
const texts = await loadDocuments();
const batches = chunkArray(texts, 100);

for (const batch of batches) {
  const embeddings = await embedder.embed({ texts: batch });
  await store.insert(embeddings);
}
```
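The `chunkArray` helper used above is not defined in this section; a straightforward implementation (an assumed utility, not a Mastra export) looks like:

```ts
// Split an array into consecutive chunks of at most `size` elements.
function chunkArray<T>(items: T[], size: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

console.log(chunkArray([1, 2, 3, 4, 5], 2)); // [[1, 2], [3, 4], [5]]
```

Batching keeps each embedding request under the provider's per-request input limits.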
### Dimension Sizes

| Model | Dimensions |
|-------|------------|
| text-embedding-3-small | 1536 (or 512, 256) |
| text-embedding-3-large | 3072 (or 1024, 256) |
| text-embedding-ada-002 | 1536 |
## Model Providers

Configure connections to LLM providers.

### Overview

Mastra supports multiple LLM providers. Each provider has its own configuration requirements.
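Since every provider below reads its credentials from environment variables, a small startup check can fail fast on missing configuration. This is an illustrative pattern, not part of Mastra:

```ts
// Read a required environment variable, throwing early if it is unset.
// Illustrative only: Mastra does not ship this helper.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

process.env.OPENAI_API_KEY = 'sk-example'; // set here for demonstration only
console.log(requireEnv('OPENAI_API_KEY')); // sk-example
```

Failing at startup gives a clearer error than a provider rejecting the first request at runtime.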
### OpenAI Provider

Configure OpenAI models.

#### Installation

```bash
npm install openai
```

#### Configuration

```ts
import { openai } from '@mastra/models';

const gpt4 = openai('gpt-4', {
  apiKey: process.env.OPENAI_API_KEY,
});
```
#### Models

| Model | Description | Context |
|-------|-------------|---------|
| gpt-4o | Most capable, fast | 128k |
| gpt-4-turbo | Fast, cost-effective | 128k |
| gpt-4 | High intelligence | 8k |
| gpt-3.5-turbo | Fast, affordable | 16k |
#### Usage

```ts
const response = await gpt4.generate({
  prompt: 'Hello!',
  maxTokens: 100,
});
```
#### Streaming

```ts
const stream = await gpt4.stream({
  prompt: 'Tell me a story',
});

for await (const chunk of stream) {
  process.stdout.write(chunk.text);
}
```
### Anthropic Provider

Configure Anthropic Claude models.

#### Installation

```bash
npm install @anthropic-ai/sdk
```

#### Configuration

```ts
import { anthropic } from '@mastra/models';

const claude = anthropic('claude-3-opus', {
  apiKey: process.env.ANTHROPIC_API_KEY,
});
```
#### Models

| Model | Description | Context |
|-------|-------------|---------|
| claude-opus-4 | Highest intelligence | 200k |
| claude-sonnet-4 | Balanced | 200k |
| claude-3-opus | High capability | 200k |
| claude-3-sonnet | Balanced | 200k |
| claude-3-haiku | Fast | 200k |
#### Usage

```ts
const response = await claude.generate({
  prompt: 'Hello!',
  maxTokens: 100,
});
```
#### System Prompts

Claude models accept a dedicated system prompt that sets the assistant's behavior separately from the user message:

```ts
const response = await claude.generate({
  system: 'You are a helpful assistant.',
  prompt: 'Hello!',
  maxTokens: 100,
});
```
### Google Provider

Configure Google Gemini models.

#### Installation

```bash
npm install @google/generative-ai
```

#### Configuration

```ts
import { google } from '@mastra/models';

const gemini = google('gemini-pro', {
  apiKey: process.env.GOOGLE_API_KEY,
});
```
#### Models

| Model | Description | Context |
|-------|-------------|---------|
| gemini-2.5-pro | Most capable | 1M |
| gemini-2.0-flash | Fast | 1M |
| gemini-1.5-pro | High capability | 1M |
| gemini-1.5-flash | Fast | 1M |
#### Usage

```ts
const response = await gemini.generate({
  prompt: 'Hello!',
  maxTokens: 100,
});
```

#### Multimodal

```ts
const response = await gemini.generate({
  prompt: 'What is in this image?',
  images: [imageBuffer],
});
```
### Mistral Provider

Configure Mistral AI models.

#### Installation

```bash
npm install @mistralai/mistralai
```

#### Configuration

```ts
import { mistral } from '@mastra/models';

const mistralModel = mistral('mistral-large', {
  apiKey: process.env.MISTRAL_API_KEY,
});
```
#### Models

| Model | Description |
|-------|-------------|
| mistral-large | Most capable |
| mistral-medium | Balanced |
| mistral-small | Fast |
#### Usage

```ts
const response = await mistralModel.generate({
  prompt: 'Hello!',
  maxTokens: 100,
});
```
### Azure OpenAI Provider

Configure Azure OpenAI models.

#### Installation

```bash
npm install @azure/openai
```

#### Configuration

```ts
import { azure } from '@mastra/models';

const gpt4 = azure('gpt-4', {
  endpoint: process.env.AZURE_OPENAI_ENDPOINT,
  apiKey: process.env.AZURE_OPENAI_API_KEY,
  apiVersion: '2024-02-01',
});
```
#### Deployments

Azure OpenAI serves models through named deployments, so you reference your deployment name rather than the raw model name:

```ts
const gpt4 = azure('gpt-4', {
  endpoint: process.env.AZURE_OPENAI_ENDPOINT,
  deploymentName: 'gpt-4',
  apiKey: process.env.AZURE_OPENAI_API_KEY,
});
```
#### Usage

```ts
const response = await gpt4.generate({
  prompt: 'Hello!',
  maxTokens: 100,
});
```

#### Streaming

```ts
const stream = await gpt4.stream({
  prompt: 'Tell me a story',
});

for await (const chunk of stream) {
  process.stdout.write(chunk.text);
}
```
## Model Gateways

Model gateways provide routing, load balancing, and fallback capabilities.

### Overview

Gateways let you use multiple models with automatic failover and load balancing.

### OpenAI Gateway

Route requests to multiple OpenAI models with fallback.

#### Configuration

```ts
import { createGateway } from '@mastra/models';

const openaiGateway = createGateway({
  provider: 'openai',
  models: ['gpt-4', 'gpt-3.5-turbo'],
  strategy: 'fallback', // or 'load-balance'
});
```
#### Fallback Strategy

If a request to one model fails, the next model in the list is tried:

```ts
const gateway = createGateway({
  provider: 'openai',
  models: ['gpt-4', 'gpt-3.5-turbo'],
  strategy: 'fallback',
});
```
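Conceptually, the fallback strategy amounts to trying each model in order until one succeeds. The sketch below shows that control flow with a generic `generate` function type; it is an illustration of the idea, not Mastra's internal implementation:

```ts
// A "model" here is anything that can generate text from a prompt.
type GenerateFn = (prompt: string) => Promise<string>;

// Try each model in order; return the first successful response,
// or rethrow the last error if every model fails.
async function generateWithFallback(
  models: GenerateFn[],
  prompt: string,
): Promise<string> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await model(prompt);
    } catch (err) {
      lastError = err; // remember the failure, move on to the next model
    }
  }
  throw lastError;
}

// Example: the first "model" always fails, the second succeeds.
const flaky: GenerateFn = async () => { throw new Error('rate limited'); };
const backup: GenerateFn = async (p) => `echo: ${p}`;

generateWithFallback([flaky, backup], 'Hello').then(console.log); // echo: Hello
```

A production gateway would typically add per-model timeouts and only fall back on retryable errors (rate limits, 5xx), not on invalid requests.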
#### Load Balancing

Distribute requests across models:

```ts
const gateway = createGateway({
  provider: 'openai',
  models: ['gpt-4', 'gpt-3.5-turbo'],
  strategy: 'load-balance',
  weights: [0.2, 0.8], // 20% to gpt-4, 80% to gpt-3.5-turbo
});
```
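The weighted load-balance strategy can be pictured as a weighted random draw over the model list. The sketch below is illustrative only (the random number is injectable so the choice is testable), not how Mastra implements it internally:

```ts
// Pick an index according to weights that sum to 1.
// `rand` is a number in [0, 1); injected rather than drawn from
// Math.random() so the selection is deterministic and testable.
function weightedPick(weights: number[], rand: number): number {
  let cumulative = 0;
  for (let i = 0; i < weights.length; i++) {
    cumulative += weights[i];
    if (rand < cumulative) return i;
  }
  return weights.length - 1; // guard against floating-point rounding
}

const models = ['gpt-4', 'gpt-3.5-turbo'];
const weights = [0.2, 0.8];

console.log(models[weightedPick(weights, 0.1)]); // gpt-4 (falls in the first 20%)
console.log(models[weightedPick(weights, 0.5)]); // gpt-3.5-turbo
```

Over many requests, draws from a uniform random source land on each model in proportion to its weight.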
#### Usage

```ts
const response = await gateway.generate({
  prompt: 'Hello',
});
```
### Anthropic Gateway

Route requests to multiple Anthropic models with fallback.

#### Configuration

```ts
import { createGateway } from '@mastra/models';

const anthropicGateway = createGateway({
  provider: 'anthropic',
  models: ['claude-3-opus', 'claude-3-sonnet'],
  strategy: 'fallback',
});
```
#### Fallback Strategy

```ts
const gateway = createGateway({
  provider: 'anthropic',
  models: ['claude-3-opus', 'claude-3-sonnet', 'claude-3-haiku'],
  strategy: 'fallback',
});
```

#### Usage

```ts
const response = await gateway.generate({
  prompt: 'Explain quantum computing',
  maxTokens: 1024,
});
```