Complete collection of examples demonstrating cascadeflow from basics to production deployment.
```bash
# 1. Install cascadeflow
npm install @cascadeflow/core

# 2. Set your API key
export OPENAI_API_KEY="sk-..."

# 3. Run your first example
cd packages/core/examples/nodejs
npx tsx basic-usage.ts
```

That's it! You'll see cascading in action with cost savings.
| Example | What It Does | Complexity | Time | Best For |
|---|---|---|---|---|
| basic-usage.ts | Learn cascading basics | ⭐ Easy | 5 min | First-time users |
| tool-calling.ts | Function calling | ⭐⭐ Medium | 15 min | Agent builders |
| cost-tracking.ts | Budget management | ⭐⭐ Medium | 15 min | Cost optimization |
| multi-provider.ts | Mix AI providers | ⭐⭐ Medium | 10 min | Multi-cloud |
| reasoning-models.ts | o1, o3, Claude 3.7, DeepSeek-R1 | ⭐⭐ Medium | 10 min | Complex reasoning |
| semantic-quality.ts | ML-based quality validation | ⭐⭐⭐ Advanced | 15 min | Quality assurance |
| production-patterns.ts | Enterprise patterns | ⭐⭐⭐ Advanced | 30 min | Production deployment |
💡 Tip: Start with basic-usage.ts, then explore based on your use case!
I want to...

- Use tools/functions? → tool-calling.ts
- Track costs? → cost-tracking.ts
- Enforce budgets? → cost-tracking.ts, production-patterns.ts
- Use multiple providers? → multi-provider.ts
- Deploy to production? → production-patterns.ts
- Use reasoning models? → reasoning-models.ts
- Validate quality with ML? → semantic-quality.ts
- Access DeepSeek/Gemini/Azure? → Python examples (LiteLLM integration)
- 🌟 Core Examples - Basic usage, tools, multi-provider, reasoning
- 💰 Cost Management - Budget tracking
- 🤖 Quality & Validation - Semantic quality with ML
- 🏭 Production - Enterprise patterns
Perfect for learning cascadeflow basics. Start with these!
File: nodejs/basic-usage.ts
Time: 5 minutes
What you'll learn:
- How cascading works (cheap model → expensive model)
- Automatic quality-based routing
- Cost tracking and savings
- When drafts are accepted vs rejected
Run it:

```bash
export OPENAI_API_KEY="sk-..."
cd packages/core/examples/nodejs
npx tsx basic-usage.ts
```

Expected output:
```
Query 1/8: What color is the sky?
  💚 Model: gpt-4o-mini only
  💰 Cost: $0.000081
  ✅ Draft Accepted

Query 5/8: Write a function to reverse a string...
  💛 Model: gpt-4o
  💰 Cost: $0.001320
  🎯 Direct Route (hard complexity)

💰 TOTAL SAVINGS: 48.2% reduction
```
Key concepts:
- Token-based pricing (not flat rates)
- PreRouter detects complexity and routes accordingly
- Draft accepted = verifier skipped (saves money!)
- Semantic quality checking with embeddings
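The accept/escalate decision above can be sketched as a standalone loop. This is an illustrative stand-in, not cascadeflow's internal implementation; the `Model` type, quality scores, and costs below are invented for the demo.

```typescript
// Illustrative cascade: try the cheap model first, escalate only when
// the draft's quality score falls below the threshold.
type Model = (query: string) => { text: string; quality: number; cost: number };

function cascade(query: string, cheap: Model, expensive: Model, threshold = 0.7) {
  const draft = cheap(query);
  if (draft.quality >= threshold) {
    // Draft accepted: the expensive model is never called.
    return { text: draft.text, cost: draft.cost, cascaded: false };
  }
  // Draft rejected: escalate, paying for both calls but keeping quality.
  const final = expensive(query);
  return { text: final.text, cost: draft.cost + final.cost, cascaded: true };
}
```

Either way the caller gets one answer; the `cascaded` flag records whether escalation happened, mirroring `result.cascaded` in the real API.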
File: nodejs/tool-calling.ts
Time: 15 minutes
What you'll learn:
- Define tools with TypeScript types
- Type-safe tool definitions
- Tool execution across cascade tiers
- Universal tool format
Key features:
- Full TypeScript type safety
- Weather and calculator tools
- Automatic tool format conversion
- Cross-provider compatibility
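The universal tool format follows the familiar JSON-schema shape. The sketch below shows a generic OpenAI-style definition plus a local executor; cascadeflow's exact tool-definition type may differ, so treat the field names as illustrative.

```typescript
// A generic JSON-schema tool definition (OpenAI-style).
const weatherTool = {
  name: 'get_weather',
  description: 'Get the current weather for a city',
  parameters: {
    type: 'object',
    properties: {
      city: { type: 'string', description: 'City name, e.g. "Berlin"' },
      unit: { type: 'string', enum: ['celsius', 'fahrenheit'] },
    },
    required: ['city'],
  },
} as const;

// Local executor the agent dispatches to when a model calls the tool.
// The weather data here is a hard-coded stand-in.
function executeWeatherTool(args: { city: string; unit?: string }): string {
  return JSON.stringify({ city: args.city, tempC: 21, unit: args.unit ?? 'celsius' });
}
```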
Run it:

```bash
export OPENAI_API_KEY="sk-..."
npx tsx tool-calling.ts
```

File: nodejs/multi-provider.ts
Time: 10 minutes
What you'll learn:
- Mix models from different providers
- OpenAI + Anthropic + Groq
- Provider-specific configurations
- Cross-provider cost comparison
Example setup:

```typescript
const agent = new CascadeAgent({
  models: [
    { name: 'llama-3.1-8b-instant', provider: 'groq', cost: 0.00005 }, // Fast & cheap
    { name: 'gpt-4o', provider: 'openai', cost: 0.00625 },             // Quality
    { name: 'claude-3-5-sonnet', provider: 'anthropic', cost: 0.003 }, // Reasoning
  ],
});
```

Requirements:

```bash
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GROQ_API_KEY="gsk_..."
```

File: nodejs/reasoning-models.ts
Time: 10 minutes
What you'll learn:
- Use o1, o3-mini, Claude 3.7, DeepSeek-R1
- Extended thinking mode
- Chain-of-thought reasoning
- Zero configuration (auto-detects reasoning capabilities)
Supported models:
- OpenAI: o1, o1-mini, o3-mini
- Anthropic: claude-3-7-sonnet-20250219
- Ollama: deepseek-r1, deepseek-r1-distill (free local)
- vLLM: deepseek-r1 (self-hosted)
Example:

```typescript
const agent = new CascadeAgent({
  models: [
    { name: 'gpt-4o-mini', provider: 'openai', cost: 0.00015 },
    { name: 'o1', provider: 'openai', cost: 0.015 }, // Auto-detected
  ],
});

// Reasoning tokens automatically tracked
const result = await agent.run('Solve the Traveling Salesman Problem');
```

Track costs and manage budgets in production.
File: nodejs/cost-tracking.ts
Time: 15 minutes
What you'll learn:
- Real-time cost tracking across queries
- Per-model and per-provider cost analysis
- Budget limits and alerts
- Cost history and trends
Features:
- Tracks costs per query, model, and provider
- Manual tracking implementation (TypeScript doesn't have telemetry module yet)
- Budget warnings at configurable thresholds
- Cost breakdown by complexity
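Since the TypeScript package has no telemetry module yet, budget warnings are implemented by hand. A minimal sketch of that pattern follows; the `BudgetTracker` class and its 80% default warning threshold are illustrative, not part of the library.

```typescript
// Minimal budget tracker with a configurable warning threshold.
class BudgetTracker {
  private spent = 0;

  constructor(private limit: number, private warnAt = 0.8) {}

  // Record a query's cost and report the resulting budget status.
  record(cost: number): 'ok' | 'warning' | 'exceeded' {
    this.spent += cost;
    if (this.spent >= this.limit) return 'exceeded';
    if (this.spent >= this.limit * this.warnAt) return 'warning';
    return 'ok';
  }

  remaining(): number {
    return Math.max(0, this.limit - this.spent);
  }
}
```

In practice you would call `record(result.totalCost)` after each `agent.run()` and stop or alert on `'warning'`/`'exceeded'`.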
Run it:

```bash
export OPENAI_API_KEY="sk-..."
npx tsx cost-tracking.ts
```

Output example:
```
📊 Cost Breakdown:
  GPT-4o-mini: $0.000420 (5 queries)
  GPT-4o:      $0.004650 (3 queries)
  Total:       $0.005070

💰 Budget Status:
  Used: $0.005070 / $10.00 (0.05%)
  Remaining: $9.995
```
ML-based semantic validation using embeddings.
File: nodejs/semantic-quality.ts
Time: 15 minutes
What you'll learn:
- Semantic similarity scoring with BGE-small-en-v1.5
- Off-topic response detection
- Integration with cascade quality validation
- Request-scoped caching for performance
Features:
- Model: BGE-small-en-v1.5 (~40MB, auto-downloads)
- Runtime: CPU-based, fully local inference
- Latency: ~50-100ms per check (with caching)
- Caching: 50% latency reduction on cache hits
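Under the hood, semantic checking reduces to cosine similarity between the query and response embeddings. A minimal sketch, with plain `number[]` vectors standing in for BGE-small-en-v1.5 output:

```typescript
// Cosine similarity over embedding vectors -- the core of semantic
// quality checking.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// A response is flagged as off-topic when its similarity to the query
// embedding falls below the configured threshold (0.7 by default here).
const isOnTopic = (similarity: number, threshold = 0.7) => similarity >= threshold;
```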
Installation:

```bash
npm install @cascadeflow/ml @huggingface/transformers
```

Example:

```typescript
import { CascadeAgent } from '@cascadeflow/core';
import { UnifiedEmbeddingService } from '@cascadeflow/ml';

const embeddingService = await UnifiedEmbeddingService.getInstance();

const agent = new CascadeAgent({
  models: [
    { name: 'gpt-4o-mini', provider: 'openai', cost: 0.00015 },
    { name: 'gpt-4o', provider: 'openai', cost: 0.00625 },
  ],
  quality: {
    semanticThreshold: 0.7, // Reject if similarity < 70%
    embeddingService,
  },
});
```

Run it:

```bash
export OPENAI_API_KEY="sk-..."
npx tsx semantic-quality.ts
```

Deploy cascadeflow to production with enterprise patterns.
File: nodejs/production-patterns.ts
Time: 30 minutes
What you'll learn:
- Error handling and automatic retries
- Response caching for performance
- Rate limiting and throttling
- Monitoring and logging
- Cost tracking and budgets
- Failover strategies
Patterns covered:
- Exponential backoff retries
- In-memory and Redis caching
- Token bucket rate limiting
- Structured logging
- Budget enforcement
- Multi-provider fallback
Features:

```typescript
// Error handling with retries (exponential backoff)
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function withRetry<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (error) {
      if (i === maxRetries - 1) throw error;
      await sleep(Math.pow(2, i) * 1000); // Exponential backoff: 1s, 2s, 4s...
    }
  }
  throw new Error('unreachable');
}

// Response caching with TTL expiry
class ResponseCache {
  private cache = new Map<string, { value: string; expiry: number }>();
  private ttlMs = 3600 * 1000; // 1 hour

  get(key: string): string | null {
    const entry = this.cache.get(key);
    if (!entry || Date.now() > entry.expiry) return null;
    return entry.value;
  }

  set(key: string, value: string): void {
    this.cache.set(key, { value, expiry: Date.now() + this.ttlMs });
  }
}

// Token bucket rate limiting
class RateLimiter {
  private lastRefill = Date.now();

  constructor(
    private tokens: number,
    private maxTokens: number,
    private refillRate: number, // tokens per second
  ) {}

  tryAcquire(): boolean {
    const elapsed = (Date.now() - this.lastRefill) / 1000;
    this.tokens = Math.min(this.maxTokens, this.tokens + elapsed * this.refillRate);
    this.lastRefill = Date.now();
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

Run it:

```bash
export OPENAI_API_KEY="sk-..."
npx tsx production-patterns.ts
```

- ✅ Run basic-usage.ts - Understand core concepts
- ✅ Read the code comments - Learn patterns
- ✅ Try different queries - See routing decisions
Key concepts:
- Cascading = cheap model first, escalate if needed
- Draft accepted = money saved ✅
- Draft rejected = quality ensured ✅
- PreRouter detects complexity before calling models
- ✅ Run tool-calling.ts - Learn tool usage
- ✅ Run multi-provider.ts - Mix providers
- ✅ Run reasoning-models.ts - Try o1/o3
Key concepts:
- Type-safe tool definitions
- Universal tool format
- Cross-provider compatibility
- Reasoning token tracking
- ✅ Run cost-tracking.ts - Learn budget tracking
- ✅ Implement custom budget logic
- ✅ Read the Cost Tracking Guide
Key concepts:
- Token-based pricing
- Per-model breakdown
- Budget alerts
- Cost optimization
- ✅ Run production-patterns.ts - Enterprise patterns
- ✅ Run semantic-quality.ts - ML validation
- ✅ Read the Production Guide
Key concepts:
- Error handling
- Rate limiting
- Caching
- Monitoring
```bash
# Install cascadeflow
npm install @cascadeflow/core

# For semantic quality example
npm install @cascadeflow/ml @huggingface/transformers

# Install peer dependencies for providers you'll use
npm install openai              # OpenAI
npm install @anthropic-ai/sdk   # Anthropic
npm install groq-sdk            # Groq
```

```bash
# OpenAI (most examples)
export OPENAI_API_KEY="sk-..."

# Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."

# Groq (free, fast)
export GROQ_API_KEY="gsk_..."

# Together AI
export TOGETHER_API_KEY="..."

# HuggingFace
export HF_TOKEN="hf_..."
```

```bash
# Navigate to examples directory
cd packages/core/examples/nodejs

# Run with tsx (recommended)
npx tsx basic-usage.ts
npx tsx tool-calling.ts
npx tsx cost-tracking.ts

# Or install tsx globally
npm install -g tsx
tsx basic-usage.ts
```

```typescript
import { CascadeAgent } from '@cascadeflow/core';

// Recommended: Claude Haiku + GPT-5
const agent = new CascadeAgent({
  models: [
    { name: 'claude-haiku-4-5-20251001', provider: 'anthropic', cost: 0.0008 },
    { name: 'gpt-5', provider: 'openai', cost: 0.00125 },
  ],
});

const result = await agent.run('What is TypeScript?');
console.log(`Cost: $${result.totalCost}, Savings: ${result.savingsPercentage}%`);
```

Note: GPT-5 availability depends on your OpenAI account tier. The cascade works immediately - Claude Haiku handles 75% of queries!
```typescript
const agent = new CascadeAgent({
  models: [
    { name: 'gpt-4o-mini', provider: 'openai', cost: 0.00015 },
    { name: 'gpt-4o', provider: 'openai', cost: 0.00625 },
  ],
});
```

```typescript
import { CascadeAgent, ModelConfig } from '@cascadeflow/core';

const models: ModelConfig[] = [
  {
    name: 'claude-haiku-4-5-20251001',
    provider: 'anthropic',
    cost: 0.0008,
    apiKey: process.env.ANTHROPIC_API_KEY,
  },
  {
    name: 'gpt-4o',
    provider: 'openai',
    cost: 0.00625,
    apiKey: process.env.OPENAI_API_KEY,
  },
];

const agent = new CascadeAgent({
  models,
  quality: {
    threshold: 0.7, // Quality configured at agent level
    requireMinimumTokens: 10,
  },
});

const result = await agent.run('Explain quantum computing');
console.log(result.content);
console.log(`Cost: $${result.totalCost}, Saved: ${result.savingsPercentage}%`);
```

All 7 providers work in both Node.js and browser:
- ✅ OpenAI
- ✅ Anthropic
- ✅ Groq
- ✅ Together AI
- ✅ Ollama (local)
- ✅ HuggingFace
- ✅ vLLM (local)
cascadeflow automatically detects your runtime environment!
Browser Example: See browser/vercel-edge/ for edge function deployment.
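Runtime auto-detection typically boils down to feature checks like the following sketch. This is illustrative, not cascadeflow's actual source; the function name and the `'edge'` fallback label are assumptions.

```typescript
// Detect the current JavaScript runtime by probing for globals.
function detectRuntime(): 'browser' | 'node' | 'edge' {
  if (typeof window !== 'undefined' && typeof window.document !== 'undefined') {
    return 'browser';
  }
  if (typeof process !== 'undefined' && process.versions?.node) {
    return 'node';
  }
  // Neither a browser nor Node: assume an edge/worker environment.
  return 'edge';
}
```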
API key errors

```bash
# Check if set
echo $OPENAI_API_KEY

# Set it
export OPENAI_API_KEY="sk-..."

# Windows
set OPENAI_API_KEY=sk-...

# Or use .env file
echo "OPENAI_API_KEY=sk-..." > .env
```

Import errors

```bash
# Install core package
npm install @cascadeflow/core

# Install peer dependencies
npm install openai @anthropic-ai/sdk groq-sdk

# For semantic quality
npm install @cascadeflow/ml @huggingface/transformers
```

Examples run but show errors

```bash
# Check Node.js version (18+ required)
node --version

# Reinstall
rm -rf node_modules package-lock.json
npm install
```

tsx command not found

```bash
# Install tsx
npm install -g tsx

# Or use npx
npx tsx basic-usage.ts
```

Semantic quality model download fails

```bash
# Model downloads automatically on first run
# If it fails, check internet connection and try again

# Manual cache clear
rm -rf ~/.cache/huggingface
```

Begin with basic-usage.ts before advanced examples.
All examples are heavily commented. Read through to understand patterns.
Token-Based Pricing:
- Input and output tokens priced differently
- gpt-4o: $0.0025 input, $0.010 output per 1K tokens
- Actual costs depend on query/response length
Cost Savings:
- Draft accepted = only cheap model used (big savings!)
- Draft rejected = both models used (quality ensured)
- Direct routing = only expensive model used (no savings)
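The pricing and savings arithmetic above can be made concrete. `queryCost` below uses the gpt-4o per-1K rates quoted earlier as defaults; `savingsPercentage` compares actual spend against an expensive-model-only baseline. Both are plain helper sketches, not library APIs.

```typescript
// Token-based pricing: input and output tokens bill at different
// per-1K rates (defaults are the gpt-4o figures above).
function queryCost(
  inputTokens: number,
  outputTokens: number,
  inputPer1K = 0.0025,
  outputPer1K = 0.010,
): number {
  return (inputTokens / 1000) * inputPer1K + (outputTokens / 1000) * outputPer1K;
}

// Savings compare the cascade's actual spend to what the expensive
// model alone would have cost for the same workload.
function savingsPercentage(actualCost: number, baselineCost: number): number {
  return ((baselineCost - actualCost) / baselineCost) * 100;
}
```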
Quality Validation:
- Logprobs-based (default)
- Semantic similarity with embeddings (optional)
- Custom validators supported
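A custom validator is essentially a predicate over the draft response. The sketch below shows the general shape; the `Validator` type and the `minimumLength`/`noRefusal` checks are illustrative names, and cascadeflow's actual validator interface may differ.

```typescript
// A validator is any predicate over the draft response text.
type Validator = (response: string) => boolean;

// Reject very short answers (fewer than 10 words).
const minimumLength: Validator = (r) => r.trim().split(/\s+/).length >= 10;

// Reject responses that open with a refusal.
const noRefusal: Validator = (r) => !/^(i can't|i cannot|sorry)/i.test(r.trim());

// A draft passes only if every validator accepts it.
function passesAll(response: string, validators: Validator[]): boolean {
  return validators.every((v) => v(response));
}
```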
```typescript
const result = await agent.run(query);

// Access result properties
console.log(`Model: ${result.modelUsed}`);
console.log(`Cost: $${result.totalCost}`);
console.log(`Savings: ${result.savingsPercentage}%`);
console.log(`Latency: ${result.latencyMs}ms`);
console.log(`Cascaded: ${result.cascaded}`);
console.log(`Draft Accepted: ${result.draftAccepted}`);
```

```typescript
// Track costs manually
const costs = {
  total: 0,
  byModel: {} as Record<string, number>,
  queries: 0,
};

for (const query of queries) {
  const result = await agent.run(query);
  costs.total += result.totalCost;
  costs.byModel[result.modelUsed] =
    (costs.byModel[result.modelUsed] || 0) + result.totalCost;
  costs.queries++;
}

console.log(`Average cost per query: $${(costs.total / costs.queries).toFixed(6)}`);
```

- Quick Start - 5-minute introduction
- Providers Guide - Configure AI providers
- Tools Guide - Function calling
- Cost Tracking - Budget management
- TypeScript Quickstart - TypeScript-specific setup
- Production Guide - Enterprise deployment
- Performance Guide - Optimization
- Custom Cascade - Custom routing
- Custom Validation - Quality control
- Browser Cascading - Edge/browser deployment
Have a great use case? Contribute an example!
```typescript
/**
 * Your Example - Brief Description
 *
 * What it demonstrates:
 * - Feature 1
 * - Feature 2
 *
 * Requirements:
 * - npm install @cascadeflow/core
 * - export OPENAI_API_KEY="..."
 *
 * Setup:
 *   npm install @cascadeflow/core
 *   export OPENAI_API_KEY="..."
 *   npx tsx your-example.ts
 *
 * Expected Results:
 * Description of output
 */
import { CascadeAgent } from '@cascadeflow/core';

async function main() {
  console.log('='.repeat(80));
  console.log('YOUR EXAMPLE TITLE');
  console.log('='.repeat(80));

  // Your code here

  console.log('\nKEY TAKEAWAYS:');
  console.log('- Takeaway 1');
  console.log('- Takeaway 2');
}

main().catch(console.error);
```

See CONTRIBUTING.md for guidelines.
📖 Complete Guides • 🛠️ Tools Guide • 💰 Cost Tracking Guide • 🏭 Production Guide • 📘 TypeScript Quickstart

💬 GitHub Discussions - Ask questions • 🐛 GitHub Issues - Report bugs • 💡 Use the "question" label for general questions
Core (4): Basic usage, tool calling, multi-provider, reasoning models
Cost Management (1): Cost tracking
Quality & Validation (1): Semantic quality with ML
Production (1): Production patterns
- ✅ 7 TypeScript examples (~2,500+ lines of code)
- ✅ Comprehensive README (this file)
- ✅ Individual example READMEs (nodejs/README.md, browser/README.md)
- ✅ Full TypeScript definitions with IDE autocomplete
- ✅ 10+ comprehensive guides in main docs
Essential Concepts:
- ✅ Draft accepted = money saved
- ✅ Draft rejected = quality ensured
- ✅ PreRouter detects complexity before cascade
- ✅ Token-based pricing (input/output split)
- ✅ Semantic quality with embeddings
- ✅ Universal tool format
Production Ready:
- ✅ Error handling
- ✅ Rate limiting
- ✅ Caching
- ✅ Budget management
- ✅ ML-based validation
💰 Save 40-85% on AI costs with intelligent cascading! 🚀
View All Documentation • Python Examples • TypeScript Examples • GitHub Discussions