
Model Comparison

Schatzi AI provides access to 13+ active state-of-the-art AI models from leading providers. All models run on Swiss infrastructure, ensuring your data never leaves Switzerland.

This guide helps you choose the right model for your specific needs, understand pricing, and optimize costs.


Quick Model Selector

Choose by use case:

Your Need | Recommended Model
Daily chat & email | Chat & Document Analysis - Medium
Document analysis | Chat & Document Analysis - Medium
Swiss compliance required | Apertus Swiss LLM
Complex reasoning | Reasoning & Problem Solving - Xtra Large
Web search & research | Search, Chat & Analysis - Small
Multilingual chat | Llama 3.3 Multi-lingual - Medium
Budget-conscious | Apertus Swiss LLM - Small



Complete Model Reference

Chat & General Purpose Models

Chat & Document Analysis - Medium

  • Parameters: 24 Billion
  • Context Window: 128,000 tokens (~96,000 words)

Capabilities:

  • Versatile multimodal model
  • Vision and image analysis
  • Conversational agents
  • Strong contextual understanding
  • All major European languages

Pricing:

  • Input: ... per million tokens
  • Output: ... per million tokens

Best For:

  • Daily business communications
  • Email drafting and responses
  • General document analysis
  • Customer service responses
  • Quick content generation

Search, Chat & Analysis - Small

  • Parameters: ~17 Billion active (Llama 4 Scout, Mixture of Experts)
  • Context Window: Variable

Capabilities:

  • Optimized for web search and chat
  • Suited to creative work and content creation, including storytelling
  • Web search integration

Pricing:

  • Input: ... per million tokens
  • Output: ... per million tokens

Best For:

  • Web search and research tasks
  • Content creation and storytelling
  • Quick information retrieval

Chat & Document Analysis - Xtra Large

  • Parameters: 235 Billion (Mixture of Experts - 22B active)
  • Context Window: 128,000 tokens (~96,000 words)

Capabilities:

  • Very large-scale model, rivaling GPT-4 or Claude 3 Opus across a broad range of complex tasks
  • Advanced multilingual capabilities
  • Reasoning mode can be enabled to dynamically tailor responses to the context and complexity of queries
  • Document analysis excellence

Pricing:

  • Input: ... per million tokens
  • Output: ... per million tokens

Best For:

  • Complex document analysis
  • High-quality content generation
  • Advanced multilingual tasks
  • Tasks requiring GPT-4-level performance
  • Dynamic reasoning tasks

Why Choose This Model:

  • Premium performance at competitive pricing
  • Advanced reasoning capabilities
  • Multilingual excellence

Swiss LLM Models (🇨🇭 AI Act Compliant)

Apertus Swiss LLM - Large (70B)

  • Parameters: 70 Billion
  • Context Window: 65,536 tokens (~49,000 words)
  • Max Output: 16,384 tokens

Capabilities:

  • Fully documented and transparent
  • AI Act compliant
  • Respectful of privacy and intellectual property
  • Performance on par with market leaders
  • Optimized for German, French, Italian, English

Pricing:

  • Input: ... per million tokens
  • Output: ... per million tokens

Best For:

  • Government agencies
  • Swiss financial services
  • R&D teams requiring transparency
  • Multilingual European services
  • Compliance-heavy industries

Why Choose Apertus:

  • Swiss-developed and Swiss-hosted
  • Fully AI Act compliant
  • Complete transparency and documentation
  • Privacy-first architecture
  • Multilingual European focus

Apertus Swiss LLM - Small (8B)

  • Parameters: 8 Billion
  • Context Window: 32,768 tokens (~24,500 words)
  • Max Output: 8,192 tokens

Capabilities:

  • Optimized for multilingual dialogue
  • Fast response times
  • Efficient for routine tasks
  • Same compliance as Large model

Pricing:

  • Input: ... per million tokens
  • Output: ... per million tokens

Best For:

  • High-volume, routine conversations
  • Budget-conscious Swiss compliance needs
  • Quick responses with Swiss data protection
  • Multilingual customer support

Cost Comparison:

  • Much cheaper than Apertus Large
  • Perfect for high-frequency, simple tasks

Reasoning & Problem-Solving Models

Reasoning & Problem Solving - Xtra Large

  • Parameters: 670 Billion
  • Context Window: 65,536 tokens
  • Max Output: 16,384 tokens

Capabilities:

  • Advanced reasoning chat completions
  • Complex problem-solving
  • Multi-step logical analysis
  • Strategic planning
  • Deep technical understanding

Pricing:

  • Input: ... per million tokens
  • Output: ... per million tokens
  • Note: Premium pricing for advanced reasoning

Best For:

  • Complex strategic analysis
  • Advanced research tasks
  • Multi-step problem decomposition
  • Technical architecture decisions
  • Financial modeling and forecasting

When to Use:

  • Tasks requiring deep reasoning
  • When accuracy is critical
  • Complex business decisions
  • Advanced technical problems

Premium Model

This is our most expensive model. Use for complex reasoning tasks where the quality justifies the higher cost.


Reasoning & Problem Solving - Medium

  • Parameters: 32 Billion
  • Context Window: 32,768 tokens (~24,500 words)
  • Max Output: 8,192 tokens

Capabilities:

  • Optimized for thinking and reasoning
  • Strong problem-solving abilities
  • Cost-effective reasoning model
  • Multi-step analysis

Pricing:

  • Input: ... per million tokens
  • Output: ... per million tokens

Best For:

  • Mid-complexity reasoning tasks
  • Problem-solving at lower cost than XL
  • Logical analysis
  • Structured thinking tasks

Cost Comparison:

  • Much cheaper than Reasoning - XL
  • Better reasoning than general chat models
  • Balanced performance-to-cost ratio

Fast Reasoning & Problem Solving - Small

  • Parameters: 8 Billion
  • Context Window: 32,768 tokens (~24,500 words)
  • Max Output: 8,192 tokens

Capabilities:

  • Optimized for thinking and reasoning
  • Fast response times
  • Efficient for quick problem-solving
  • Function calling support

Pricing:

  • Input: ... per million tokens
  • Output: ... per million tokens

Best For:

  • High-volume reasoning tasks
  • Quick logical analysis
  • Budget-conscious problem-solving
  • Rapid iteration on solutions

Cost Comparison:

  • Most affordable reasoning model
  • Much cheaper than Reasoning - XL
  • Perfect for high-frequency reasoning tasks

Reasoning & Agent tasks - Xtra Large

  • Parameters: 120 Billion
  • Context Window: 32,768 tokens (~24,500 words)
  • Max Output: 8,192 tokens

Capabilities:

  • Optimized for powerful reasoning
  • Agentic task execution
  • Versatile developer use cases
  • Function calling support
  • Data analysis capabilities

Pricing:

  • Input: ... per million tokens
  • Output: ... per million tokens

Best For:

  • Agentic workflows
  • Automated task execution
  • Complex data analysis
  • Developer tools and automation
  • Multi-step reasoning at lower cost

Why Choose This Model:

  • Excellent value for advanced reasoning
  • Strong agent capabilities
  • Function calling for automation
  • More affordable than premium reasoning models

Kimi K2 - Chat

  • Context Window: Variable

Capabilities:

  • Optimized for multilingual dialogue use cases
  • Strong conversational abilities

Pricing:

  • Input: ... per million tokens
  • Output: ... per million tokens

Best For:

  • General chat and conversation
  • Multilingual dialogue

Vision & Document Analysis Models

Document Analysis - Small

  • Parameters: 12 Billion
  • Context Window: 32,768 tokens (~24,500 words)
  • Max Output: 8,192 tokens
  • Vision: ✅ Supports image analysis

Capabilities:

  • Optimized for handling text and image input
  • Multimodal - processes both text and images
  • Document analysis with visual elements
  • Chart and diagram interpretation
  • Screenshot analysis

Pricing:

  • Input: ... per million tokens
  • Output: ... per million tokens

Best For:

  • Document analysis with images
  • PDF processing with charts/diagrams
  • Screenshot interpretation
  • Visual content analysis
  • Forms and invoice processing

Document Analysis - Xtra Small

  • Parameters: 2 Billion
  • Context Window: 16,384 tokens (~12,000 words)
  • Max Output: 4,096 tokens
  • Vision: ✅ Supports image analysis

Capabilities:

  • Compact and efficient vision-language model
  • Fast processing of images and text
  • Budget-friendly multimodal option
  • Quick visual analysis

Pricing:

  • Input: ... per million tokens
  • Output: ... per million tokens

Best For:

  • High-volume image processing
  • Quick visual QA
  • Simple document scans
  • Receipt and form analysis
  • Budget-conscious vision tasks

Cost Comparison:

  • Most affordable vision model
  • Perfect for simple visual tasks at scale

Multilingual Models

Llama 3.3 Multi-lingual - Medium

  • Parameters: 70 Billion
  • Context Window: 131,072 tokens (~98,000 words)
  • Max Output: 8,192 tokens

Capabilities:

  • Optimized for multilingual dialogue
  • Strong European language support
  • Natural conversation flow
  • Cultural context awareness

Pricing:

  • Input: ... per million tokens
  • Output: ... per million tokens

Best For:

  • Multilingual customer support
  • Cross-border communication
  • Translation with context
  • International business

Supported Languages:

  • German, French, Italian (Swiss variants)
  • English, Spanish, Portuguese
  • Dutch, Polish, Czech
  • And 50+ more languages

Complete Pricing Overview

All Active Models - Price Comparison


Best Value Models

Overall Best Value:

  • Chat & Document Analysis - Xtra Large
    • GPT-4-level performance at competitive pricing
    • Great for complex document analysis

Best Budget Option:

  • Fast Reasoning & Problem Solving - Small
    • Most affordable reasoning model
    • Perfect for high-frequency use

Best Swiss Compliance:

  • Apertus Swiss LLM - Small
    • Full AI Act compliance
    • More affordable than Apertus Large

Best Vision Model:

  • Document Analysis - Xtra Small
    • Most affordable vision capabilities
    • Perfect for document scanning at scale

Cost Estimation

Typical Task Types

Task Type | Token Usage | Recommended Model
Email response | 500 input + 300 output | Chat & Document Analysis - Medium
10-page document summary | 10K input + 1K output | Chat & Document Analysis - Medium
Contract analysis (30 pages) | 30K input + 2K output | Apertus Swiss LLM - Large
Complex reasoning task | 5K input + 3K output | Reasoning & Problem Solving - Xtra Large
Multilingual chat (hour) | 15K input + 10K output | Llama 3.3 Multi-lingual - Medium

Cost Optimization

Choose the right model for each task. Use smaller, cheaper models for routine work and reserve premium models for complex analysis. See the pricing table above for current rates.
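The token counts listed under Typical Task Types turn into a cost once the per-million-token rates are known. The following is a minimal sketch of that arithmetic only. The function name `estimate_cost` and both example prices are hypothetical placeholders, not actual Schatzi AI rates or part of any Schatzi AI API; substitute the current rates from your dashboard.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_million: float,
                  output_price_per_million: float) -> float:
    """Return the cost of one request, in the same currency as the prices."""
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost


# Hypothetical prices, for illustration only (assumed, not real rates).
EXAMPLE_INPUT_PRICE = 1.00   # per million input tokens
EXAMPLE_OUTPUT_PRICE = 2.00  # per million output tokens

# "10-page document summary" task: 10K input + 1K output tokens.
cost = estimate_cost(10_000, 1_000, EXAMPLE_INPUT_PRICE, EXAMPLE_OUTPUT_PRICE)
print(f"Estimated cost per summary: {cost:.4f}")
```

Multiplying the per-request result by the expected number of requests per month gives a rough monthly figure for comparing two models on the same workload.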


Choosing the Right Model

Decision Framework

1. Task Complexity

  • Simple/Routine → Apertus Small, Chat & Document Medium
  • Complex/Technical → Apertus Large, Chat & Document Xtra Large
  • Advanced Reasoning → Reasoning - XL

2. Language Requirements

  • German/French/Italian primary → Apertus models
  • Multilingual → Llama 3.3
  • English-focused → Chat & Document Medium

3. Compliance Needs

  • AI Act compliance required → Apertus models
  • Swiss data sovereignty → All models (Swiss-hosted)
  • Maximum transparency → Apertus models

4. Budget Constraints

  • Minimal cost → Apertus Swiss LLM - Small
  • Balanced → Chat & Document Analysis - Medium
  • Premium quality → Reasoning & Problem Solving - Xtra Large
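The criteria above can be combined in more than one way. As a rough sketch, assuming compliance is checked first and task complexity second (an ordering chosen here for illustration, not one the guide prescribes), a selector could look like this. The function `choose_model` and its parameters are hypothetical, and the returned strings are the display names used on this page, not API identifiers.

```python
def choose_model(needs_ai_act_compliance: bool, complexity: str) -> str:
    """Pick a model display name.

    complexity must be "simple", "complex", or "advanced_reasoning".
    """
    if complexity not in ("simple", "complex", "advanced_reasoning"):
        raise ValueError(f"unknown complexity: {complexity!r}")

    if needs_ai_act_compliance:
        # This guide lists only the Apertus models as AI Act compliant.
        if complexity == "simple":
            return "Apertus Swiss LLM - Small"
        return "Apertus Swiss LLM - Large"

    if complexity == "advanced_reasoning":
        return "Reasoning & Problem Solving - Xtra Large"
    if complexity == "complex":
        return "Chat & Document Analysis - Xtra Large"
    return "Chat & Document Analysis - Medium"


print(choose_model(needs_ai_act_compliance=True, complexity="complex"))
```

Language and budget could be added as further parameters in the same style; they are left out to keep the sketch short.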

Model Comparison Matrix

Feature | Chat & Doc Medium | Apertus Large | Apertus Small | Reasoning - XL | Llama 3.3
Swiss LLM
AI Act Compliant⚠️⚠️⚠️
Multilingual✅✅✅✅✅✅
Reasoning⭐⭐⭐⭐⭐⭐⭐⭐⭐⭐⭐⭐⭐⭐⭐⭐⭐
Coding⭐⭐⭐⭐⭐⭐⭐⭐⭐⭐⭐⭐⭐⭐
Speed⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡⚡
Context Window | 128K | 65K | 32K | 65K | 131K
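The context window figures can be used to check whether a document is likely to fit before you pick a model. The sketch below assumes roughly 0.75 words per token, the ratio implied by this guide's own conversions (for example 128,000 tokens ≈ 96,000 words); real token counts vary by language and tokenizer. The helper names are hypothetical.

```python
import math

# Context windows in tokens, taken from the model reference on this page.
CONTEXT_WINDOW_TOKENS = {
    "Chat & Document Analysis - Medium": 128_000,
    "Apertus Swiss LLM - Large": 65_536,
    "Apertus Swiss LLM - Small": 32_768,
    "Reasoning & Problem Solving - Xtra Large": 65_536,
    "Llama 3.3 Multi-lingual - Medium": 131_072,
}

WORDS_PER_TOKEN = 0.75  # approximation implied by the guide's conversions


def estimated_tokens(word_count: int) -> int:
    """Estimate how many tokens a text of word_count words will use."""
    return math.ceil(word_count / WORDS_PER_TOKEN)


def models_that_fit(word_count: int, reserved_output_tokens: int = 0) -> list[str]:
    """List models whose window holds the text plus room for the reply."""
    needed = estimated_tokens(word_count) + reserved_output_tokens
    return [name for name, window in CONTEXT_WINDOW_TOKENS.items()
            if window >= needed]


# A 40,000-word document, leaving 2,000 tokens for the answer.
for name in models_that_fit(40_000, reserved_output_tokens=2_000):
    print(name)
```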

🎓 Best Practices

Cost Optimization Strategies

1. Match Model to Task

✅ Good: Use Chat & Document Medium for email responses
❌ Bad: Use Reasoning - XL for simple emails
Savings: 17x cost reduction

2. Optimize Prompt Length

❌ Inefficient: Long, repetitive context
✅ Efficient: Concise, specific prompts
Savings: 30-50% token reduction

3. Batch Similar Tasks

✅ Good: Process 10 documents in one conversation
❌ Bad: Start new chat for each document
Savings: Reduce redundant context

4. Use Smaller Models When Possible

For routine tasks: Apertus Small
For complex analysis: Apertus Large
For deep reasoning: Reasoning - XL
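The three tiers above can be treated as a ladder: start on the cheapest model and move up only when a task needs it. A minimal sketch under that assumption follows. The names `ESCALATION_LADDER`, `next_tier`, and `assign_models` are hypothetical, and the `is_complex` check is supplied by the caller, since this guide does not define how complexity is measured.

```python
from typing import Callable, Optional

# Tiers from cheapest to most expensive, as described in this guide.
ESCALATION_LADDER = [
    "Apertus Swiss LLM - Small",                 # routine tasks
    "Apertus Swiss LLM - Large",                 # complex analysis
    "Reasoning & Problem Solving - Xtra Large",  # deep reasoning
]


def next_tier(current: str) -> Optional[str]:
    """Return the next model up the ladder, or None at the top."""
    index = ESCALATION_LADDER.index(current)
    if index + 1 < len(ESCALATION_LADDER):
        return ESCALATION_LADDER[index + 1]
    return None


def assign_models(tasks: list[str],
                  is_complex: Callable[[str], bool]) -> dict[str, str]:
    """Keep routine tasks on the cheapest tier; move complex ones up one tier."""
    cheapest = ESCALATION_LADDER[0]
    assignments = {}
    for task in tasks:
        if is_complex(task):
            assignments[task] = next_tier(cheapest) or cheapest
        else:
            assignments[task] = cheapest
    return assignments


plan = assign_models(["FAQ reply", "contract review"],
                     is_complex=lambda task: "contract" in task)
print(plan)
```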

Quality Optimization

When to Use Premium Models:

  • Critical business decisions
  • Complex technical problems
  • High-stakes legal/financial analysis
  • Advanced reasoning requirements

When Budget Models Are Sufficient:

  • Routine correspondence
  • Simple summaries
  • Basic translations
  • FAQ responses
  • Content drafts (can be refined with premium model)

Model Availability

Currently Active Models

Chat & General Purpose:

  • Chat & Document Analysis - Medium (Mistral 24B)
  • Search, Chat & Analysis - Small (Llama 4 Scout 17B)
  • Chat & Document Analysis - Xtra Large (Qwen3-VL 235B)
  • Kimi K2 - Chat

Swiss LLM (AI Act Compliant):

  • Apertus Swiss LLM - Large (70B)
  • Apertus Swiss LLM - Small (8B)

Reasoning & Problem-Solving:

  • Reasoning & Problem Solving - Xtra Large (DeepSeek R1 670B)
  • Reasoning & Problem Solving - Medium (QwQ 32B)
  • Fast Reasoning & Problem Solving - Small (Qwen3 8B)
  • Reasoning & Agent tasks - Xtra Large (GPT-OSS 120B)

Vision & Document Analysis:

  • Document Analysis - Small (Gemma 12B) - Vision enabled
  • Document Analysis - Xtra Small (Granite 2B) - Vision enabled

Multilingual:

  • Llama 3.3 Multi-lingual - Medium (70B)

Model Updates

We regularly add new models and update existing ones. Check your dashboard for the latest available models and current pricing.



FAQ

Q: Can I switch models mid-conversation?
A: Yes. You can change models at any time, and the conversation context carries over. Very long contexts may be truncated when you switch to a model with a smaller context window.

Q: Which model is best for Swiss legal documents?
A: Apertus Swiss LLM (Large or Small). Both are AI Act compliant and optimized for German, French, Italian, and English.

Q: What's the cheapest way to process 100 documents?
A: Use Apertus Swiss LLM - Small for routine processing, and escalate to Apertus Large or Chat & Document Analysis - Medium for complex analysis. Check the pricing table above for current rates.

Q: Do all models support document upload?
A: Yes, all active models support document analysis. Some models also have vision capabilities for image analysis.

Q: How do I track which model costs what?
A: Your usage dashboard shows token usage and costs broken down by model. Navigate to Account → Usage & Billing.

Q: Can I set spending limits per model?
A: Not yet, but you can set an overall monthly spending limit. Model-specific limits are on our roadmap.
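To picture what truncation on a model switch can mean: this page does not describe how it is actually performed, so the sketch below simply assumes the oldest messages are dropped first until the rest fit the new model's context window. The function `truncate_history` is hypothetical and only illustrates the idea.

```python
def truncate_history(message_tokens: list[int], context_window: int) -> list[int]:
    """Drop the oldest messages until the remaining ones fit the window.

    message_tokens holds the token count of each message, oldest first.
    Dropping oldest-first is an assumption, not documented behavior.
    """
    kept = list(message_tokens)
    while kept and sum(kept) > context_window:
        kept.pop(0)  # remove the oldest remaining message
    return kept


# A 35,000-token conversation moved to a model with a 32,768-token window.
history = [20_000, 10_000, 5_000]
print(truncate_history(history, context_window=32_768))
```

If early messages contain instructions you still need, restating them after switching to a smaller model is a safe habit.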


Get Started

Ready to choose your model?

  1. Log in to your Schatzi AI account
  2. Start a new chat in OpenWebUI
  3. Click the model selector at the top
  4. Choose the right model for your task
  5. Start chatting!

Need help? Contact Support | View Pricing Plans