AI Provider Configuration
Eneo is model-agnostic and supports multiple AI providers through a unified routing layer. This guide covers architecture, configuration, and credential management for each supported provider.
Overview
Eneo supports the following AI providers:
- OpenAI (GPT-4o, GPT-4o-mini, GPT-5)
- Anthropic (Claude 3.5, Claude 3.7, Claude Haiku 4.5)
- Azure OpenAI Service (GPT-4o, GPT-5 variants)
- Google Gemini (Gemini 1.5, Gemini 2.0)
- Regional Providers (Berget AI, GDM - Swedish data residency)
- Self-hosted (vLLM, compatible OpenAI API servers)
You can configure multiple providers simultaneously and switch between them based on your needs.
Architecture
Eneo routes all AI requests through LiteLLM, a unified API layer that supports 100+ provider APIs with consistent interfaces for completions, embeddings, and streaming.
How Provider Routing Works
The routing system uses model name prefixes to determine the target provider:
| Prefix | Provider | Example |
|---|---|---|
| `azure/` | Azure OpenAI | `azure/gpt-4o` |
| `berget/` | Berget AI | `berget/llama-3-70b` |
| `gdm/` | GDM (Swedish) | `gdm/mistral-large` |
| `openai/` | OpenAI Direct | `openai/gpt-4o` |
| (no prefix) | Default provider | `gpt-4o-mini` |
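The prefix convention above can be illustrated with a small sketch. This is our own illustrative helper, not Eneo's actual routing code, which lives in the LiteLLM adapter layer:

```python
# Illustrative sketch of prefix-based provider routing (not Eneo's real code).
KNOWN_PREFIXES = {"azure", "berget", "gdm", "openai"}

def route_model(model_name: str) -> tuple[str, str]:
    """Split a model name into (provider, model). Names without a
    known prefix fall through to the default provider."""
    prefix, sep, rest = model_name.partition("/")
    if sep and prefix in KNOWN_PREFIXES:
        return prefix, rest
    return "default", model_name

print(route_model("azure/gpt-4o"))   # ('azure', 'gpt-4o')
print(route_model("gpt-4o-mini"))    # ('default', 'gpt-4o-mini')
```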
LiteLLM Adapter (Recommended)
The LiteLLMModelAdapter is the recommended approach for all new model integrations:
```yaml
# Models with litellm_model_name field use LiteLLM routing
completion_models:
  - model_name: "gpt-4o-mini"
    litellm_model_name: "azure/gpt-4o-mini"  # Routes via LiteLLM
    context_window: 128000
```

Benefits:
- Unified API for 100+ providers
- Automatic retry and fallback logic
- Consistent streaming support
- Multi-tenant credential injection
Legacy Adapters (Deprecated): The following adapters are deprecated and maintained only for backward compatibility: OpenAIModelAdapter, ClaudeModelAdapter, AzureOpenAIModelAdapter, MistralModelAdapter, VLLMModelAdapter, OVHCloudModelAdapter. New models should use LiteLLM routing via the litellm_model_name field.
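The selection rule described above — LiteLLM routing when `litellm_model_name` is present, legacy adapters otherwise — can be sketched as follows (a hypothetical helper for illustration; the real dispatch lives in Eneo's backend):

```python
# Hypothetical sketch of adapter selection based on the model entry,
# mirroring the rule described in the text (not Eneo's actual code).
def select_adapter(model_entry: dict) -> str:
    if model_entry.get("litellm_model_name"):
        return "LiteLLMModelAdapter"
    # Legacy fallback; deprecated for new model integrations
    return "legacy"

print(select_adapter({"model_name": "gpt-4o-mini",
                      "litellm_model_name": "azure/gpt-4o-mini"}))  # LiteLLMModelAdapter
print(select_adapter({"model_name": "old-model"}))                  # legacy
```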
Model Configuration
Models are currently defined in ai_models.yml. This configuration approach will become more flexible in future updates.
```yaml
# Example from ai_models.yml
completion_models:
  - model_name: "gpt-5-azure"
    litellm_model_name: "azure/gpt-5"
    context_window: 400000
    supports_json_output: true
    supports_tool_use: true
    supports_vision: true
    supports_reasoning: true
    description: "GPT-5 on Azure - Advanced reasoning with 400K context"
```

For Contributors: To add new models, reference the GPT-5 implementation in `ai_models.yml`. Models with `litellm_model_name` automatically use the LiteLLM adapter, while models without it fall back to legacy adapters (not recommended for new models).
OpenAI
OpenAI provides access to GPT models including GPT-4o, GPT-4o-mini, and the latest GPT-5 series.
Configuration
Environment Variables
Add to your `env_backend.env`:

```bash
OPENAI_API_KEY=sk-...your-api-key
```

Then restart the backend:

```bash
docker compose restart backend
```

Available Models
| Model | Context | Features | Status |
|---|---|---|---|
| `gpt-4o` | 128K | Vision, Tool Use, JSON | Recommended |
| `gpt-4o-mini` | 128K | Vision, Tool Use, JSON | Recommended (economical) |
| `gpt-4-turbo` | 128K | Vision, Tool Use | Supported |
| `gpt-3.5-turbo` | 16K | Basic | Deprecated |
gpt-3.5-turbo is deprecated and will be removed in a future update. Migrate to gpt-4o-mini for a cost-effective alternative with better performance.
Cost Considerations
- GPT-4o-mini: Best value for most use cases
- GPT-4o: Higher capability for complex tasks
- Monitor usage at OpenAI Usage Dashboard
Anthropic (Claude)
Anthropic’s Claude models offer strong performance with large context windows and advanced reasoning.
Configuration
Environment Variables
Add to your `env_backend.env`:

```bash
ANTHROPIC_API_KEY=sk-ant-...your-api-key
```

Then restart the backend:

```bash
docker compose restart backend
```

Available Models
| Model | Context | Features | Status |
|---|---|---|---|
| `claude-3-7-sonnet-latest` | 200K | Vision, Tool Use, Extended Thinking | Recommended |
| `claude-3-5-sonnet-latest` | 200K | Vision, Tool Use | Recommended |
| `claude-haiku-4-5-20251001` | 200K | Vision, Tool Use | Recommended (fast) |
| `claude-3-opus` | 200K | Vision, Tool Use | Supported |
| `claude-3-sonnet` | 200K | Vision | Deprecated |
| `claude-3-haiku` | 200K | Basic | Deprecated |
Claude 3.7 Sonnet includes extended thinking capabilities for complex reasoning tasks. Claude Haiku 4.5 provides excellent speed-to-quality ratio for high-throughput applications.
Key Features
- Large Context: Up to 200K tokens
- Extended Thinking: Deep reasoning mode (Claude 3.7)
- Vision: Image understanding and analysis
- Safety: Built-in constitutional AI safety
Azure OpenAI Service
Azure OpenAI provides OpenAI models through Microsoft Azure with enterprise features, including GPT-5 variants with advanced reasoning.
Configuration
Environment Variables
Add to your `env_backend.env`:

```bash
AZURE_OPENAI_API_KEY=your-azure-key
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_VERSION=2024-02-15-preview
```

Then restart the backend:

```bash
docker compose restart backend
```

Available Models
| Model | Context | Features | Status |
|---|---|---|---|
| `gpt-5-azure` | 400K | Vision, Tool Use, Reasoning | Recommended |
| `gpt-5-mini-azure` | 400K | Vision, Tool Use, Reasoning | Recommended (balanced) |
| `gpt-5-nano-azure` | 400K | Vision, Tool Use, Reasoning | Recommended (fast) |
| `gpt-4o-azure` | 128K | Vision, Tool Use, JSON | Supported |
| `gpt-4o-mini-azure` | 128K | Vision, Tool Use, JSON | Supported |
GPT-5 on Azure: The GPT-5 series offers 400K token context windows and advanced reasoning capabilities. Models are available in three tiers: full (gpt-5), balanced (gpt-5-mini), and fast (gpt-5-nano).
Azure Setup Steps
1. Go to the Azure Portal
2. Create an Azure OpenAI resource
3. Navigate to Model deployments → Create
4. Deploy the desired models (e.g., `gpt-4o`, `gpt-5`)
5. Note your Endpoint and API Key from Keys and Endpoint
Benefits
- Enterprise SLA: Microsoft’s service level agreement
- Data Residency: Keep data in specific Azure regions (EU, Sweden available)
- Private Network: Use Azure Virtual Networks
- Compliance: SOC 2, ISO 27001, GDPR
Google Gemini
Google’s Gemini models offer multimodal capabilities with very large context windows.
Configuration
Environment Variables
Add to your `env_backend.env`:

```bash
GOOGLE_API_KEY=AIza...your-api-key
```

Then restart the backend:

```bash
docker compose restart backend
```

Available Models
| Model | Context | Features | Status |
|---|---|---|---|
| `gemini-2.0-flash` | 1M | Vision, Tool Use, Audio | Recommended |
| `gemini-1.5-pro` | 1M | Vision, Tool Use | Supported |
| `gemini-1.5-flash` | 1M | Vision, Tool Use | Supported (fast) |
| `gemini-pro` | 32K | Basic | Deprecated |
Key Features
- Massive Context: Up to 1M tokens
- Multimodal: Process text, images, and audio
- Fast: Quick response times with Flash variants
Regional Providers (Swedish Data Residency)
For organizations requiring data to remain within Sweden or the EU, Eneo supports regional AI providers.
Berget AI
Swedish AI infrastructure with data residency guarantees.
```bash
curl -X PUT "https://api.your-domain.com/api/v1/sysadmin/tenants/{tenant_id}/credentials/berget" \
  -H "X-API-Key: your-super-admin-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "api_key": "your-berget-api-key",
    "api_base": "https://api.berget.ai/v1"
  }'
```

Available Models:
- `berget/llama-3-70b` - Llama 3 70B hosted in Sweden
- `berget/mistral-large` - Mistral Large with Swedish data residency
GDM (Government Digital Services)
Swedish government-approved AI infrastructure.
```bash
curl -X PUT "https://api.your-domain.com/api/v1/sysadmin/tenants/{tenant_id}/credentials/gdm" \
  -H "X-API-Key: your-super-admin-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "api_key": "your-gdm-api-key",
    "api_base": "https://api.gdm.se/v1"
  }'
```

Regional providers are ideal for Swedish municipalities, government agencies, and organizations with strict data sovereignty requirements.
Self-Hosted Models (vLLM)
Run AI models on your own infrastructure for complete data privacy and control.
vLLM Configuration
vLLM provides high-performance inference with an OpenAI-compatible API.
Add to your `env_backend.env`:

```bash
VLLM_BASE_URL=http://your-vllm-server:8000/v1
VLLM_API_KEY=your-vllm-api-key  # If authentication is enabled
```

Or use the credential API:

```bash
curl -X PUT "https://api.your-domain.com/api/v1/sysadmin/tenants/{tenant_id}/credentials/vllm" \
  -H "X-API-Key: your-super-admin-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "api_key": "your-vllm-api-key",
    "api_base": "http://your-vllm-server:8000/v1"
  }'
```

Hardware Requirements
| Model Size | VRAM Required | Recommended GPU |
|---|---|---|
| 7B | 16GB | RTX 4090, A10 |
| 13B | 32GB | A100-40GB |
| 70B | 140GB | 2x A100-80GB |
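The VRAM figures above roughly follow a common rule of thumb: fp16 weights take about 2 bytes per parameter, plus headroom for the KV cache and activations. A rough sketch of that arithmetic (our own approximation, not an official sizing tool — always validate against your actual deployment):

```python
def estimate_vram_gb(params_billions: float, bytes_per_param: int = 2,
                     overhead: float = 1.2) -> float:
    """Very rough serving-VRAM estimate: weights (fp16 = 2 bytes/param)
    plus ~20% headroom for KV cache and activations."""
    return params_billions * bytes_per_param * overhead

print(round(estimate_vram_gb(7)))   # 17 -> same ballpark as the 16GB row
print(round(estimate_vram_gb(70)))  # 168
```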
Benefits
- Complete Privacy: Data never leaves your infrastructure
- No API Costs: Only hardware/cloud compute costs
- Customizable: Fine-tune and deploy custom models
- Low Latency: No network round-trip to external APIs
Using Multiple Providers
You can configure multiple providers simultaneously. Users can then select their preferred model when creating assistants or chatting.
Environment variables approach:
```bash
# OpenAI
OPENAI_API_KEY=sk-...

# Anthropic
ANTHROPIC_API_KEY=sk-ant-...

# Azure OpenAI
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=...

# Google Gemini
GOOGLE_API_KEY=AIza...

# Self-hosted
VLLM_BASE_URL=http://localhost:8000/v1
```

Credential API approach (recommended for multi-tenant):
Configure each provider per-tenant using the credential management API. See Credential Management below.
Credential Management
For multi-tenant deployments, API credentials are managed per-tenant through a secure API. Credentials are encrypted using Fernet (AES-128-CBC + HMAC-SHA256) before storage.
Authentication
All credential endpoints require the super admin API key (`ENEO_SUPER_API_KEY`), passed in the `X-API-Key` header:

```bash
-H "X-API-Key: your-super-admin-api-key"
```

Set Provider Credentials
```bash
curl -X PUT "https://api.your-domain.com/api/v1/sysadmin/tenants/{tenant_id}/credentials/{provider}" \
  -H "X-API-Key: your-super-admin-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "api_key": "your-provider-api-key",
    "api_base": "https://api.provider.com/v1"
  }'
```

The `api_base` field is optional and only needed for custom endpoints.

Supported providers: `openai`, `anthropic`, `azure`, `google`, `mistral`, `berget`, `gdm`, `vllm`
List Tenant Credentials
```bash
curl "https://api.your-domain.com/api/v1/sysadmin/tenants/{tenant_id}/credentials" \
  -H "X-API-Key: your-super-admin-api-key"
```

Response shows configured providers (credentials are masked):

```json
{
  "credentials": [
    {"provider": "azure", "configured": true, "api_key": "****...****"},
    {"provider": "anthropic", "configured": true, "api_key": "****...****"}
  ]
}
```

Remove Provider Credentials
```bash
curl -X DELETE "https://api.your-domain.com/api/v1/sysadmin/tenants/{tenant_id}/credentials/{provider}" \
  -H "X-API-Key: your-super-admin-api-key"
```

Encryption Details
Credentials are encrypted at rest using Fernet encryption:
| Aspect | Details |
|---|---|
| Algorithm | Fernet (AES-128-CBC + HMAC-SHA256) |
| Key Source | ENCRYPTION_KEY environment variable |
| Storage Format | enc:fernet:v1:<ciphertext> |
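The storage format in the table can be handled with a small helper. This is an illustrative sketch of our own: the `enc:fernet:v1:` prefix comes from the table above, but the helper itself is hypothetical and not Eneo's actual code:

```python
# Illustrative sketch: unwrap the Eneo storage prefix to recover the raw
# Fernet token that would be handed to Fernet.decrypt() (hypothetical helper).
PREFIX = "enc:fernet:v1:"

def unwrap_ciphertext(stored: str) -> str:
    """Strip the storage prefix, returning the underlying Fernet token."""
    if not stored.startswith(PREFIX):
        raise ValueError("not an Eneo-encrypted credential")
    return stored[len(PREFIX):]

print(unwrap_ciphertext("enc:fernet:v1:gAAAAAB..."))  # gAAAAAB...
```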
Back up your `ENCRYPTION_KEY`: if it is lost, all stored credentials become unrecoverable and must be re-entered for each tenant.
Generate an encryption key:
```bash
uv run python -m intric.cli.generate_encryption_key
```

Model Selection Guidelines
Choose Azure OpenAI (GPT-5) if:
- You need the most advanced reasoning capabilities
- You require enterprise SLA and compliance
- You want 400K token context windows
- Data residency in EU/Sweden is important
Choose OpenAI Direct if:
- You want fast access to latest models
- You don’t need enterprise compliance features
- Cost efficiency with GPT-4o-mini is important
Choose Anthropic Claude if:
- You need extended thinking for complex analysis
- You want strong reasoning with 200K context
- You prefer Claude’s writing style and safety approach
Choose Google Gemini if:
- You need massive 1M token context
- Multimodal processing (text, images, audio) is required
- You want fast response times with Flash variants
Choose Regional Providers (Berget/GDM) if:
- Data must remain in Sweden
- You’re a Swedish municipality or government agency
- Compliance with Swedish data sovereignty requirements
Choose Self-Hosted (vLLM) if:
- Complete data privacy is paramount
- You want to eliminate API costs
- You have GPU infrastructure available
- You need to run custom fine-tuned models
Cost Management
Set Budget Alerts
For cloud providers, set up budget alerts:
- OpenAI: Usage limits in account settings
- Azure: Budget alerts in Azure Cost Management
- Google: Budget alerts in Google Cloud Console
Monitor Usage
Check your usage regularly:
- View API call metrics in Eneo’s admin panel
- Review provider dashboards
- Set up automated alerts
Optimize Costs
- Use smaller models (e.g., gpt-4o-mini) for simple tasks
- Implement caching for repeated queries
- Use local models for development/testing
- Set context length limits
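Of the optimizations above, response caching is the easiest to prototype. A minimal in-process sketch (illustrative only; `call_provider` is a hypothetical stand-in for the real API call, and production deployments would typically use a shared cache such as Redis):

```python
from functools import lru_cache

CALLS = {"count": 0}

def call_provider(model: str, prompt: str) -> str:
    """Hypothetical stand-in for the real, billable provider call."""
    CALLS["count"] += 1
    return f"answer from {model}"

@lru_cache(maxsize=1024)
def cached_completion(model: str, prompt: str) -> str:
    # Identical (model, prompt) pairs are served from the cache
    # instead of triggering another billable API call.
    return call_provider(model, prompt)

cached_completion("gpt-4o-mini", "What is Eneo?")
cached_completion("gpt-4o-mini", "What is Eneo?")  # served from cache
print(CALLS["count"])  # 1
```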
Troubleshooting
Invalid API Key (401 Errors)
Symptoms: Authentication errors, 401 responses
Solutions:
- Verify the API key is copied correctly (no extra spaces)
- Check if the key has expired
- For credential API: Verify `ENEO_SUPER_API_KEY` is correct
- For Azure: Ensure the endpoint URL ends with `/`
Rate Limit Errors (429)
Symptoms: 429 status codes, “rate limit exceeded”
Solutions:
- Check your API tier limits with the provider
- Consider using multiple providers for load distribution
- Upgrade your API tier if needed
- For Azure: Check your TPM (tokens per minute) quota
Model Not Available
Symptoms: Model name not recognized, routing errors
Solutions:
- Verify model name matches
ai_models.ymlexactly - Check if
litellm_model_nameprefix is correct (e.g.,azure/gpt-4o) - Ensure credentials are configured for the target provider
- For new models: Add to
ai_models.ymlwith correct LiteLLM routing
Credential Decryption Errors
Symptoms: “Failed to decrypt credential” errors
Solutions:
- Verify `ENCRYPTION_KEY` is set correctly in the environment
- If the key was changed or lost, re-enter credentials via the API
- Check that credentials were stored with the current encryption key
Connection Errors
For cloud providers:
- Check internet connectivity
- Verify firewall allows outbound HTTPS (443)
- Check provider status pages
For self-hosted (vLLM):
- Verify the vLLM server is running
- Check the URL and port configuration
- Ensure Docker network allows communication
Security Best Practices
1. Protect API Keys
   - Never commit keys to version control
   - Use the credential API for multi-tenant deployments (encrypted storage)
   - Rotate keys regularly
   - Use separate keys for development and production
2. Encryption Key Management
   - Securely back up `ENCRYPTION_KEY`
   - Store it in a secrets manager (AWS Secrets Manager, HashiCorp Vault)
   - Never share or commit encryption keys
3. Network Security
   - Use HTTPS for all API connections
   - For self-hosted: Use VPN or private networks
   - Implement firewall rules for vLLM endpoints
4. Access Control
   - Limit `ENEO_SUPER_API_KEY` access to administrators
   - Use separate credentials per tenant
   - Monitor credential API usage for anomalies
5. Data Handling
   - Review provider data policies before use
   - Use regional providers (Berget/GDM) or self-hosted for sensitive data
   - Consider data residency requirements
Need Help?
- Check the Audit Logging Guide
- Visit GitHub Issues
- Review the documentation for your chosen providers