AI Provider Configuration

Eneo is model-agnostic and supports multiple AI providers through a unified routing layer. This guide covers architecture, configuration, and credential management for each supported provider.

Overview

Eneo supports the following AI providers:

  • OpenAI (GPT-4o, GPT-4o-mini, GPT-5)
  • Anthropic (Claude 3.5, Claude 3.7, Claude Haiku 4.5)
  • Azure OpenAI Service (GPT-4o, GPT-5 variants)
  • Google Gemini (Gemini 1.5, Gemini 2.0)
  • Regional Providers (Berget AI, GDM - Swedish data residency)
  • Self-hosted (vLLM, compatible OpenAI API servers)

You can configure multiple providers simultaneously and switch between them based on your needs.


Architecture

Eneo routes all AI requests through LiteLLM, a unified API layer that supports 100+ provider APIs with consistent interfaces for completions, embeddings, and streaming.

[Architecture diagram] The service layer (CompletionService, EmbeddingService) connects through the adapter layer (LiteLLMModelAdapter, LiteLLMEmbeddingAdapter) to the LiteLLM core, whose unified API supports 100+ providers: cloud (OpenAI, Azure, Anthropic, Gemini, Mistral), regional/Swedish (Berget, GDM), and self-hosted (vLLM). A provider registry handles provider detection; a credential resolver manages multi-tenant credentials.

How Provider Routing Works

[Diagram] Provider routing: Eneo sends AI requests through LiteLLM to the selected provider (OpenAI, Azure, Anthropic, local models), with automatic provider detection and credential management.

The routing system uses model name prefixes to determine the target provider:

| Prefix | Provider | Example |
|---|---|---|
| `azure/` | Azure OpenAI | `azure/gpt-4o` |
| `berget/` | Berget AI | `berget/llama-3-70b` |
| `gdm/` | GDM (Swedish) | `gdm/mistral-large` |
| `openai/` | OpenAI Direct | `openai/gpt-4o` |
| (no prefix) | Default provider | `gpt-4o-mini` |
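Conceptually, prefix-based detection is a simple lookup. The sketch below is illustrative only (Eneo delegates actual routing to LiteLLM); the prefix table mirrors the one above, but `detect_provider` is a hypothetical helper:

```python
# Illustrative sketch of prefix-based provider detection.
# The prefix table mirrors this guide; detect_provider is hypothetical,
# not Eneo's actual routing code (routing is delegated to LiteLLM).
PREFIXES = {
    "azure/": "Azure OpenAI",
    "berget/": "Berget AI",
    "gdm/": "GDM (Swedish)",
    "openai/": "OpenAI Direct",
}

def detect_provider(model_name: str) -> str:
    for prefix, provider in PREFIXES.items():
        if model_name.startswith(prefix):
            return provider
    return "Default provider"
```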

The LiteLLMModelAdapter is the recommended approach for all new model integrations:

```yaml
# Models with litellm_model_name field use LiteLLM routing
completion_models:
  - model_name: "gpt-4o-mini"
    litellm_model_name: "azure/gpt-4o-mini"  # Routes via LiteLLM
    context_window: 128000
```

Benefits:

  • Unified API for 100+ providers
  • Automatic retry and fallback logic
  • Consistent streaming support
  • Multi-tenant credential injection

Legacy Adapters (Deprecated): The following adapters are deprecated and maintained only for backward compatibility: OpenAIModelAdapter, ClaudeModelAdapter, AzureOpenAIModelAdapter, MistralModelAdapter, VLLMModelAdapter, OVHCloudModelAdapter. New models should use LiteLLM routing via the litellm_model_name field.

Model Configuration

Models are currently defined in ai_models.yml. This configuration approach will become more flexible in future updates.

```yaml
# Example from ai_models.yml
completion_models:
  - model_name: "gpt-5-azure"
    litellm_model_name: "azure/gpt-5"
    context_window: 400000
    supports_json_output: true
    supports_tool_use: true
    supports_vision: true
    supports_reasoning: true
    description: "GPT-5 on Azure - Advanced reasoning with 400K context"
```

For Contributors: To add new models, reference the GPT-5 implementation in ai_models.yml. Models with litellm_model_name automatically use the LiteLLM adapter, while models without it fall back to legacy adapters (not recommended for new models).
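The adapter-selection rule described above can be sketched as follows. The helper is hypothetical (the real dispatch lives in Eneo's service layer); the rule itself is the one this guide states: entries with `litellm_model_name` route via LiteLLM, everything else falls back to a legacy adapter.

```python
# Hypothetical sketch of the adapter-selection rule: entries with
# litellm_model_name use LiteLLM routing; others fall back to legacy adapters.
def select_adapter(model_entry: dict) -> str:
    if model_entry.get("litellm_model_name"):
        return "LiteLLMModelAdapter"
    return "legacy adapter (deprecated)"

entry = {
    "model_name": "gpt-5-azure",
    "litellm_model_name": "azure/gpt-5",
    "context_window": 400000,
}
```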


OpenAI

OpenAI provides access to GPT models including GPT-4o, GPT-4o-mini, and the latest GPT-5 series.

Configuration

Add to your env_backend.env:

OPENAI_API_KEY=sk-...your-api-key

Then restart the backend:

docker compose restart backend

Available Models

| Model | Context | Features | Status |
|---|---|---|---|
| gpt-4o | 128K | Vision, Tool Use, JSON | Recommended |
| gpt-4o-mini | 128K | Vision, Tool Use, JSON | Recommended (economical) |
| gpt-4-turbo | 128K | Vision, Tool Use | Supported |
| gpt-3.5-turbo | 16K | Basic | Deprecated |

gpt-3.5-turbo is deprecated and will be removed in a future update. Migrate to gpt-4o-mini for a cost-effective alternative with better performance.

Cost Considerations

  • GPT-4o-mini: Best value for most use cases
  • GPT-4o: Higher capability for complex tasks
  • Monitor usage at OpenAI Usage Dashboard 

Anthropic (Claude)

Anthropic’s Claude models offer strong performance with large context windows and advanced reasoning.

Configuration

Add to your env_backend.env:

ANTHROPIC_API_KEY=sk-ant-...your-api-key

Then restart the backend:

docker compose restart backend

Available Models

| Model | Context | Features | Status |
|---|---|---|---|
| claude-3-7-sonnet-latest | 200K | Vision, Tool Use, Extended Thinking | Recommended |
| claude-3-5-sonnet-latest | 200K | Vision, Tool Use | Recommended |
| claude-haiku-4-5-20251001 | 200K | Vision, Tool Use | Recommended (fast) |
| claude-3-opus | 200K | Vision, Tool Use | Supported |
| claude-3-sonnet | 200K | Vision | Deprecated |
| claude-3-haiku | 200K | Basic | Deprecated |

Claude 3.7 Sonnet includes extended thinking capabilities for complex reasoning tasks. Claude Haiku 4.5 provides excellent speed-to-quality ratio for high-throughput applications.

Key Features

  • Large Context: Up to 200K tokens
  • Extended Thinking: Deep reasoning mode (Claude 3.7)
  • Vision: Image understanding and analysis
  • Safety: Built-in constitutional AI safety

Azure OpenAI Service

Azure OpenAI provides OpenAI models through Microsoft Azure with enterprise features, including GPT-5 variants with advanced reasoning.

Configuration

Add to your env_backend.env:

```
AZURE_OPENAI_API_KEY=your-azure-key
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_VERSION=2024-02-15-preview
```

Then restart the backend:

docker compose restart backend

Available Models

| Model | Context | Features | Status |
|---|---|---|---|
| gpt-5-azure | 400K | Vision, Tool Use, Reasoning | Recommended |
| gpt-5-mini-azure | 400K | Vision, Tool Use, Reasoning | Recommended (balanced) |
| gpt-5-nano-azure | 400K | Vision, Tool Use, Reasoning | Recommended (fast) |
| gpt-4o-azure | 128K | Vision, Tool Use, JSON | Supported |
| gpt-4o-mini-azure | 128K | Vision, Tool Use, JSON | Supported |

GPT-5 on Azure: The GPT-5 series offers 400K token context windows and advanced reasoning capabilities. Models are available in three tiers: full (gpt-5), balanced (gpt-5-mini), and fast (gpt-5-nano).

Azure Setup Steps

  1. Go to Azure Portal 
  2. Create an Azure OpenAI resource
  3. Navigate to Model deployments → Create
  4. Deploy desired models (e.g., gpt-4o, gpt-5)
  5. Note your Endpoint and API Key from Keys and Endpoint

Benefits

  • Enterprise SLA: Microsoft’s service level agreement
  • Data Residency: Keep data in specific Azure regions (EU, Sweden available)
  • Private Network: Use Azure Virtual Networks
  • Compliance: SOC 2, ISO 27001, GDPR

Google Gemini

Google’s Gemini models offer multimodal capabilities with very large context windows.

Configuration

Add to your env_backend.env:

GOOGLE_API_KEY=AIza...your-api-key

Then restart the backend:

docker compose restart backend

Available Models

| Model | Context | Features | Status |
|---|---|---|---|
| gemini-2.0-flash | 1M | Vision, Tool Use, Audio | Recommended |
| gemini-1.5-pro | 1M | Vision, Tool Use | Supported |
| gemini-1.5-flash | 1M | Vision, Tool Use | Supported (fast) |
| gemini-pro | 32K | Basic | Deprecated |

Key Features

  • Massive Context: Up to 1M tokens
  • Multimodal: Process text, images, and audio
  • Fast: Quick response times with Flash variants

Regional Providers (Swedish Data Residency)

For organizations requiring data to remain within Sweden or the EU, Eneo supports regional AI providers.

Berget AI

Swedish AI infrastructure with data residency guarantees.

```bash
curl -X PUT "https://api.your-domain.com/api/v1/sysadmin/tenants/{tenant_id}/credentials/berget" \
  -H "X-API-Key: your-super-admin-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "api_key": "your-berget-api-key",
    "api_base": "https://api.berget.ai/v1"
  }'
```

Available Models:

  • berget/llama-3-70b - Llama 3 70B hosted in Sweden
  • berget/mistral-large - Mistral Large with Swedish data residency

GDM (Government Digital Services)

Swedish government-approved AI infrastructure.

```bash
curl -X PUT "https://api.your-domain.com/api/v1/sysadmin/tenants/{tenant_id}/credentials/gdm" \
  -H "X-API-Key: your-super-admin-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "api_key": "your-gdm-api-key",
    "api_base": "https://api.gdm.se/v1"
  }'
```

Regional providers are ideal for Swedish municipalities, government agencies, and organizations with strict data sovereignty requirements.


Self-Hosted Models (vLLM)

Run AI models on your own infrastructure for complete data privacy and control.

vLLM Configuration

vLLM provides high-performance inference with an OpenAI-compatible API.

Add to your env_backend.env:

```
VLLM_BASE_URL=http://your-vllm-server:8000/v1
VLLM_API_KEY=your-vllm-api-key  # If authentication is enabled
```

Or use the credential API:

```bash
curl -X PUT "https://api.your-domain.com/api/v1/sysadmin/tenants/{tenant_id}/credentials/vllm" \
  -H "X-API-Key: your-super-admin-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "api_key": "your-vllm-api-key",
    "api_base": "http://your-vllm-server:8000/v1"
  }'
```

Hardware Requirements

| Model Size | VRAM Required | Recommended GPU |
|---|---|---|
| 7B | 16GB | RTX 4090, A10 |
| 13B | 32GB | A100-40GB |
| 70B | 140GB | 2x A100-80GB |
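As a rough rule of thumb (an assumption, not an official sizing guide), fp16 weights need about 2 bytes per parameter, plus headroom for activations and KV cache, which matches the order of magnitude in the table above:

```python
# Rough VRAM rule of thumb: ~2 bytes per parameter for fp16 weights,
# plus ~20% headroom for activations and KV cache. Illustrative only;
# actual requirements depend on context length, batch size, and quantization.
def estimate_vram_gb(params_billion: float, bytes_per_param: float = 2.0,
                     headroom: float = 1.2) -> float:
    return params_billion * bytes_per_param * headroom
```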

Benefits

  • Complete Privacy: Data never leaves your infrastructure
  • No API Costs: Only hardware/cloud compute costs
  • Customizable: Fine-tune and deploy custom models
  • Low Latency: No network round-trip to external APIs

Using Multiple Providers

You can configure multiple providers simultaneously. Users can then select their preferred model when creating assistants or chatting.

Environment variables approach:

```
# OpenAI
OPENAI_API_KEY=sk-...

# Anthropic
ANTHROPIC_API_KEY=sk-ant-...

# Azure OpenAI
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=...

# Google Gemini
GOOGLE_API_KEY=AIza...

# Self-hosted
VLLM_BASE_URL=http://localhost:8000/v1
```
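A quick way to see which providers the environment actually enables. The provider-to-variable mapping mirrors this guide; the helper itself is hypothetical, and Eneo's own startup checks may differ:

```python
import os

# Maps each provider to the environment variable this guide uses for it.
# Hypothetical helper for sanity-checking a deployment's environment.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "azure": "AZURE_OPENAI_API_KEY",
    "google": "GOOGLE_API_KEY",
    "vllm": "VLLM_BASE_URL",
}

def configured_providers(env=None) -> list[str]:
    env = os.environ if env is None else env
    return [p for p, var in PROVIDER_ENV_VARS.items() if env.get(var)]
```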

Credential API approach (recommended for multi-tenant):

Configure each provider per-tenant using the credential management API. See Credential Management below.


Credential Management

For multi-tenant deployments, API credentials are managed per-tenant through a secure API. Credentials are encrypted using Fernet (AES-128-CBC + HMAC-SHA256) before storage.

Authentication

All credential endpoints require the super-admin API key (the value of ENEO_SUPER_API_KEY) in the X-API-Key header:

-H "X-API-Key: your-super-admin-api-key"

Set Provider Credentials

```bash
curl -X PUT "https://api.your-domain.com/api/v1/sysadmin/tenants/{tenant_id}/credentials/{provider}" \
  -H "X-API-Key: your-super-admin-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "api_key": "your-provider-api-key",
    "api_base": "https://api.provider.com/v1"
  }'
```

The api_base field is optional; include it for custom endpoints.

Supported providers: openai, anthropic, azure, google, mistral, berget, gdm, vllm

List Tenant Credentials

```bash
curl "https://api.your-domain.com/api/v1/sysadmin/tenants/{tenant_id}/credentials" \
  -H "X-API-Key: your-super-admin-api-key"
```

Response shows configured providers (credentials are masked):

```json
{
  "credentials": [
    {"provider": "azure", "configured": true, "api_key": "****...****"},
    {"provider": "anthropic", "configured": true, "api_key": "****...****"}
  ]
}
```

Remove Provider Credentials

```bash
curl -X DELETE "https://api.your-domain.com/api/v1/sysadmin/tenants/{tenant_id}/credentials/{provider}" \
  -H "X-API-Key: your-super-admin-api-key"
```

Encryption Details

Credentials are encrypted at rest using Fernet encryption:

| Aspect | Details |
|---|---|
| Algorithm | Fernet (AES-128-CBC + HMAC-SHA256) |
| Key Source | `ENCRYPTION_KEY` environment variable |
| Storage Format | `enc:fernet:v1:<ciphertext>` |

Backup your ENCRYPTION_KEY: If lost, all stored credentials become unrecoverable and must be re-entered for each tenant.

Generate an encryption key:

uv run python -m intric.cli.generate_encryption_key
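The storage format above can be recognized and split like this. The parser is a hypothetical illustration; actual decryption additionally requires the Fernet key derived from ENCRYPTION_KEY:

```python
# Hypothetical parser for the enc:fernet:v1:<ciphertext> storage format.
# Real decryption requires the Fernet key from ENCRYPTION_KEY; this only
# shows how the versioned prefix separates format metadata from ciphertext.
PREFIX = "enc:fernet:v1:"

def split_stored_credential(value: str) -> str:
    if not value.startswith(PREFIX):
        raise ValueError("unrecognized credential format")
    return value[len(PREFIX):]  # the Fernet ciphertext (a base64 token)
```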

Model Selection Guidelines

Choose Azure OpenAI (GPT-5) if:

  • You need the most advanced reasoning capabilities
  • You require enterprise SLA and compliance
  • You want 400K token context windows
  • Data residency in EU/Sweden is important

Choose OpenAI Direct if:

  • You want fast access to latest models
  • You don’t need enterprise compliance features
  • Cost efficiency with GPT-4o-mini is important

Choose Anthropic Claude if:

  • You need extended thinking for complex analysis
  • You want strong reasoning with 200K context
  • You prefer Claude’s writing style and safety approach

Choose Google Gemini if:

  • You need massive 1M token context
  • Multimodal processing (text, images, audio) is required
  • You want fast response times with Flash variants

Choose Regional Providers (Berget/GDM) if:

  • Data must remain in Sweden
  • You’re a Swedish municipality or government agency
  • Compliance with Swedish data sovereignty requirements

Choose Self-Hosted (vLLM) if:

  • Complete data privacy is paramount
  • You want to eliminate API costs
  • You have GPU infrastructure available
  • You need to run custom fine-tuned models

Cost Management

Set Budget Alerts

For cloud providers, set up budget alerts:

  • OpenAI: Usage limits in account settings
  • Azure: Budget alerts in Azure Cost Management
  • Google: Budget alerts in Google Cloud Console

Monitor Usage

Check your usage regularly:

  • View API call metrics in Eneo’s admin panel
  • Review provider dashboards
  • Set up automated alerts

Optimize Costs

  • Use smaller models (e.g., gpt-4o-mini) for simple tasks
  • Implement caching for repeated queries
  • Use local models for development/testing
  • Set context length limits

Troubleshooting

Invalid API Key (401 Errors)

Symptoms: Authentication errors, 401 responses

Solutions:

  • Verify the API key is copied correctly (no extra spaces)
  • Check if the key has expired
  • For credential API: Verify ENEO_SUPER_API_KEY is correct
  • For Azure: Ensure endpoint URL ends with /

Rate Limit Errors (429)

Symptoms: 429 status codes, “rate limit exceeded”

Solutions:

  • Check your API tier limits with the provider
  • Consider using multiple providers for load distribution
  • Upgrade your API tier if needed
  • For Azure: Check your TPM (tokens per minute) quota
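LiteLLM applies retry logic automatically, but if you add your own retry layer on top, the usual pattern for 429s is exponential backoff. The schedule below is illustrative, not part of Eneo:

```python
# Illustrative exponential-backoff schedule for retrying 429 responses.
# LiteLLM has its own retry logic; this only shows the general pattern.
def backoff_delays(retries: int = 4, base: float = 1.0, cap: float = 30.0) -> list[float]:
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]
```

Adding random jitter to each delay is a common refinement to avoid synchronized retries.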

Model Not Available

Symptoms: Model name not recognized, routing errors

Solutions:

  • Verify model name matches ai_models.yml exactly
  • Check if litellm_model_name prefix is correct (e.g., azure/gpt-4o)
  • Ensure credentials are configured for the target provider
  • For new models: Add to ai_models.yml with correct LiteLLM routing

Credential Decryption Errors

Symptoms: “Failed to decrypt credential” errors

Solutions:

  • Verify ENCRYPTION_KEY is set correctly in environment
  • If key was changed/lost, re-enter credentials via API
  • Check that credentials were stored with the current encryption key

Connection Errors

For cloud providers:

  • Check internet connectivity
  • Verify firewall allows outbound HTTPS (443)
  • Check provider status pages

For self-hosted (vLLM):

  • Verify the vLLM server is running
  • Check the URL and port configuration
  • Ensure Docker network allows communication
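A minimal reachability check against vLLM's OpenAI-compatible /v1/models endpoint can narrow the problem down. The base URL is a placeholder; adapt it to your deployment:

```python
import urllib.request

# Probes a vLLM server's OpenAI-compatible /v1/models endpoint.
# base_url is a placeholder, e.g. "http://your-vllm-server:8000/v1".
def vllm_reachable(base_url: str, timeout: float = 5.0) -> bool:
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers connection refused, DNS failure, timeouts
        return False
```

If this returns False from inside the backend container but True from the host, the issue is likely Docker networking rather than vLLM itself.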

Security Best Practices

  1. Protect API Keys

    • Never commit keys to version control
    • Use the credential API for multi-tenant deployments (encrypted storage)
    • Rotate keys regularly
    • Use separate keys for development and production
  2. Encryption Key Management

    • Securely backup ENCRYPTION_KEY
    • Store in secrets manager (AWS Secrets Manager, HashiCorp Vault)
    • Never share or commit encryption keys
  3. Network Security

    • Use HTTPS for all API connections
    • For self-hosted: Use VPN or private networks
    • Implement firewall rules for vLLM endpoints
  4. Access Control

    • Limit ENEO_SUPER_API_KEY access to administrators
    • Use separate credentials per tenant
    • Monitor credential API usage for anomalies
  5. Data Handling

    • Review provider data policies before use
    • Use regional providers (Berget/GDM) or self-hosted for sensitive data
    • Consider data residency requirements
