New Provider 5 Launch & Provider 8 Tier Optimization
This major infrastructure update replaces the retired legacy provider with a new high-performance infrastructure as Provider 5, adding 41 cutting-edge models. Provider 8 receives comprehensive tier optimization, with context windows standardized to the 50%/70%/100%/100% distribution rule across the Free/Basic/Pro/Ultra tiers. Strategic tier restrictions improve platform sustainability while maintaining broad model access.
⚠️ Provider 5 (Legacy) - Complete Retirement
Platform Discontinuation: Legacy Provider 5 has been completely removed from the platform. All 82 models have been discontinued.
Image Generation Models Removed (14 models)
- `provider-5/midjourney-v7` - Midjourney V7 image generation
- `provider-5/flux-fast` - Black Forest Labs Flux Fast
- `provider-5/flux-pro` - Black Forest Labs Flux Pro
- `provider-5/hidream-i1` - HiDream I1 image model
- `provider-5/hidream-i1-fast` - HiDream I1 Fast variant
- `provider-5/qwen-image` - Qwen image generation
- `provider-5/flux-krea` - Flux Krea creative model
- `provider-5/imagen-4-fast` - Google Imagen 4 Fast
- `provider-5/imagen-4` - Google Imagen 4
- `provider-5/gemini-2.5-flash-image` - Gemini 2.5 Flash Image
- `provider-5/dall-e-2` - OpenAI DALL-E 2
- `provider-5/dall-e-3` - OpenAI DALL-E 3
- `provider-5/gpt-image-1` - OpenAI GPT Image 1
- `provider-5/gpt-image-1-mini` - OpenAI GPT Image 1 Mini
Chat/Completion Models Removed (45 models)
OpenAI GPT Series:
`provider-5/gpt-4o`, `provider-5/gpt-4o-mini`, `provider-5/gpt-4o-2024-05-13`, `provider-5/gpt-4o-2024-08-06`, `provider-5/gpt-4o-2024-11-20`, `provider-5/gpt-4o-mini-2024-07-18`, `provider-5/chatgpt-4o-latest`, `provider-5/gpt-4o-search-preview`, `provider-5/gpt-4o-search-preview-2025-03-11`, `provider-5/gpt-4o-mini-search-preview`, `provider-5/gpt-4o-mini-search-preview-2025-03-11`
OpenAI GPT-3.5 Series:
`provider-5/gpt-3.5-turbo-0125`, `provider-5/gpt-3.5-turbo-instruct`, `provider-5/gpt-3.5-turbo-instruct-0914`, `provider-5/gpt-3.5-turbo-1106`
OpenAI GPT-4.1 Series:
`provider-5/gpt-4.1-2025-04-14`, `provider-5/gpt-4.1-mini`, `provider-5/gpt-4.1-mini-2025-04-14`, `provider-5/gpt-4.1-nano`, `provider-5/gpt-4.1-nano-2025-04-14`
OpenAI GPT-5 Series:
`provider-5/gpt-5`, `provider-5/gpt-5-mini`, `provider-5/gpt-5-nano`, `provider-5/gpt-5-codex`, `provider-5/gpt-5-pro`, `provider-5/gpt-5-2025-08-07`, `provider-5/gpt-5-mini-2025-08-07`, `provider-5/gpt-5-nano-2025-08-07`, `provider-5/gpt-5-pro-2025-10-06`, `provider-5/gpt-5-chat-latest`, `provider-5/gpt-5-search-api`, `provider-5/gpt-5-search-api-2025-10-14`
OpenAI GPT-5.1 Series:
`provider-5/gpt-5.1`, `provider-5/gpt-5.1-2025-11-13`, `provider-5/gpt-5.1-chat-latest`, `provider-5/gpt-5.1-codex`, `provider-5/gpt-5.1-codex-mini`, `provider-5/gpt-5.1-codex-max`, `provider-5/gpt-5.2`
OpenAI o-Series:
`provider-5/o3`, `provider-5/o3-mini`, `provider-5/o4-mini`, `provider-5/o3-2025-04-16`, `provider-5/o3-mini-2025-01-31`, `provider-5/o4-mini-2025-04-16`
Audio Models Removed (18 models)
`provider-5/tts-1`, `provider-5/tts-1-hd`, `provider-5/tts-1-1106`, `provider-5/tts-1-hd-1106`, `provider-5/gpt-4o-mini-tts`, `provider-5/gpt-audio`, `provider-5/gpt-audio-mini`, `provider-5/gpt-audio-2025-08-28`, `provider-5/gpt-audio-mini-2025-10-06`, `provider-5/gpt-4o-audio-preview-2024-10-01`, `provider-5/gpt-4o-audio-preview-2024-12-17`, `provider-5/gpt-4o-audio-preview-2025-06-03`, `provider-5/gpt-4o-mini-audio-preview`, `provider-5/gpt-4o-mini-audio-preview-2024-12-17`, `provider-5/whisper-1`, `provider-5/gpt-4o-transcribe`, `provider-5/gpt-4o-mini-transcribe`, `provider-5/gpt-4o-transcribe-diarize`
Embedding & Moderation Models Removed (5 models)
Provider 5 - New Infrastructure Launch
New Infrastructure Integration: A new high-performance infrastructure joins as Provider 5 with 41 high-performance models spanning chat, reasoning, embedding, and image generation capabilities.
Chat/Completion Models - Free Tier (22 models)
Meta Llama Family:

- `provider-5/meta-llama-3.1-8b-instruct-fast` - Free Tier - Ultra-fast 8B variant optimized for low-latency applications
  - 131K context window
- `provider-5/meta-llama-3.1-8b-instruct` - Free Tier - Standard 8B instruction-tuned model
  - 131K context window
- `provider-5/llama-guard-3-8b` - Free Tier - Safety and content moderation specialist
  - 8K context window
- `provider-5/llama-3.3-70b-instruct` - Free Tier - 70B flagship with function calling support
  - 131K context window
- `provider-5/llama-3.3-70b-instruct-fast` - Free Tier - Fast variant of 70B flagship
  - Function calling enabled

NVIDIA Models:

- `provider-5/nemotron-nano-v2-12b` - Free Tier - NVIDIA's efficient 12B Nemotron Nano V2
  - 65K context window

Google Gemma:

- `provider-5/gemma-2-2b-it` - Free Tier - Compact 2B instruction-tuned model
  - 8K context window
- `provider-5/gemma-2-9b-it-fast` - Free Tier - Fast 9B Gemma 2 variant
  - Optimized for throughput
- `provider-5/gemma-3-27b-it` - Free Tier - 27B instruction model with function calling
  - 8K context window
- `provider-5/gemma-3-27b-it-fast` - Free Tier - Fast variant with function calling support

Qwen Series:

- `provider-5/qwen2.5-coder-7b-fast` - Free Tier - Fast coding specialist
  - 131K context window
- `provider-5/qwen3-32b` - Free Tier - 32B flagship with function calling
  - 131K context window
- `provider-5/qwen3-32b-fast` - Free Tier - Fast 32B variant with function calling
- `provider-5/qwen3-30b-a3b-instruct-2507` - Free Tier - 30B MoE with function calling
  - 131K context window

DeepSeek Reasoning:

- `provider-5/deepseek-r1-0528` - Free Tier - R1 reasoning model with advanced logic
  - 164K context window
- `provider-5/deepseek-r1-0528-fast` - Free Tier - Fast R1 variant with reasoning capabilities

Zhipu AI:

- `provider-5/glm-4.5-air` - Free Tier - Efficient GLM 4.5 with reasoning and function calling
  - 128K context window

OpenAI Open-Source:

- `provider-5/gpt-oss-120b` - Free Tier - 120B open-source GPT with reasoning and function calling
  - 131K context window
- `provider-5/gpt-oss-20b` - Free Tier - 20B efficient open-source GPT variant
  - 131K context window

Moonshot AI:

- `provider-5/kimi-k2-instruct` - Free Tier - K2 instruction-tuned model
  - 63K context window

NousResearch:

- `provider-5/hermes-4-70b` - Free Tier - 70B Hermes 4 with function calling
  - 131K context window

PrimeIntellect:

- `provider-5/intellect-3` - Free Tier - Intellect 3 reasoning model
  - 131K context window
Chat/Completion Models - Basic Tier (11 models)
- `provider-5/llama-3.1-nemotron-ultra-253b` - Basic Tier - NVIDIA's 253B ultra-scale model with reasoning
  - 131K context window
- `provider-5/qwen3-235b-a22b-instruct-2507` - Basic Tier - 235B flagship with reasoning and function calling
  - 262K context window
- `provider-5/qwen3-235b-a22b-thinking-2507` - Basic Tier - 235B thinking variant with advanced reasoning
  - 262K context window
- `provider-5/qwen2.5-vl-72b-instruct` - Basic Tier - 72B vision-language model
  - 131K context window
- `provider-5/deepseek-v3-0324` - Basic Tier - DeepSeek V3 with function calling
  - 64K context window
- `provider-5/deepseek-v3-0324-fast` - Basic Tier - Fast DeepSeek V3 variant
- `provider-5/glm-4.5` - Basic Tier - Full GLM 4.5 with reasoning and function calling
  - 512K context window
- `provider-5/kimi-k2-thinking` - Basic Tier - K2 with enhanced reasoning capabilities
  - 200K context (Ultra), 63K (Pro/Basic)
- `provider-5/qwen3-30b-a3b-thinking-2507` - Basic Tier - 30B thinking variant with reasoning
- `provider-5/qwen3-coder-30b-a3b-instruct` - Basic Tier - 30B coding specialist
  - 262K context window
- `provider-5/qwen3-next-80b-a3b-thinking` - Basic Tier - 80B next-gen thinking model with reasoning
Chat/Completion Models - Pro Tier (2 models)
- `provider-5/qwen3-coder-480b-a35b-instruct` - Pro Tier - 480B flagship coding model with reasoning
  - 262K context window
- `provider-5/hermes-4-405b` - Pro Tier - 405B NousResearch flagship with reasoning and function calling
  - 131K context window
Embedding Models (4 models) - All Tiers
- `provider-5/bge-en-icl` - Free Tier - BAAI 4096-dimension embeddings, 8K tokens
- `provider-5/bge-multilingual-gemma2` - Free Tier - BAAI multilingual 2048-dimension embeddings, 8K tokens
- `provider-5/e5-mistral-7b-instruct` - Free Tier - Intfloat 4096-dimension embeddings, 32K tokens
- `provider-5/qwen3-embedding-8b` - Free Tier - Qwen 4096-dimension embeddings, 32K tokens
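The changelog mentions these embedding models as options for RAG applications. A minimal sketch of the retrieval step is below; it assumes you have already fetched embedding vectors (e.g. the 4096-dimension vectors listed for `provider-5/qwen3-embedding-8b`), since the request format for obtaining them is platform-specific and not documented here. Function names are illustrative.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec: list[float], doc_vecs: list[list[float]]) -> list[int]:
    """Return document indices sorted by similarity to the query, best first."""
    scores = [cosine_similarity(query_vec, v) for v in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
```

The same ranking logic works for any of the four embedding models; only the vector dimension (2048 for `bge-multilingual-gemma2`, 4096 for the others) differs.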
Image Generation Models (2 models) - All Tiers
- `provider-5/flux-dev` - Free Tier - Black Forest Labs development model
  - Max Images: 1/2/4/4 per request (Free/Basic/Pro/Ultra)
- `provider-5/flux-schnell` - Free Tier - Black Forest Labs fast generation
  - Max Images: 1/2/4/4 per request (Free/Basic/Pro/Ultra)
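A hypothetical client-side pre-flight check for the per-request image limits above (1/2/4/4 for Free/Basic/Pro/Ultra on both Flux models); the dict and function names are illustrative, not part of the platform API:

```python
# Per-tier image caps for provider-5/flux-dev and provider-5/flux-schnell,
# taken from the limits stated in this changelog.
MAX_IMAGES_PER_REQUEST = {"free": 1, "basic": 2, "pro": 4, "ultra": 4}

def clamp_image_count(requested: int, tier: str) -> int:
    """Cap a requested image count at the tier's per-request maximum."""
    return min(requested, MAX_IMAGES_PER_REQUEST[tier])
```

For example, a Free-tier request for 4 images would be clamped to 1 before being sent.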
Provider 8 - Tier Optimization & Context Window Standardization
Infrastructure Update: Provider 8 models now follow standardized context window allocation with the 50%/70%/100%/100% distribution rule for free/basic/pro/ultra tiers. Strategic tier restrictions applied for resource optimization.
Free Tier Expansions (2 models)
- `provider-8/mimo-v2-flash` - Free Tier - Previously Basic Tier, now accessible to all users
  - Context: 128K (Free) / 179K (Basic) / 256K (Pro/Ultra)
- `provider-8/qwen3-next-80b-a3b-instruct` - Free Tier - Previously Basic Tier, 80B MoE model now free
  - Context: 66K (Free) / 92K (Basic) / 131K (Pro/Ultra)
Ultra Tier Restrictions (7 models)
Chat/Completion Models:
- `provider-8/claude-sonnet-4.5` - Ultra Tier Only - Anthropic's flagship now exclusively Ultra tier
  - 200K context window for Ultra subscribers
- `provider-8/gemini-3-pro` - Ultra Tier Only - Google's Gemini 3 Pro now Ultra exclusive
  - 1M context window for maximum capability
- `provider-8/grok-4.1-fast-non-reasoning` - Ultra Tier Only - xAI's fast Grok now Ultra exclusive
  - 2M context window for enterprise workloads

Image Generation Models:

- `provider-8/nano-banana-pro` - Ultra Tier Only - Premium image model now Ultra exclusive
  - 4 images per request for Ultra tier
- `provider-8/seedream-4.5` - Ultra Tier Only - Doubao Seedream 4.5 now Ultra exclusive
  - 4 images per request for Ultra tier
- `provider-8/gpt-image-1.5` - Ultra Tier Only - OpenAI GPT Image 1.5 now Ultra exclusive
  - 4 images per request for Ultra tier
Pro Tier Restrictions (3 models)
- `provider-8/gemini-3-flash` - Pro Tier - Google's Gemini 3 Flash now Pro+ only
  - 1M context window for Pro/Ultra tiers
- `provider-8/glm-4.7` - Pro Tier - Zhipu AI GLM 4.7 now Pro+ only
  - 200K context window
- `provider-8/glm-4.7-thinking` - Pro Tier - GLM 4.7 thinking variant now Pro+ only
  - 200K context window
Basic Tier Restrictions (5 models)
- `provider-8/glm-4.6` - Basic Tier - Removed from Free tier
  - Context: 179K (Basic) / 256K (Pro/Ultra)
- `provider-8/glm-4.6-thinking` - Basic Tier - Removed from Free tier
  - Context: 140K (Basic) / 200K (Pro/Ultra)
- `provider-8/glm-4.6v-thinking` - Basic Tier - Vision-thinking variant removed from Free tier
  - Context: 140K (Basic) / 200K (Pro/Ultra)
- `provider-8/flux-2-pro` - Basic Tier - Removed from Free tier
  - Max Images: 2/4/4 (Basic/Pro/Ultra)
- `provider-8/flux-2-flex` - Basic Tier - Removed from Free tier
  - Max Images: 2/4/4 (Basic/Pro/Ultra)
Context Window Adjustments (15 models)
The following models received context window standardization without tier changes:
- `provider-8/gpt-oss-120b` - Context: 64K/90K/128K/128K
- `provider-8/gpt-oss-20b` - Context: 66K/92K/131K/131K
- `provider-8/gemini-2.0-flash` - Context: 500K/700K/1M/1M
- `provider-8/kimi-k2-0905` - Context: 128K/179K/256K/256K
- `provider-8/kimi-k2` - Context: 32K/44K/63K/63K
- `provider-8/char` - Context: 64K/90K/128K/128K
- `provider-8/kimi-k2-thinking` - Context: 32K/44K/63K/63K
- `provider-8/deepseek-v3` - Context: 82K/115K/164K/164K
- `provider-8/deepseek-terminus` - Context: 64K/90K/128K/128K
- `provider-8/seed-rp` - Context: 64K/90K/128K/128K
- `provider-8/llama-4-scout` - Context: 524K/734K/1M/1M
- `provider-8/llama-4-maverick` - Context: 524K/734K/1M/1M
- `provider-8/glm-4.5-air` - Context: 64K/90K/128K/128K
- `provider-8/glm-4.5` - Context: 256K/358K/512K/512K
- `provider-8/qwen3-235b` - Context: 20K/29K/41K/41K
Context Window Distribution Rule
Standardized Allocation: All Provider 5 and Provider 8 models now follow consistent context window distribution across tiers:
- Free Tier: 50% of maximum context window
- Basic Tier: 70% of maximum context window
- Pro Tier: 100% of maximum context window
- Ultra Tier: 100% of maximum context window
This standardization ensures predictable behavior across all models while optimizing resource allocation for platform sustainability.
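The allocation rule above can be sketched in a few lines. Note that the published figures appear to round to the nearest thousand tokens (e.g. a 131K maximum yields 66K on the Free tier), so exact platform values may differ slightly from this illustration; the names below are illustrative, not part of any API.

```python
# Per-tier share of the maximum context window, per the
# 50%/70%/100%/100% distribution rule.
TIER_SHARE = {"free": 0.50, "basic": 0.70, "pro": 1.00, "ultra": 1.00}

def effective_context(max_context: int, tier: str) -> int:
    """Effective context window for a tier, given the model's maximum."""
    return round(max_context * TIER_SHARE[tier])
```

For instance, a model with a 256K maximum would allow roughly 128K on Free and 179K on Basic, matching the figures listed for `provider-8/mimo-v2-flash`.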
Platform Impact
- New Provider Infrastructure: The new Provider 5 integration brings 41 high-quality models with excellent uptime and performance characteristics
- Free Tier Enhancement: 22 chat models and 2 image generation models now available on Free tier through Provider 5
- Embedding Capabilities: 4 new embedding models provide comprehensive text embedding options for RAG applications
- Premium Model Curation: Strategic tier restrictions for Gemini 3, Claude 4.5, and Grok 4.1 ensure sustainable access for premium subscribers
- Multi-Provider Redundancy: Many removed models now available through the new Provider 5 infrastructure with enhanced stability
Important Notes
- Provider 5 Migration: Users of previous Provider 5 models should migrate to the new Provider 5 or alternative providers. Model IDs have changed.
- Context Window Changes: Free and Basic tier users will experience reduced context windows following the 50%/70% allocation rule. Plan usage accordingly.
- Premium Model Access: Claude Sonnet 4.5, Gemini 3 Pro, and Grok 4.1 are now Ultra-exclusive. Use Provider 7 for Claude Sonnet 4.5 with broader tier access.
- Image Generation Limits: Premium image models (Nano Banana Pro, Seedream 4.5, GPT Image 1.5) restricted to Ultra tier. Flux models in Provider 5 offer free alternatives.
- Embedding Models: All 4 new embedding models are available across all tiers with competitive pricing.
- GLM Series: GLM 4.7 and thinking variants now require Pro tier. GLM 4.6 series requires Basic tier minimum.
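Because the legacy Provider 5 model IDs no longer resolve, clients may want to reject them before sending a request. A minimal guard is sketched below; only a small subset of the 82 retired IDs is shown (the full list appears in the retirement section above), and the function name is illustrative.

```python
# Subset of retired legacy Provider 5 model IDs, taken from the
# retirement section of this changelog.
RETIRED_PROVIDER_5_MODELS = {
    "provider-5/gpt-4o",
    "provider-5/gpt-4o-mini",
    "provider-5/dall-e-3",
    "provider-5/whisper-1",
    "provider-5/tts-1",
}

def check_model_id(model_id: str) -> str:
    """Raise if the model ID belongs to the retired legacy Provider 5 set."""
    if model_id in RETIRED_PROVIDER_5_MODELS:
        raise ValueError(
            f"{model_id} was retired; choose a new Provider 5 model "
            "or an alternative provider"
        )
    return model_id
```

Running such a check at request-build time surfaces migration problems immediately instead of as provider-side errors.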