Gemini 3 Stability Fix, Grok 4 Series & Platform Refinements
Critical stability improvements for Gemini 3.0 and Gemini 2.5 Pro restore full functionality and add beta thinking-token support. The Grok 4 series debuts on the Ultra tier alongside a GPT-5.1 expansion across the Pro and Ultra tiers. This update also streamlines the catalog, removing 32 open-source models from Provider 2, and resolves image generation timeout issues.
Provider 2
Model Additions:
- `provider-2/gemini-2.5-flash-lite` - Efficient lightweight variant for high-throughput applications
  - Optimized for speed and latency-sensitive workloads
Open Source Image Models Removed (18 models):
- `provider-2/dreamlike-photoreal-2.0` - Dreamlike Art
- `provider-2/animagine-xl` - Cagliostro Research Lab
- `provider-2/animagine-xl-2.0` - Cagliostro Research Lab
- `provider-2/juggernaut-x-hyper` - RunDiffusion
- `provider-2/sdxl-flash` - Stability AI SD-based
- `provider-2/sdxxxl` - Stability AI SD-based
- `provider-2/playground-v2.5` - Playground AI
- `provider-2/dreamshaper` - Lykon
- `provider-2/stable-diffusion-v1-5` - Stability AI
- `provider-2/opencole-sdxl` - OpenCole
- `provider-2/babes-by-stable-yogi-xl` - Stable Yogi
- `provider-2/realism-by-stable-yogi` - Stable Yogi
- `provider-2/realism-illustrious` - XpucT
- `provider-2/proteus-v0.2` - DataAutomation
- `provider-2/nsfw-gen` - Corcelio
- `provider-2/kivotos-xl` - Kivotos
- `provider-2/fluently-xl` - Fluently AI
- `provider-2/realvis-xl` - SG161222
Open Source Chat Models Removed (14 models):
- Meta: `provider-2/llama-3.2-11b-vision-instruct`
- Microsoft: `provider-2/phi-3-medium-128k-instruct`, `provider-2/dialogpt-medium`, `provider-2/dialogpt-large`
- Mistral: `provider-2/codestral-2508`, `provider-2/mistral-medium-2505`, `provider-2/mistral-nemo-instruct-2407`
- DeepSeek: `provider-2/deepseek-r1-distill-qwen-1.5b`, `provider-2/deepseek-math-7b-instruct`
- Google: `provider-2/gemma-3-4b-it`
- Qwen: `provider-2/qwen2.5-coder-14b-instruct`, `provider-2/qwen3-4b-instruct-2507`, `provider-2/qwen3-1.7b`
- NVIDIA: `provider-2/llama-3.1-nemotron-8b-ultralong-1m-instruct`
Stability Fixes:
- Gemini 3.0 & Gemini 2.5 Pro: Fixed critical issues causing model failures
  - `provider-2/gemini-3-pro-preview` - Now fully operational
  - `provider-2/gemini-2.5-pro` - Stability restored
- Thinking Tokens (Beta): Added beta support for thinking tokens
  - Tokens may occasionally not display during generation
  - Model responses remain fully functional
  - Complete stability will be addressed in the next update
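Because thinking tokens are in beta and may be missing from a response even when the answer itself is intact, client code should treat them as optional. A minimal sketch, assuming a response message shaped like an OpenAI-style dict where the hypothetical `thinking` field carries the thinking tokens (the field name is an assumption for illustration, not a documented contract):

```python
def split_thinking(message: dict) -> tuple[str, str]:
    """Return (thinking, answer) from a response message dict.

    The "thinking" key is an assumed field name for illustration. During
    the beta, thinking tokens may be absent, so we fall back to an empty
    string instead of raising, keeping the answer usable either way.
    """
    thinking = message.get("thinking") or ""
    answer = message.get("content") or ""
    return thinking, answer

# Thinking tokens present:
t, a = split_thinking({"thinking": "step 1...", "content": "42"})
# Thinking tokens missing (known beta display issue) -- answer still usable:
t2, a2 = split_thinking({"content": "42"})
```

The point of the fallback is that the known beta issue is cosmetic: the answer content remains fully functional even when the thinking stream does not display.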
- Image Generation Timeout: Increased from 60 seconds to 600 seconds, resolving stuck requests and timeout failures
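Since the server now allows image generation requests to run for up to 600 seconds, clients with a short read timeout will abort before the server does. A minimal sketch of matching the client timeout to the new limit; the endpoint path and payload fields follow the common OpenAI-style image API and the base URL is a placeholder:

```python
# Server-side limit from this update; give the client a little headroom on top.
IMAGE_GENERATION_TIMEOUT_S = 600

def image_request_kwargs(prompt: str, model: str) -> dict:
    """Build keyword arguments suitable for e.g. requests.post(**kwargs).

    URL and payload shape are illustrative assumptions, not documented values.
    """
    return {
        "url": "https://api.example.com/v1/images/generations",  # placeholder
        "json": {"model": model, "prompt": prompt},
        "timeout": IMAGE_GENERATION_TIMEOUT_S + 30,  # outlast the server limit
    }
```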
Provider 3
Models Added:
- Pro Tier:
  - `provider-3/gpt-5.1` - OpenAI's flagship model with a 400K context window
    - Advanced reasoning, vision, and function calling support
  - `provider-3/gpt-5.1-chat` - Low-latency instant variant with a 272K context window
    - Optimized for rapid conversational responses
    - Full vision and function calling support
- Ultra Tier Exclusive:
  - `provider-3/grok-4-fast` - xAI's cost-efficient reasoning model with a 2M token context window
    - Vision and function calling capabilities
  - `provider-3/grok-4.1-fast` - High-throughput model with a 2M token context window
    - Optimized for parallel processing workloads
Models Removed:
- `provider-3/gpt-4o-mini-tts` - OpenAI Text-to-Speech
- `provider-3/tts-1` - OpenAI Text-to-Speech
- `provider-3/qwen-3-235b-a22b-thinking-2507` - Qwen Thinking variant
Provider 4
Models Added:
- `provider-4/gemini-2.5-flash-image` - Stable release (previously a preview version)
  - `provider-5/gemini-2.5-flash-image` now serves as the primary stable variant
- Rate Limit Adjustment: `max_images_per_request` reduced to 1 image across all tiers (Basic/Pro/Ultra) for this specific model to ensure stability
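With `max_images_per_request` capped at 1 for this model, a multi-image job has to be split into single-image requests rather than sent with `n > 1`. A minimal sketch, assuming an OpenAI-style image payload where `n` is the per-request image count:

```python
def plan_image_requests(prompt: str, count: int) -> list[dict]:
    """Split a multi-image job into single-image request payloads.

    max_images_per_request is 1 for gemini-2.5-flash-image on every tier,
    so a request with n > 1 would be rejected; issue `count` requests
    with n=1 instead. Payload shape is an illustrative assumption.
    """
    return [
        {"model": "provider-4/gemini-2.5-flash-image", "prompt": prompt, "n": 1}
        for _ in range(count)
    ]

requests_to_send = plan_image_requests("sunset over mountains", 3)
```

The resulting payloads can then be sent sequentially (or with modest concurrency, subject to tier rate limits).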
Provider 5
Models Added:
- `provider-5/gpt-5.1-codex-max` - Ultra Tier Exclusive
  - 400K token context window for massive codebases
  - Responses route only (`/v1/responses` endpoint)
  - Features: verbosity parameter, reasoning, function calling, vision
  - Performance: 3.0-8.0s latency, reflecting increased compute
  - Enhanced version of GPT-5.1 Codex with superior code generation capabilities
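Since this model is served on the Responses route only, requests must target `/v1/responses` rather than `/v1/chat/completions`. A minimal sketch of building such a request; the payload field names (`input`) and the verbosity value set (`low`/`medium`/`high`) follow the usual OpenAI-style conventions and are assumptions here, so check the Parameters documentation for the authoritative shape:

```python
def codex_max_request(prompt: str, verbosity: str = "medium") -> tuple[str, dict]:
    """Return (endpoint_path, payload) for provider-5/gpt-5.1-codex-max.

    The model is Responses-route only, hence the /v1/responses path.
    Verbosity values and the "input" field are illustrative assumptions.
    """
    if verbosity not in {"low", "medium", "high"}:
        raise ValueError(f"unsupported verbosity: {verbosity!r}")
    payload = {
        "model": "provider-5/gpt-5.1-codex-max",
        "input": prompt,
        "verbosity": verbosity,
    }
    return "/v1/responses", payload
```

Expect per-request latency in the 3.0-8.0s range, so pair this with a generous client timeout.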
Feature Enhancement:
- Verbosity Parameter Expansion: Now available across all GPT-5 and GPT-5.1 series models
- Supported models: GPT-5, GPT-5 Mini, GPT-5 Nano, GPT-5 Search, GPT-5.1, GPT-5.1 Chat, GPT-5.1 Codex, GPT-5.1 Codex Mini, GPT-5.1 Codex Max
- Enables fine-grained control over response detail levels
- Refer to Parameters documentation for usage details
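Since the verbosity parameter applies only to the GPT-5 and GPT-5.1 series, a client that talks to many models can gate the parameter on a supported-model set. A minimal sketch; the `provider-5/...` IDs are inferred from the display names above and the `low`/`medium`/`high` value set is an assumption, so verify both against the Parameters documentation:

```python
# Assumed model IDs for the supported set, derived from the listed names.
VERBOSITY_MODELS = {
    "provider-5/gpt-5", "provider-5/gpt-5-mini", "provider-5/gpt-5-nano",
    "provider-5/gpt-5-search", "provider-5/gpt-5.1", "provider-5/gpt-5.1-chat",
    "provider-5/gpt-5.1-codex", "provider-5/gpt-5.1-codex-mini",
    "provider-5/gpt-5.1-codex-max",
}

def with_verbosity(payload: dict, level: str) -> dict:
    """Attach the verbosity parameter only where the model supports it.

    Unsupported models (or unknown levels) get the payload back unchanged,
    so the same request-building path works for the whole catalog.
    """
    if payload.get("model") in VERBOSITY_MODELS and level in {"low", "medium", "high"}:
        return {**payload, "verbosity": level}
    return payload
```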
Stability Fixes:
- Image Generation Timeout: Increased from 60 seconds to 600 seconds, resolving stuck requests and timeout failures
Platform Impact
- Gemini Reliability: Gemini 3.0 and Gemini 2.5 Pro now fully operational with beta thinking token support
- Ultra Tier Expansion: Grok 4 series and GPT-5.1 Codex Max provide cutting-edge reasoning and coding capabilities
- Pro Tier Enhancement: GPT-5.1 and GPT-5.1 Chat bring flagship-level capabilities to Pro tier subscribers
- Catalog Optimization: Removal of 32 open-source models streamlines Provider 2 for production-focused usage
Important Notes
- Thinking Tokens (Beta): Gemini thinking tokens are in beta. Occasional display issues may occur but do not affect model output quality. Full stability coming in the next update.
- Provider 4 Rate Limits: Gemini 2.5 Flash Image in Provider 4 is limited to 1 image per request across all tiers for stability.
- Model Alternatives: Removed open-source models may be available through other providers. Check the Models page for alternatives.
- Verbosity Parameter: Now available on all GPT-5 and GPT-5.1 series models in Provider 5 for enhanced response control.