Gemini 3 Stability Fix, Grok 4 Series & Platform Refinements
Critical stability improvements for Gemini 3.0 and Gemini 2.5 Pro restore full functionality and add beta thinking-token support. The Grok 4 series debuts on the Ultra tier alongside a GPT-5.1 expansion across the Pro and Ultra tiers. This update also streamlines the catalog, removing 32 open-source models from Provider 2, and resolves image generation timeout issues.
Provider 2
Model Additions:
- `provider-2/gemini-2.5-flash-lite` - Efficient lightweight variant for high-throughput applications
  - Optimized for speed and latency-sensitive workloads
Open Source Image Models Removed (18 models):
- `provider-2/dreamlike-photoreal-2.0` - Dreamlike Art
- `provider-2/animagine-xl` - Cagliostro Research Lab
- `provider-2/animagine-xl-2.0` - Cagliostro Research Lab
- `provider-2/juggernaut-x-hyper` - RunDiffusion
- `provider-2/sdxl-flash` - Stability AI SD-based
- `provider-2/sdxxxl` - Stability AI SD-based
- `provider-2/playground-v2.5` - Playground AI
- `provider-2/dreamshaper` - Lykon
- `provider-2/stable-diffusion-v1-5` - Stability AI
- `provider-2/opencole-sdxl` - OpenCole
- `provider-2/babes-by-stable-yogi-xl` - Stable Yogi
- `provider-2/realism-by-stable-yogi` - Stable Yogi
- `provider-2/realism-illustrious` - XpucT
- `provider-2/proteus-v0.2` - DataAutomation
- `provider-2/nsfw-gen` - Corcelio
- `provider-2/kivotos-xl` - Kivotos
- `provider-2/fluently-xl` - Fluently AI
- `provider-2/realvis-xl` - SG161222
Open Source Chat Models Removed (14 models):
- Meta: `provider-2/llama-3.2-11b-vision-instruct`
- Microsoft: `provider-2/phi-3-medium-128k-instruct`, `provider-2/dialogpt-medium`, `provider-2/dialogpt-large`
- Mistral: `provider-2/codestral-2508`, `provider-2/mistral-medium-2505`, `provider-2/mistral-nemo-instruct-2407`
- DeepSeek: `provider-2/deepseek-r1-distill-qwen-1.5b`, `provider-2/deepseek-math-7b-instruct`
- Google: `provider-2/gemma-3-4b-it`
- Qwen: `provider-2/qwen2.5-coder-14b-instruct`, `provider-2/qwen3-4b-instruct-2507`, `provider-2/qwen3-1.7b`
- NVIDIA: `provider-2/llama-3.1-nemotron-8b-ultralong-1m-instruct`
Stability Fixes:
- Gemini 3.0 & Gemini 2.5 Pro: Fixed critical issues causing model failures
  - `provider-2/gemini-3-pro-preview` - Now fully operational
  - `provider-2/gemini-2.5-pro` - Stability restored
- Thinking Tokens (Beta): Added beta support for thinking tokens
  - Tokens may occasionally not display during generation
  - Model responses remain fully functional
  - Complete stability will be addressed in the next update
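Because thinking tokens are in beta and may be missing from a response even when the answer itself is intact, client code should treat them as optional. A minimal sketch, assuming a response message shaped like an OpenAI-style dict where the hypothetical `thinking` field carries the thinking tokens (the field name is an assumption for illustration, not a documented contract):

```python
def split_thinking(message: dict) -> tuple[str, str]:
    """Return (thinking, answer) from a response message dict.

    The "thinking" key is an assumed field name for illustration. During
    the beta, thinking tokens may be absent, so we fall back to an empty
    string instead of raising, keeping the answer usable either way.
    """
    thinking = message.get("thinking") or ""
    answer = message.get("content") or ""
    return thinking, answer

# Thinking tokens present:
t, a = split_thinking({"thinking": "step 1...", "content": "42"})
# Thinking tokens missing (known beta display issue) -- answer still usable:
t2, a2 = split_thinking({"content": "42"})
```

The point of the fallback is that the known beta issue is cosmetic: the answer content remains fully functional even when the thinking stream does not display.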
- Image Generation Timeout: Increased from 60 seconds to 600 seconds, resolving stuck requests and timeout failures
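Since the server now allows image generation requests to run for up to 600 seconds, clients with a short read timeout will abort before the server does. A minimal sketch of matching the client timeout to the new limit; the endpoint path and payload fields follow the common OpenAI-style image API and the base URL is a placeholder:

```python
# Server-side limit from this update; give the client a little headroom on top.
IMAGE_GENERATION_TIMEOUT_S = 600

def image_request_kwargs(prompt: str, model: str) -> dict:
    """Build keyword arguments suitable for e.g. requests.post(**kwargs).

    URL and payload shape are illustrative assumptions, not documented values.
    """
    return {
        "url": "https://api.example.com/v1/images/generations",  # placeholder
        "json": {"model": model, "prompt": prompt},
        "timeout": IMAGE_GENERATION_TIMEOUT_S + 30,  # outlast the server limit
    }
```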
Provider 3
Models Added:
- Pro Tier:
  - `provider-3/gpt-5.1` - OpenAI's flagship model with a 400K context window
    - Advanced reasoning, vision, and function calling support
  - `provider-3/gpt-5.1-chat` - Low-latency instant variant with a 272K context window
    - Optimized for rapid conversational responses
    - Full vision and function calling support
- Ultra Tier Exclusive:
  - `provider-3/grok-4-fast` - xAI's cost-efficient reasoning model with a 2M token context window
    - Vision and function calling capabilities
  - `provider-3/grok-4.1-fast` - High-throughput model with a 2M token context window
    - Optimized for parallel processing workloads
Models Removed:
- `provider-3/gpt-4o-mini-tts` - OpenAI Text-to-Speech
- `provider-3/tts-1` - OpenAI Text-to-Speech
- `provider-3/qwen-3-235b-a22b-thinking-2507` - Qwen Thinking variant
Provider 4
Models Added:
- `provider-4/gemini-2.5-flash-image` - Stable release (previously a preview version)
  - `provider-5/gemini-2.5-flash-image` now serves as the primary stable variant
- Rate Limit Adjustment: `max_images_per_request` reduced to 1 image across all tiers (Basic/Pro/Ultra) for this specific model to ensure stability
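With `max_images_per_request` capped at 1 for this model, a multi-image job has to be split into single-image requests rather than sent with `n > 1`. A minimal sketch, assuming an OpenAI-style image payload where `n` is the per-request image count:

```python
def plan_image_requests(prompt: str, count: int) -> list[dict]:
    """Split a multi-image job into single-image request payloads.

    max_images_per_request is 1 for gemini-2.5-flash-image on every tier,
    so a request with n > 1 would be rejected; issue `count` requests
    with n=1 instead. Payload shape is an illustrative assumption.
    """
    return [
        {"model": "provider-4/gemini-2.5-flash-image", "prompt": prompt, "n": 1}
        for _ in range(count)
    ]

requests_to_send = plan_image_requests("sunset over mountains", 3)
```

The resulting payloads can then be sent sequentially (or with modest concurrency, subject to tier rate limits).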
Provider 5
Models Added:
- `provider-5/gpt-5.1-codex-max` - Ultra Tier Exclusive
  - 400K token context window for massive codebases
  - Responses route only (`/v1/responses` endpoint)
  - Features: verbosity parameter, reasoning, function calling, vision
  - Performance: 3.0-8.0s latency, reflecting increased compute
  - Enhanced version of GPT-5.1 Codex with superior code generation capabilities
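Since this model is served on the Responses route only, requests must target `/v1/responses` rather than `/v1/chat/completions`. A minimal sketch of building such a request; the payload field names (`input`) and the verbosity value set (`low`/`medium`/`high`) follow the usual OpenAI-style conventions and are assumptions here, so check the Parameters documentation for the authoritative shape:

```python
def codex_max_request(prompt: str, verbosity: str = "medium") -> tuple[str, dict]:
    """Return (endpoint_path, payload) for provider-5/gpt-5.1-codex-max.

    The model is Responses-route only, hence the /v1/responses path.
    Verbosity values and the "input" field are illustrative assumptions.
    """
    if verbosity not in {"low", "medium", "high"}:
        raise ValueError(f"unsupported verbosity: {verbosity!r}")
    payload = {
        "model": "provider-5/gpt-5.1-codex-max",
        "input": prompt,
        "verbosity": verbosity,
    }
    return "/v1/responses", payload
```

Expect per-request latency in the 3.0-8.0s range, so pair this with a generous client timeout.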
Feature Enhancement:
- Verbosity Parameter Expansion: Now available across all GPT-5 and GPT-5.1 series models
- Supported models: GPT-5, GPT-5 Mini, GPT-5 Nano, GPT-5 Search, GPT-5.1, GPT-5.1 Chat, GPT-5.1 Codex, GPT-5.1 Codex Mini, GPT-5.1 Codex Max
- Enables fine-grained control over response detail levels
- Refer to Parameters documentation for usage details
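Since the verbosity parameter applies only to the GPT-5 and GPT-5.1 series, a client that talks to many models can gate the parameter on a supported-model set. A minimal sketch; the `provider-5/...` IDs are inferred from the display names above and the `low`/`medium`/`high` value set is an assumption, so verify both against the Parameters documentation:

```python
# Assumed model IDs for the supported set, derived from the listed names.
VERBOSITY_MODELS = {
    "provider-5/gpt-5", "provider-5/gpt-5-mini", "provider-5/gpt-5-nano",
    "provider-5/gpt-5-search", "provider-5/gpt-5.1", "provider-5/gpt-5.1-chat",
    "provider-5/gpt-5.1-codex", "provider-5/gpt-5.1-codex-mini",
    "provider-5/gpt-5.1-codex-max",
}

def with_verbosity(payload: dict, level: str) -> dict:
    """Attach the verbosity parameter only where the model supports it.

    Unsupported models (or unknown levels) get the payload back unchanged,
    so the same request-building path works for the whole catalog.
    """
    if payload.get("model") in VERBOSITY_MODELS and level in {"low", "medium", "high"}:
        return {**payload, "verbosity": level}
    return payload
```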
Stability Fixes:
- Image Generation Timeout: Increased from 60 seconds to 600 seconds, resolving stuck requests and timeout failures
Platform Impact
- Gemini Reliability: Gemini 3.0 and Gemini 2.5 Pro now fully operational with beta thinking token support
- Ultra Tier Expansion: Grok 4 series and GPT-5.1 Codex Max provide cutting-edge reasoning and coding capabilities
- Pro Tier Enhancement: GPT-5.1 and GPT-5.1 Chat bring flagship-level capabilities to Pro tier subscribers
- Catalog Optimization: Removal of 32 open-source models streamlines Provider 2 for production-focused usage
Important Notes
- Thinking Tokens (Beta): Gemini thinking tokens are in beta. Occasional display issues may occur but do not affect model output quality. Full stability coming in the next update.
- Provider 4 Rate Limits: Gemini 2.5 Flash Image in Provider 4 is limited to 1 image per request across all tiers for stability.
- Model Alternatives: Removed open-source models may be available through other providers. Check the Models page for alternatives.
- Verbosity Parameter: Now available on all GPT-5 and GPT-5.1 series models in Provider 5 for enhanced response control.