v16.1.0

Audio & Image Generation Expansion

Sree, Author

This update expands creative and multimodal capabilities with 10 new image generation models, audio processing features (transcription, text-to-speech, and multimodal audio chat), and web search integration for real-time information access.

New Models

  • Provider 4 Image Generation Models: 5
  • Provider 5 Image Generation Models: 5
  • Web Search Integration Models: 6
  • Audio Transcription Models: 4
  • Text-to-Speech Models: 4
  • Multimodal Audio Chat Models: 9

Model Capabilities

  • Image Generation: 10 new models across Provider 4 and Provider 5 from Black Forest Labs, Leonardo AI, Stability AI, HiDream, and Qwen.
  • Web Search Integration: Real-time information access through GPT models with vision and function calling support.
  • Audio Transcription: Speech-to-text with Whisper and GPT-4o models, including speaker diarization.
  • Audio Synthesis: High-quality text-to-speech with standard and HD variants, plus native audio processing in chat conversations.

Platform Enhancements

  • Provider 4 Expansion: Enhanced infrastructure with 5 new image generation models from leading AI providers.
  • Provider 5 Audio Suite: Comprehensive audio processing capabilities including transcription, synthesis, and multimodal chat.

Important Notes

  • Tier Access: A model available in one tier is automatically available in all higher tiers. The Free tier includes 4 image models, the Basic tier adds 4 more, and the Pro tier provides complete access to all 10.
  • Audio Features: Most audio capabilities require at least the Basic tier. Advanced features such as HD synthesis and diarization are available only on the Pro tier.
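
The tier rules above can be sketched in a few lines of Python. This is only an illustration of the cumulative-access logic, not the platform's actual implementation: the `Tier` enum, the feature keys, and both helper functions are hypothetical names. The Pro-tier count of 2 is inferred from the 10 image models in total, and placing standard transcription and text-to-speech at Basic is an assumption based on "most audio capabilities".

```python
from enum import IntEnum


class Tier(IntEnum):
    """Subscription tiers, ordered so that a higher value means more access."""
    FREE = 0
    BASIC = 1
    PRO = 2


# Image models newly unlocked at each tier, per the notes above.
# The Pro figure (2) is inferred: 10 total minus 4 (Free) minus 4 (Basic).
NEW_IMAGE_MODELS_AT_TIER = {Tier.FREE: 4, Tier.BASIC: 4, Tier.PRO: 2}

# Minimum tier per audio feature. The keys are hypothetical labels, and the
# Basic entries are an assumption ("most audio capabilities").
AUDIO_FEATURE_MIN_TIER = {
    "transcription": Tier.BASIC,
    "text_to_speech": Tier.BASIC,
    "hd_text_to_speech": Tier.PRO,
    "diarization": Tier.PRO,
}


def image_models_available(tier: Tier) -> int:
    """Count image models usable at `tier`, including those from lower tiers."""
    return sum(n for t, n in NEW_IMAGE_MODELS_AT_TIER.items() if t <= tier)


def can_use(feature: str, tier: Tier) -> bool:
    """Return True if `tier` meets the minimum tier for `feature`."""
    return tier >= AUDIO_FEATURE_MIN_TIER[feature]
```

For example, `image_models_available(Tier.BASIC)` evaluates to 8, and `can_use("diarization", Tier.BASIC)` is `False` because diarization is a Pro-tier feature.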