OpenAI Unveils GPT-4o (Omni) – An AI Model For Everyone

What Makes This Model Stand Out?
• OpenAI claims that the new model is 2x faster and 50% cheaper to use than GPT-4 Turbo, which is great news for API developers
• It can respond to audio inputs in as little as 232 milliseconds, which is similar to human response time in a conversation
• GPT-4o also offers better multilingual abilities across its audio and video capabilities
• It will be available to everyone, including free users, unlike GPT-4, which requires a paid subscription

Real-Time Conversations
• Live demos showed GPT-4o conversing naturally with a person
• It can detect a person’s tone and adjust its speaking style accordingly
• It can be interrupted at any time, just like in a real conversation
• It can also change how it speaks on request, for example by sounding more dramatic or switching to a robotic voice

Better Vision Capabilities
• You can ask GPT-4o questions about photos and screenshots, much like with the Ray-Ban Meta smart glasses
• It can even explain a block of code or translate text just by looking at it, which opens up many potential use cases
• GPT-4o’s vision capabilities have been improved for 50 different languages, allowing for widespread use
• It is even better at recognizing text in complex fonts, which is great for OCR

Makes Readable Text In AI Art
• GPT-4o can finally produce legible text in AI-generated images, presumably using a version of DALL-E
• It can also arrange text creatively to suit the picture being created
• GPT-4o can also emulate handwriting. With the right prompt, it can create images that are hard to tell apart from what a real person would write

Availability & Limitations
• GPT-4o will be available to Plus and Team users at first, with access gradually rolling out to free-tier users
• Plus users will get up to five times the message limit of free users
• The voice feature, which will be released in June, will initially be available to Plus users as an early alpha


