On the morning of May 14, on the eve of the AI product announcements expected at the Google I/O developer conference, OpenAI once again stole the limelight.
In the early hours of May 14, Beijing time, OpenAI used a short video presentation to unveil GPT-4o, a new-generation AI model that can reason across audio, vision and text in real time, and announced that it will launch a desktop version of ChatGPT.
According to reports, the “o” in GPT-4o stands for “omni”, meaning “all-round”. In the API, GPT-4o is 50% cheaper and twice as fast as GPT-4 Turbo, which was released last November, and GPT-4o’s voice and video input will be rolled out in the coming weeks.
In addition, OpenAI announced that all the capabilities of GPT-4o, along with features previously reserved for ChatGPT Plus members, including vision, web browsing, memory, code execution, the GPT Store, and more, will be available to all users for free.
At the event, Greg Brockman, co-founder and president of OpenAI, also demonstrated a five-minute conversation between the old ChatGPT, which has only conversational features, and the new version of ChatGPT, which is based on GPT-4o and has visual capabilities.
In the conversation, the new version of ChatGPT not only understood what was happening in front of the camera through its visual AI capabilities, but also described that content by voice to the old version of ChatGPT, making for a richer and more interesting interaction. It also supports being interrupted mid-response and having new dialogue inserted, and it can remember context.
In response to the announcement, OpenAI CEO Sam Altman tweeted, “The new GPT-4o model is the best OpenAI has ever had. It’s smart, fast, natively multimodal, and available to all ChatGPT users, whether on the free version or the paid GPT-4 version.”
“This is important to our mission, and we want to put great AI tools in everyone’s hands,” Sam Altman said.
Ahead of the press conference, foreign media had reported rumors that OpenAI would launch a new intelligent search capability, but later reversed course, saying that “the news of a search product release was put out to catch internal leakers at the company.”
At this press conference, OpenAI did not bring a new AI search product; after that feint, it released the upgraded GPT-4o instead. Beyond the performance gains of the new model itself, the release brings a number of product improvements, but it also has some problems.
After the OpenAI press conference, an industry expert said, “GPT-4o’s multimodal capabilities only look good. In fact, OpenAI has not demonstrated a real breakthrough in visual multimodal features.” In addition, in terms of real-time audio interaction, current Chinese domestic products such as Doubao and Wenxin Yiyan already offer similar voice call functions.