ChatGPT’s image-generation feature gets an upgrade


During a livestream on Tuesday, OpenAI CEO Sam Altman announced the first major upgrade to ChatGPT’s image-generation capabilities in more than a year.

ChatGPT can now use the company’s GPT-4o model to create and modify images and photos. GPT-4o has long powered the AI chatbot platform, but until now the model has only been able to generate and edit text, not images.

Altman said that GPT-4o native image generation is live today in ChatGPT and Sora, OpenAI’s AI video-generation product, for subscribers to the company’s $200-per-month Pro plan. OpenAI says the feature will soon roll out to Plus and free ChatGPT users, as well as to developers who use the company’s API service.

GPT-4o with image output “thinks” a little longer than the image-generation model it effectively replaces, DALL-E 3, to produce what OpenAI describes as more accurate and detailed images. GPT-4o can edit existing images, including images with people in them, transforming them or “inpainting” details such as foreground and background objects.

To power the new image feature, OpenAI told The Wall Street Journal that it trained GPT-4o on “publicly available data” as well as proprietary data from its partnerships with companies such as Shutterstock.

Many generative AI vendors see training data as a competitive advantage, so they keep it, and any information related to it, close to the chest. But details about training data are also a potential source of IP-related lawsuits, which gives companies another reason not to reveal much.

“We respect the rights of artists in terms of how we do the output, and we have policies that prevent us from generating images that directly mimic the work of living artists,” said Brad Lightcap, OpenAI’s chief operating officer, in a statement to the Journal.

OpenAI offers an opt-out form that lets creators request that their works be removed from its training datasets. The company also says it respects requests to block its web-scraping bots from collecting training data, including images, from websites.

ChatGPT’s upgraded image-generation feature follows on the heels of Google’s experimental native image output for Gemini 2.0 Flash, one of the company’s flagship models. The powerful feature went viral on social media, but not necessarily for the best reasons. Gemini 2.0 Flash’s image component turned out to have few guardrails, allowing people to remove watermarks and create images depicting copyrighted characters.

This article was updated at 12:00 with OpenAI’s statement to The Wall Street Journal about GPT-4o’s training details.
