AI image generators – creating (mostly) realistic pictures from text
Is there anything artificial intelligence can’t do? It seems like every month, there’s a mesmerising new technological breakthrough. And the latest AI trend creates a painting or an image from a sentence you type.
Yes, this is the one you’ve been waiting for. Text-to-image uses AI to understand your words and convert them to a unique image each time. Like magic. This can be used to generate art, or for general silliness. Although to be fair, with many apps/generators you can’t expect the quality to be too photorealistic (yet).
Another thing to keep in mind is that, although text-to-image generators certainly have fantastic creative potential, the output of these systems can be racist, sexist, or toxic in some other way.
Why is that, we hear you ask? A lot of it comes down to how these generators are currently trained.
These systems need a hell of a lot of data, and most researchers, even those working for tech giants like Google, have decided that it’s too time-consuming to comprehensively filter this input. So instead, the AI is trained on huge quantities of data scraped from the web, and as a consequence learns to reproduce all the hateful content you’d expect to find online.
There’s also an argument that the current results are being cherry-picked to show the systems in the best light. Both Google & OpenAI have declined to release code or a public demo that would allow researchers to put them through their paces. Part of the reason for this is a fear that AI could be used to create misleading images, or simply that it could generate harmful results.
There’s a bunch of options to choose from, but in this article we focus on the three most noteworthy ones (in our opinion), namely: Google’s Imagen, TikTok’s AI Greenscreen and OpenAI’s DALL-E 2. And even though two of them are not available to the public today, we’re sure they will be as soon as it is safe to release them into the wild.
Google’s Imagen
Earlier this year Google announced its Imagen AI tool, which takes simple sentences and turns them into photorealistic images. It’s scary-cool how good the pictures look, and you can actually try a small (pre-defined) demo on their website. Needless to say, the internet is already in favour of Imagen’s end results. It also has a super-resolution model that allows the generation of 1024×1024 images (for reference, the original DALL-E produced images with a 256×256 resolution).
According to Google AI lead Jeff Dean, AI systems like Imagen “can unlock joint human/computer creativity”. Worrying? Maybe. Interesting? Definitely.
TikTok’s AI Greenscreen
The video platform has quietly launched its own AI image generator, an effect called “AI Greenscreen”, which lets users type in a text prompt that the software then turns into a rather abstract, swirling image. This image can then be used as the background to a video.
Perhaps intentionally, the output of TikTok’s system is pretty basic compared to that of Imagen or DALL-E 2. Regardless, the appearance of AI generated images in the world’s hottest app serves as proof that text-to-image AI systems are definitely gaining in terms of both ability and popularity.
DALL-E 2
Until recently DALL-E, a program created in January 2021 by commercial AI lab OpenAI (and named after the artist Salvador Dalí and the robot WALL-E), was the leader in the field. Its latest version, DALL-E 2, was only released back in April but is already rivalled by Google’s Imagen (launched in May).
However, one of the features that set DALL-E 2 apart from other solutions is “inpainting”. It applies DALL-E’s text-to-image capabilities on a more granular level and allows users to start with an existing picture, select an area, and tell the system to edit it.
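For the curious, the "select an area and edit it" idea can be sketched in a few lines of Python. To be clear, this is not OpenAI's code or API, and it says nothing about how DALL-E 2 actually produces the new pixels. It only illustrates the masking step, under the assumption that an image is a small grid of numbers and that the "generated" content already exists. The names `inpaint`, `photo`, `new_pixels` and `selection` are made up for this example.

```python
# Illustrative sketch only: NOT OpenAI's code or API.
# Images are tiny grids of numbers standing in for pixels.

def inpaint(original, generated, mask):
    """Return a new grid: masked cells come from `generated`, the rest from `original`."""
    if not (len(original) == len(generated) == len(mask)):
        raise ValueError("all three grids must have the same number of rows")
    result = []
    for orig_row, gen_row, mask_row in zip(original, generated, mask):
        if not (len(orig_row) == len(gen_row) == len(mask_row)):
            raise ValueError("all three grids must have the same row length")
        # Take the generated value where the mask is set, otherwise keep the original.
        result.append([g if m else o for o, g, m in zip(orig_row, gen_row, mask_row)])
    return result


photo = [[1, 1, 1],
         [1, 1, 1]]        # the existing picture
new_pixels = [[9, 9, 9],
              [9, 9, 9]]   # stand-in for whatever the model generates
selection = [[0, 1, 1],
             [0, 0, 0]]    # 1 = area the user selected for editing

edited = inpaint(photo, new_pixels, selection)
print(edited)  # [[1, 9, 9], [1, 1, 1]]
```

Only the selected cells change; everything outside the selection is left exactly as it was, which is what makes the edit feel "granular".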
Another cool feature is “variations”, which is basically an image search tool for pictures that don’t exist. Users can upload a starting image and then create a range of variations similar to it. They can also blend two images, generating pictures that have elements of both. What a time to be alive…
What will the future hold?
It’s been less than two years since the release of DALL-E and the tech is already in the hands of millions via TikTok. So, given the potential of these systems for both harm and good, things are probably going to get stranger from here on in. Only time will tell.
What we do know is that, since launching Imagen, Google has created a new benchmark called DrawBench (and obviously claimed that its own tool produces consistently better images than DALL-E 2, based on said benchmark).
And although it isn’t a particularly complex metric, it is likely to be used by the industry in the future when comparing the accuracy & quality of AI generated images.
Some also predict that human illustrators and stock photographers will soon be out of a job because of AI generated imagery. In reality though, today’s limitations with these AI solutions mean it will probably be a while before they can be used by the general public.
In the meantime, if you need the help of our human illustrators, let’s chat!