The State of Generative Artificial Intelligence -- Written by AI!
Generative machine learning and AI are advancing at a staggering pace.
What was once a fringe field is now being taken quite seriously by the biggest tech companies and the best researchers.
As a result of this attention, we're continually seeing breakthroughs in areas like image, text, and music generation. And the pace of research and development is still accelerating.
In this post, I won’t dive into the details of any particular recent breakthrough, but will instead try to give a bird’s-eye view of the state of the art in generative AI, what we know about its future, and what it might mean for society.
So what can the best models actually do? What does state-of-the-art look like across domains?
Text Generation:
Progress here has been impressive. Over the past few years, we have gone from barely comprehensible output to models that can generate coherent, readable paragraphs.
Take a look at OpenAI's GPT-3 to see what’s possible.
Tools are being built with this tech, including HyperWrite.ai, the AI-powered writing assistant we’re building at OthersideAI.
Tools like HyperWrite weren’t possible just a couple of years ago, but times have changed.
In fact, this post was written with HyperWrite.ai!
Other companies use tech like GPT-3 to build products that can write code (DebuildHQ), generate marketing copy (Copy.ai), design apps (TricycleAI), and more.
This is just the beginning.
Image Generation:
Recent advances in technologies like GANs and Transformer networks have led to some impressive results in image generation. DALL-E, from OpenAI, is a great example.
These models are able to create beautiful and realistic images from scratch.
Often, these images are so convincing that they are hard to distinguish from real photos!
Take a look at these examples, generated by DALL-E:
While image generation hasn't seen as much commercial adoption as text generation (these models are still largely inaccessible to developers), over the next couple of years we can expect that to change.
As the underlying technology improves and these models become faster and easier to run, we will see more companies building tools and products that use them to generate images.
Imagine a world where, instead of searching Google for the perfect picture of a sunset, you could just generate it yourself by describing what you want to see.
Note -- the header image of this post was generated by VQGAN+CLIP, an image generation system.
Music Generation:
Generative models have also made huge strides in music. While music generation still lags behind text and image generation, we’re seeing some very exciting indicators that the best is yet to come.
We're not at human-level music generation yet, but what OpenAI achieved with Jukebox is quite promising.
Listen to these audio samples to hear what Jukebox can do: jukebox.openai.com
It's not great (yet), but it's a lot better than anything we've seen before.
In just a few years, you can expect AI models to generate music that is indistinguishable from that produced by human musicians.
As this technology advances, we can expect to see research labs combine these modalities for even more powerful results.
Imagine a model that can generate text, images, videos, music, and more -- and combine them in new ways to create compelling, new content.
These models will be able to link between domains, allowing for a new degree of creativity.
Wu Dao 2.0, created by the Beijing Academy of Artificial Intelligence and the largest known neural network at the time of writing, was trained on both images and text and can generate both from the same network.
We can expect to see more of these multimodal supermodels in the future.
Over the next decade, the state of the art in generative models will continue to improve, and we will see more and more game-changing commercial applications.
We will see AIs that can generate everything from text and images to video and 3D models.
That future is closer than you might think. And it's one that I'm excited to help create.
If you want to see what state-of-the-art text-generating AI can do, give hyperwrite.ai a try!