On the generative wave (Part 1)
This is an excerpt of a longer essay I published earlier today for members of Exponential View.
There has been an explosion of services in the field of generative AI. These systems, typically using large language models at their core, are expensive to build and train but much cheaper to operate.
I want to historicise this trend. Back in July 2020, I wrote that large language models (I referred to them as transformers back then), such as GPT-3, are “capable of synthesising the information and presenting it in a near usable form.” They represented an improvement over previous knowledge technologies because they would present synthesised information rather than a list of disparate search results.
In my simplistic framing, transformers were about synthesis. I studiously avoided defining what I meant by synthesis, but 30 months later it’s time to refine that model. I reckon I was trying to suggest that “synthesis” meant responses to a query drawn from many different sources, as we were seeing with GPT-3 over text. But what I missed was the power of these large models. I didn’t anticipate how quickly they would become multimodal, across text, images and audio; how they might be capable of de novo synthesis; or how verticalisation would make them more powerful within specific domains.
Let’s go over these.
Search
I’ve been playing around with two “search” style services: Metaphor, a general search engine, and Elicit, which is aimed at academic research. Metaphor is a bit weird. I haven’t been able to write good queries for it, but I have found it helpful in surfacing useful results, even on topics I know something about. (See this search on “technology transitions”. Log in required.)
Elicit is really impressive. It searches academic papers, providing summary abstracts as well as structured analyses of each paper. For example, it tries to identify the outcomes analysed in a paper or the authors’ conflicts of interest, and it makes it easy to track citations. (See a similar search on “technology transitions”. Log in required.)
But I have nerdy research needs for my work.
For my common everyday search terms, like “helium network stats”, “water flosser reviews”, “best alternative to nest”, “put a VPN on a separate SSID at home” and “canada population pyramid”, Google still works pretty well. I haven’t quite been able to figure out how to use Metaphor to replace my Google searching.
It feels a bit like the Dvorak keyboard: better than QWERTY, but QWERTY had the lock-in, and it is QWERTY we still use today. Metaphor may be better than Google, but I can’t yet grok it.
My sense is that Elicit’s focus on a specific use case is the more promising approach.
Cross-domain
While GPT-3 showed text generation capabilities, we’re still getting used to cross-modal tools.
Text-to-image is now commonplace. But Google and others have already shown off text-to-video: type in a prompt and get a short movie.
Generative approaches are now finding their way into molecular design. Researchers at the University of Illinois have prototyped a system that translates between molecular structure and natural language. One could imagine a system able to generate molecules that match very specific requirements. (“Give me a molecule that is translucent in its solid form and smells of mint.”)
A Singaporean research group has demonstrated decent results for “thought-to-image”, skipping the pesky “text” stage. Outputs are better quality than state-of-the-art text-to-image from six years ago.
Large-scale models are being used for generating music. This may be of particular interest to musicians who today layer loops, beats and fragments together for their final compositions.
This is an excerpt from “On the generative wave (Part 1)”, an essay I published for members of Exponential View earlier today. Read the full essay here.