ChatGPT's 1st birthday
(This is an English translation of an article originally written in Dutch at Smals Research. Links in the text may refer to resources in Dutch.)
On Nov. 30, 2022, ChatGPT saw the light of day. The generic chatbot that talks about virtually everything was an instant hit with a broad audience. Image generators such as DALL-E and Stable Diffusion only added to the hype, and by now it is clear: generative AI is here to stay. At Smals Research we do tech watch, so we started exploring immediately and wrote an article about our first findings [in Dutch] a mere 10 days after the launch.
One year later, the impact of ChatGPT can surely be called transformative. The ecosystem around generative AI is booming. What 3 years ago existed only in the realm of unrealistic dreams suddenly became feasible and could be in production today. Countless start-ups are popping up: the counter on the website There's an AI for that has now passed the 10,000 mark, with March 2023 alone accounting for 1,209 new AI companies. By comparison, for all of 2021 there are only 288 start-ups in that same database.
The impact and the speed at which everything evolves also cause nervousness. Education, for example, must hurriedly adapt to a new reality. The average school student now has easy access to technology that can write an entire essay in seconds - and when South Park devotes an episode to it, you know it's a thing. Closer to home, many universities have established guidelines on its use (e.g., Leuven, Ghent, Antwerp). These are generally thoughtfully crafted and well-suited to inspire similar guidelines in companies and governments.
Plenty of experimentation occurs in professional contexts. "No idea, but ask ChatGPT" has become an often-heard phrase when fresh input is sought. A poll in Nature showed that quite a few scientists have already explored the technology for their work-related obligations. Academics openly wonder whether having to submit phonebook-size grant proposals still makes sense if their writing can be automated. Similar observations are no doubt made in other sectors as well.
Looking back
OpenAI continues to lead the dance, and their own blog gives a good overview of the developments of the past year. A timeline with some key moments:
That last move was not unanimously welcomed: many start-ups in the generative AI ecosystem have built their business around the concept of Retrieval-Augmented Generation (RAG), and OpenAI now directly competes with their custom GPT-powered applications. (Some claim that this played a role in the CEO soap opera two weeks later, but those rumors have not been proven.)
Over the past year, Retrieval-Augmented Generation (RAG), with langchain as the most popular library for developers, has become the default way to hook up Large Language Models to specific, internal or recent information. The idea follows from the fact that the prompt - this is the instruction given to the language model - can now be so long that there is enough spare room to add entire pages of additional information. By enriching the prompt with, for example, the results of a search query or the latest news items, a chatbot can formulate answers based on recent information or content from specific databases, without the underlying language model needing to have been trained on it.
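The prompt-enrichment idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how langchain or any particular product implements it: the retrieval step is a naive word-overlap ranking (real systems typically use a search engine or vector embeddings), the documents and the names `retrieve` and `build_prompt` are hypothetical, and the call to an actual language model is deliberately left out.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=2):
    """Rank documents by how many words they share with the question.

    Assumption: word overlap stands in for a real search engine or
    embedding-based similarity search.
    """
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents, top_k=2):
    """Enrich the prompt with the retrieved passages before the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, top_k))
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )


# Hypothetical internal documents the language model was never trained on.
DOCS = [
    "The cafeteria opens at 8:00 and closes at 15:00.",
    "Badge requests are handled by the facilities desk on floor 2.",
    "The annual report is published every March.",
]

prompt = build_prompt("When does the cafeteria open?", DOCS, top_k=1)
print(prompt)
```

The resulting string is what would be sent to the language model. Because the retrieved passages are known, they can also be shown to the user as source references.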
Microsoft wasted no time putting this into practice. With Bing Chat, they launched a new conversational search interface that uses Bing Search results for its answers. The advantage is that source citations or references can be transparently added. However, we must remain alert that this still does not guarantee correctness: search results may be irrelevant, and their summarization may be incorrect or incomplete. The product was a hit and Microsoft is now going all-in: in the meantime, Bing Chat has been rebranded as Microsoft Copilot, it has been integrated into the Edge browser, and it has become available in Windows 11 and Microsoft 365 (the former Office). To use this functionality, the user must allow the app to read opened documents or webpage content and share it with the Copilot service.
Microsoft is also ahead of the pack when it comes to images: The Bing Image Creator offers (for the time being) free access to the DALL-E 3 generator, and its results are seamlessly integrated into the new Microsoft Designer.
Google has had much less success with its competing product Bard. Things went wrong during the introduction when Bard responded with an incorrect fact (a hallucination), and Google's stock price took a hit. Compared to OpenAI and Microsoft, Google seems to be less concerned with integration and user experience, and rather focuses on the theoretical background and deepening of the technological possibilities.
Meta can't stay behind, of course, and is playing the quasi-open-source card with its own Llama language models, emphasis on quasi. They seem to focus strongly on individual AI developers, for whom they want to make it easy to reuse or retrain their language models, which Stanford, for example, did with its Alpaca variant. Deployment on average hardware is within reach as well, via the successful llama.cpp library (a community project built around the Llama models), which allows a model to be quantized: cleverly rounding the parameters of a trained model for a smaller memory footprint, at the cost of a small but acceptable loss of precision. Roughly speaking, a quantization from 32 bits to 8 bits makes a model of 13 billion parameters require not 52GB but only 13GB of RAM. It thus fits fully into the memory of today's graphics cards with 16GB or 24GB of VRAM. User-friendly tools to host quantized models on your own computer are GPT4All, MLC or ollama.
Several smaller companies are also still in the race and are developing their own language models that can serve as a backend for ChatGPT-like services or RAG applications. Worth mentioning is Anthropic, founded by ex-OpenAI employees, which mainly wants to emphasize transparency and security with its Claude models. In addition, the European company Mistral got off to a flying start. With ex-Meta employees at the helm, it uses a truly open source model and is therefore mainly in competition with Meta. For the Dutch-speaking world, there are plans to develop an independent GPT-NL through research organization TNO, which should reduce reliance on major American players, offer a better cultural fit and focus on respecting EU laws.
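The memory arithmetic above, and the rounding idea behind it, can be checked with a short Python sketch. It rests on loud assumptions: the size estimate counts only the parameters themselves (no activations or other runtime overhead), and `quantize_int8` is a hypothetical, simplified symmetric rounding scheme with one scale for the whole list. It is not the block-wise format that llama.cpp actually uses.

```python
def model_size_gb(n_params, bits_per_param):
    """Memory needed for the parameters alone, in gigabytes (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using a single shared scale.

    Assumption: one scale for all weights; real tools use finer-grained
    schemes to limit the loss of precision.
    """
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid a zero scale
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]


N_PARAMS = 13_000_000_000
size_fp32 = model_size_gb(N_PARAMS, 32)  # 32-bit floats
size_int8 = model_size_gb(N_PARAMS, 8)   # 8-bit integers
print(f"32-bit: {size_fp32:.0f} GB, 8-bit: {size_int8:.0f} GB")

# Hypothetical weights, to show the rounding error that quantization introduces.
weights = [0.5, -1.27, 0.003, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
errors = [abs(w - r) for w, r in zip(weights, restored)]
print(quantized, max(errors))
```

Each restored weight differs from the original by at most half a quantization step (`scale / 2`), which is the "small but acceptable loss of precision" traded for a footprint that is four times smaller.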
Looking ahead
Many companies have yet to begin the exercise of adapting their own business models to the rise of AI. However, we are still in the early stages of (generative) AI growth, and it is difficult to predict what the next few years will bring. Staying informed of developments is a first step. An excellent newsletter that keeps its finger on the pulse is The Batch from DeepLearning.AI, which provides a concise summary of the most important events in the sector every week. Those who want to broaden and deepen their knowledge will also find a solid course offering on the same website, such as this one: Generative AI for Everyone. Microsoft has developed a Generative AI for Beginners course aimed at software engineers. Finland's Elements Of AI has one of the most accessible free courses for a general audience, and there are many others.
Today, business leaders and policymakers are already confronted with all these new developments in the workplace or in administration, and want to formulate a response to them - or at least develop a code of conduct. The policy monitor [in Dutch] of the Flemish Data and Society Knowledge Center collects examples at home and abroad that can serve as inspiration for those who do not want to reinvent the wheel. The British AI Standards Hub collects relevant publications on industry standards related to AI. At a more abstract international level, the OECD, among others, is actively monitoring developments. The rapidly changing landscape certainly does not make legislative work any easier. At the European level, the expected AI Act is certainly being delayed. This did not prevent Stanford from evaluating the existing major players based on the current draft text.
In the meantime, nothing stops a company or government from getting started and experimenting with the technology. It remains important not to rush and not to throw all common sense overboard. ChatGPT is not a miracle solution. No amount of safety mechanisms can guarantee that a language model won't hallucinate or present completely made-up texts as facts. The training datasets of many language models are well-kept secrets, but we do know that they are so large that meticulous selection and filtering cannot possibly have been applied to them. Today's LLMs are black boxes; the origin of a certain word choice in an answer cannot be traced. This makes them unsuitable as a foundation on which to indiscriminately build critical applications. It is obvious that it would be a bad idea to have medical data processed, without any supervision, by a language model that has been partly trained on randomly picked texts from conspiracy theorists, anti-vaxxers, Instagram influencers, homeopaths and other quacks roaming the internet.
Speaking of quackery, doomsayers are now appearing regularly, who, in their bids for attention, are coming up with increasingly grotesque statements, even predicting human extinction by AI. (The reverse also exists: people who believe that utopia is near.) The disproportionality of such statements indicates both a lack of subject matter knowledge and a disconnect with reality. After all, despite all the progress, we are still miles away from the point where a robot can even iron your clothes. Clownish claims do harm: they distract from the necessary discussions about problems we do see emerging today in the real world: abuse such as deepfakes, widening of the digital divide, a lack of options to challenge automated decisions, the use of data without permission or attribution, etc. All serious matters that require our continued attention, and where Europe, to its credit, plays an active and pioneering role. These issues don't necessarily have to be innovation blockers, because even with both feet on the ground there is no shortage of opportunities.
In a recent opinion piece, Bill Gates received quite some support for his vision of the evolution in the near future: chatbots will continue to evolve into "agents", i.e. they will also be given (limited) autonomy to take action, supervised if need be. Where Copilots are still part of an application, agents would become more generic, and in the future even accomplish tasks across applications, much like human personal assistants. Admittedly, there is still a lot of work to be done before that happens: protocols that allow apps to communicate better with each other, ways to exchange data safely while protecting privacy, ...
Finally, the dependence on massive and non-transparent models (GPT-3, GPT-4) running on an external cloud service remains a bitter pill to swallow when internal or sensitive data might be processed. It is unpredictable what data a typical user might provide to a chatbot, and with Copilot-like plugins it is often difficult to assess what data the plugin collects and sends to the cloud behind the scenes. However, the GDPR does impose strict and concrete requirements. Promises in terms and conditions, or even contractual agreements that data received will not be stored or reused, are not enough to give everyone blind faith.
The obvious alternative is to deploy smaller models locally. However, the quality of their output is correspondingly lower, leading to disappointment if the high expectations created by ChatGPT are the reference point. Smaller models lack the smooth multilingualism of ChatGPT, and work with much more compact prompts, which limits the development of RAG apps on top of them. Efforts to close that gap continue in various directions. For example, there is further innovation in ways to efficiently fine-tune (specialize) smaller models for a specific task. The concept of distillation - compressing a model - also looks promising. There are indications that better results can be achieved by training on a small amount of high-quality, correct data, instead of on lots of messy data containing errors. In developing each application, it remains a challenge to find a good balance in terms of model choice, prompt engineering, fine-tuning, and RAG.
The playing field is still full of possibilities and there is plenty of room for innovation. We can certainly expect further improvements in the short term. Here's to undoubtedly another very interesting year ahead!