TOP FIVE DATA SCIENCE AND GENERATIVE AI TRENDS FOR 2024
Syed Shariq Muhammad
Technology Evangelist | Enterprise Client Executive Management| Growth Strategist| Solution Advisor
In 2023, generative AI (gen AI) entered the tech and cultural spotlight. Though the large language models (LLMs) that serve as the foundation for gen AI tools have been developing for years, chatbots like OpenAI’s ChatGPT, Google Bard, and Anthropic’s Claude emerged into public consciousness seemingly overnight. So it’s no surprise that enterprises immediately adjusted course to capitalize on the opportunity. ChatGPT debuted as a public beta in December 2022. By February 2023, it had attracted more than 100 million users—a record-setting adoption rate for a digital service.
Investment in gen AI tools is increasing. For proof, look no further than the market capitalization of Nvidia. In May 2023, the maker of AI-ready “Superchips” soared past $1 trillion. The AI Index 2023 Annual Report by Stanford University estimated that total mergers and acquisitions, minority stake and private investments, and public offerings in artificial intelligence in 2022 were $189.6 billion. 2023 numbers are still being calculated. Along with these milestones came the increasing recognition that data is the fuel that makes generative AI work—and that comprehensive data governance and efficient data management can determine the success or failure of an enterprise AI project.
Over the next 12 months, the industry will likely see even broader adoption of such tools alongside more substantial investment, as innovations continue to emerge at breakneck speed. At the same time, organizations will take concrete steps to simplify their data operations, break down organizational silos, streamline governance policies, and develop smaller and more focused LLM applications. Our research leads us to highlight five emerging trends that organizations can use to inform their data strategies and accelerate their AI efforts:
1. Organizations are prioritizing getting their data house in order
2. Vector and feature stores will begin to live side-by-side to manage AI data
3. General-purpose chatbots will be supplanted by task-tuned counterparts
4. A new breed of AI apps will emerge from LLMs
5. Open source will continue to be a catalyst for innovation
TREND 1: ORGANIZATIONS ARE PRIORITIZING GETTING THEIR DATA HOUSE IN ORDER
Many organizations still contend with siloed, poorly governed data estates. That will likely change in 2024 as data volume continues to grow at an exponential rate. With unstructured data expected to account for more than 80% of that volume, enterprises will be hungry for efficient ways to consolidate disparate data types in a centralized repository that can be shared across the organization, while also allowing them to implement uniform data governance policies.
In 2024, we expect to see more enterprises moving to cloud-based data lakes, allowing them to store data more cost-effectively and to access on-demand GPU compute infrastructure for training AI models.
TREND 2: VECTOR AND FEATURE STORES WILL BEGIN TO LIVE SIDE-BY-SIDE TO MANAGE AI DATA
Consolidating data in the cloud is both essential for a uniform data governance strategy and key for preparing data to train machine learning (ML) models. And while training AI models is invariably a time and resource-intensive endeavor, recent innovations have helped streamline the process.
Vector databases help with tasks such as helping AI-powered chatbots understand the relationships between words in a sentence, enabling e-commerce sites to suggest related products based on previous purchases, and allowing streaming sites to recommend new shows based on similar viewing activity. Vector-based search also allows LLMs to respond much faster to queries, requiring fewer computing resources and making them more cost-effective.
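The recommendation use case above comes down to nearest-neighbor search over embedding vectors. A minimal sketch, using hand-made toy vectors in place of real embeddings from an embedding model, and plain cosine similarity in place of a production vector database:

```python
import numpy as np

def cosine_similarity(query, matrix):
    """Cosine similarity between a query vector and each row of a matrix."""
    q = query / np.linalg.norm(query)
    m = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return m @ q

# Toy catalog: each item is represented by an embedding vector.
# In practice these come from an embedding model; here they are hand-made.
items = ["sci-fi thriller", "space opera", "romantic comedy"]
item_vectors = np.array([
    [0.9, 0.1, 0.0],   # sci-fi thriller
    [0.8, 0.2, 0.1],   # space opera (close to the thriller)
    [0.1, 0.9, 0.2],   # romantic comedy (far from both)
])

# A user who just watched the sci-fi thriller gets the most similar other item.
query = item_vectors[0]
scores = cosine_similarity(query, item_vectors)
ranked = [items[i] for i in np.argsort(scores)[::-1] if i != 0]
print(ranked[0])  # → space opera
```

Production systems replace the brute-force matrix product with approximate nearest-neighbor indexes so the lookup stays fast across millions of vectors, but the ranking logic is the same.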
TREND 3: GENERAL-PURPOSE CHATBOTS WILL GIVE WAY TO MORE TASK-TUNED LANGUAGE MODELS
The stunning adoption rate of OpenAI’s ChatGPT and the rapid introduction of other LLM-powered chatbots, such as Microsoft Copilot, Google Gemini (formerly known as Bard) and Anthropic’s Claude, suggested a future in which any question a user might ask could be answered.
Google Trends data tells the story. Interest in LLM apps specific to healthcare, finance and manufacturing rose drastically, with average daily Google searches increasing by 160%–614% from November 2022 to November 2023. The biggest areas of interest were the use of AI in government (daily searches for “AI in government” up 480%) and advertising (daily searches for “AI in advertising” up 614%).
TREND 4: A NEW BREED OF AI APPS WILL EMERGE FROM LLMS
Nearly 18,000 developers worked on close to 30,000 apps (a figure that includes apps still in development).
• Insurance firms can feed claim documents such as car service reports into custom LLMs, and then use the data to identify incidents of potential fraud, measure trends in auto maintenance and rate the performance of their repair service partners.
• Financial advisors can obtain next-best-action insights from data derived from LLMs trained on each customer’s market and portfolio data. They’ll also be able to summarize news and current events in ways that may impact individual investment strategies.
• Retailers can employ gen AI apps to analyze customer sentiment and make data-driven decisions around marketing campaigns. These apps will help them quickly determine which products receive the most positive or negative feedback, factor in information such as geographical location and seasonal weather patterns, and use those insights to personalize the online shopping experience.
TREND 5: OPEN-SOURCE TOOLS WILL CONTINUE TO DRIVE INNOVATION
It’s simple logic. Open-source tools offer plenty of room for customization, and they give organizations more control over their data, especially when it comes to LLMs. Because they’re not locked into a single framework or proprietary algorithms, they’re far more flexible and can work seamlessly with a wide range of tools. And because they are backed by a strong community of users and developers, open-source tools steadily improve.
Over the coming year, we expect to see a huge increase in the breadth and depth of open-source LLMs coming online. Established players like Llama 2, Mistral and Falcon are likely to play a significant role in the creation of generative AI apps, as enterprises move away from public-facing models like OpenAI’s GPT models and develop their own, more industry- or company-centric apps. We’re also likely to see wider use of open-source techniques like low-rank adaptation (LoRA), which trains much smaller matrices layered on top of a larger LLM, allowing users to fine-tune the model without retraining it in its entirety. This will allow for greater efficiency and enable companies without sophisticated data science capabilities to adopt LLMs for their businesses more quickly. Explore IBM’s resources to learn more and build the use cases for your enterprise.
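The efficiency argument behind LoRA can be seen in a few lines of linear algebra. A minimal sketch with NumPy, using a single weight matrix as a stand-in for one layer of an LLM (dimensions chosen purely for illustration): the frozen weights W stay untouched, and only two small matrices A and B are trained, so the adapted layer computes W·x + B·(A·x).

```python
import numpy as np

rng = np.random.default_rng(0)

# A frozen pretrained weight matrix (stand-in for one layer of an LLM).
d, k, r = 64, 64, 4            # layer dimensions, with the low rank r << d
W = rng.normal(size=(d, k))    # never updated during fine-tuning

# LoRA trains only A (r x k) and B (d x r).
A = rng.normal(size=(r, k)) * 0.01
B = np.zeros((d, r))           # B starts at zero, so training begins from W

def adapted_forward(x):
    """Adapted layer: frozen weights plus the low-rank update."""
    return W @ x + B @ (A @ x)

x = rng.normal(size=k)
# With B = 0, the adapter is a no-op: output equals the frozen layer's output.
assert np.allclose(adapted_forward(x), W @ x)

# Parameter savings: train r*(d+k) numbers instead of d*k.
print(r * (d + k), "trainable vs", d * k, "frozen")  # → 512 trainable vs 4096 frozen
```

At realistic LLM scale the savings are dramatic: with r of 8 or 16 against hidden dimensions in the thousands, the trainable parameters are a small fraction of a percent of the full model, which is why LoRA fine-tuning fits on far more modest hardware than full retraining.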