Thinking about deploying AI in your R&D? Here are 10 tips from the experts
Elsevier for Life Sciences
Innovate and prioritize faster and safer with actionable data and AI
The rise of AI, and, more recently, large language models (LLMs), has scientists and researchers working in R&D asking a lot of questions about how these technologies might help accelerate innovation.
A recent webinar panel on the perils, pitfalls and promise of generative AI for R&D (the first of a four-part series called “AI in innovation: Unlocking R&D with data-driven AI”) was moderated by Elsevier’s Commercial Director for Corporate Markets Zen Jelenje and included Elsevier’s VP of Data Science Life Sciences Mark Sheehan, as well as two experts from subsidiary SciBite: Director of Data Science & Professional Services Joe Mullen and Head of Ontologies Jane Lomax.
With Elsevier’s history of providing enriched and curated scientific data in AI-driven solutions such as Reaxys and Embase, the discussion focused on the questions our scientists, data scientists and computational chemists get from customers around AI and LLMs.
Here are our top 10 takeaways from the panel:
#1: Get your data in order
It’s easy to get distracted by all the noise and hype around LLMs, particularly ChatGPT. But to take advantage of any AI technology you need to start with your data. “Your data need to be well organized, well-structured and FAIR – meeting the principles of Findability, Accessibility, Interoperability and Reusability,” says Joe. “Only then will you be ready and flexible enough to quickly and seamlessly latch onto the best solution for the problem you want to solve.”
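As a loose illustration of that advice, here is a minimal FAIR-readiness sketch: it flags dataset records missing the basic metadata each principle relies on. The field names and sample records are hypothetical, not a standard schema.

```python
# Minimal FAIR-readiness check: flag records missing the metadata
# each principle relies on. Field names here are illustrative.
REQUIRED = {
    "id": "Findable: a persistent identifier",
    "access_url": "Accessible: a retrieval location",
    "format": "Interoperable: a declared data format",
    "license": "Reusable: explicit usage terms",
}

def fair_gaps(record: dict) -> list[str]:
    """Return the FAIR principles a record fails to support."""
    return [why for field, why in REQUIRED.items() if not record.get(field)]

records = [
    {"id": "DS-001", "access_url": "https://example.org/ds1",
     "format": "CSV", "license": "CC-BY-4.0"},
    {"id": "DS-002", "format": "CSV"},  # missing access_url and license
]

for rec in records:
    gaps = fair_gaps(rec)
    print(rec["id"], "OK" if not gaps else gaps)
```

In practice such checks run against a real metadata catalogue rather than inline dictionaries, but the principle is the same: audit the data before choosing the AI solution.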
#2: Don’t rush to a “solution” – start by asking, “What’s the specific problem I want to solve?”
“You’ve got to remain focused on identifying what the problems are, and only then look at the ever-evolving solutions to solve those problems,” says Joe.
“Instead of thinking of it as whether to invest in AI,” adds Zen, “you need to ask the question, ‘How does this improve my research?’”
#3: Don’t consider LLMs as an all-in solution – especially for Life Sciences
At the end of the day, scientific progress is built on provenance, transparency and reproducibility. And LLMs such as ChatGPT are simply not built for that – for now anyway. Currently, much of Elsevier’s work is built on ontologies. “These use language to create a model of a domain,” says Jane. “It's a codification of what humans understand about a particular domain – facts as we now understand them. And I think that's always going to be something that's necessary and useful.”
“LLMs, on the other hand, are probabilistic models that are really powerful at generating and understanding human language. They’re amazing and we use them internally,” says Jane. But unfortunately, LLMs also hallucinate and the information is not properly sourced. So, in the longer-term, many hope “to have an LLM with an ontology-based factual backbone – and then you’ll have something truly powerful,” says Jane.
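One way to picture that “ontology-based factual backbone” is a simple post-hoc grounding check: before an LLM-extracted statement is accepted, look up its subject–relation–object triple in a curated fact set. The toy ontology and statements below are invented for illustration, not drawn from any real SciBite or Elsevier resource.

```python
# Toy sketch: validate LLM-extracted statements against a curated
# set of ontology-backed facts. The facts and statements are invented.
ONTOLOGY_FACTS = {
    ("TP53", "is_a", "tumor suppressor gene"),
    ("aspirin", "inhibits", "COX-1"),
}

def grounded(statement: tuple[str, str, str]) -> bool:
    """Accept a (subject, relation, object) triple only if the
    curated fact set backs it; otherwise flag it for review."""
    return statement in ONTOLOGY_FACTS

llm_output = [
    ("aspirin", "inhibits", "COX-1"),       # supported by the ontology
    ("aspirin", "inhibits", "telomerase"),  # possible hallucination
]
for triple in llm_output:
    label = "grounded" if grounded(triple) else "needs review"
    print(triple, "->", label)
```

A production system would use fuzzy matching and synonym resolution against a real ontology service, but even this crude filter shows how curated facts can catch fluent-but-unsourced model output.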
“I also think that LLMs can bring value to one of our main aims at SciBite,” says Joe. “And that’s supporting data democratization – improving access to and interpretation of data. But LLMs won’t be able to supply this by themselves due to their limitations.”
#4: Don’t underestimate scaling
“One piece of advice: don't underestimate the difficulty in being able to scale these types of technologies to production,” says Jane. “When we started with this three years ago, we ended up having to take a step back and first build the infrastructure and invest in the skills. We learned a lot through that process, but it was quite a learning curve. So, if you're investing in this, don't overlook this. Come chat with us.”
#5: Think operationalization
“New technology brings new holistic cost considerations,” says Joe. “There are costs associated with rolling out some of these larger models: monetary costs, time costs, disk and carbon footprint costs, and so on and so forth.”
#6: Get your hands dirty
“I read a McKinsey report the other day about whether you want to be a taker, a shaper, or a maker in the AI space,” says Mark. “Are you going to wait until it’s fully cooked? Nothing wrong with that. And it can depend on the industry or your company’s appetite for risk and investment.” But for Elsevier, the road was clear: jump in now.
“It’s important to acknowledge there will be bumps on the road on that digital transformation journey. There will be mistakes and there will be failures. But it's also incredibly rewarding when you get it right. You need to learn from your mistakes, pick yourself up, and move forward.”
#7: Think modularity
“Our enrichment pipelines continue to become more automated and feature more of the latest AI technologies as we iterate,” says Mark. “And certainly, it's not the case that as soon as a new technology comes in, we throw out what we had before. It works well that we have a mix of rule-based technologies and machine learning technologies. And now we're exploring the latest Gen AI technologies. These can all be complementary.”
“We always try to find a way to integrate all these different pipelines, datasets and capabilities into what my team calls a Lego set,” says Mark. “It's a great way to approach things in a modular and flexible way without getting too obsessed about the latest or greatest technologies.”
#8: Stay on top of what’s happening
It might be simpler to wait for others to fail and then adopt, but you do risk being left behind – and losing any competitive edge. “Around ten years ago, AI was beating humans at Space Invaders. Around five years ago, AI got better at Go. Just a few weeks ago, AI started beating humans in real-time drone racing. AI is evolving at such a pace, you need to keep yourself skilled up and aware of what's going on around you,” says Joe. “And again, this is about getting your hands dirty. Reading a few articles and blogs isn’t enough. But it’s a difficult balance: keeping on top of things without getting sucked in, while just trying to identify those problems you want to solve.”
#9: Keep humans in the loop
Subject Matter Experts (SMEs) remain essential to validate the output of any AI algorithm – and more so when it comes to LLMs. For instance, these SMEs can be deployed as prompt engineers who ask the LLMs the right questions, so the resulting output is easier to validate.
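A minimal sketch of that human-in-the-loop pattern: an SME-written prompt template constrains the model, and every answer is routed through an explicit reviewer decision before it is accepted. `call_llm`, the template wording, and the `Draft` structure are all hypothetical stand-ins, not any particular product's API.

```python
# Human-in-the-loop sketch: SME-authored prompt template plus a
# mandatory review step. `call_llm` is a placeholder for a real
# model API.
from dataclasses import dataclass

TEMPLATE = (
    "List only approved indications for {drug}. "
    "Cite a source for each item; answer 'unknown' if unsure."
)

@dataclass
class Draft:
    prompt: str
    answer: str
    approved: bool = False

def call_llm(prompt: str) -> str:
    # Placeholder for a real model call.
    return "unknown"

def ask(drug: str) -> Draft:
    """Build the SME-designed prompt and capture the raw answer."""
    prompt = TEMPLATE.format(drug=drug)
    return Draft(prompt=prompt, answer=call_llm(prompt))

def sme_review(draft: Draft, accept: bool) -> Draft:
    # Nothing leaves the pipeline without an explicit SME decision.
    draft.approved = accept
    return draft

draft = sme_review(ask("aspirin"), accept=True)
print(draft.approved)
```

The design point is that `approved` defaults to `False`: an answer the reviewer never sees can never ship.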
#10: While waiting on regulatory decisions, aim to be responsible
The regulatory environment is in flux and likely to change fast, but in the meantime, you should aim to be responsible. “Regulations are all about governments coming in saying we need to manage this space because we're concerned about the future. But it could start with responsible AI, where the actual practitioners ask, ‘How can we be responsible and ethical about how we approach this?’ And at Elsevier, we’ve really tried to bake this into our daily work from the start with our Responsible AI principles.”
For more insights, watch the full webinar.
#GenerativeAI #Innovation #Research