data42 - unlocking value from clinical data
Peter Speyer
Improving health with data | Head of Data & Analytics @ Novartis Foundation | Former CDO @ IHME | Speaker | Advisor
Four years ago, we created Novartis’ seminal data42 program to deepen our understanding of diseases, medicines, and patients with insights from existing clinical trial data. Since then, we have linked Novartis’ clinical, omics and image data, harmonized data for findability and analysis, added pre-clinical and real world data, provided access at scale, and enabled hundreds of research projects, all while ensuring patient privacy and compliant handling of data. Our users have created new insights on topics ranging from natural history of disease to biomarkers to patient subgroups to polygenic risk scores to external control arms, all with the aim of bringing more medicines to more patients, faster. As I transition from data42 to an exciting new role in the Novartis Foundation, I wanted to share my view on four key success factors for a program which tackles complex questions with large, diverse, and sensitive data.
In data42, we have built an entirely new platform to enable innovative governance, data pipelines, scalability, and flexibility. However, the success factors apply generally to complex, data-driven projects, from comprehensive curation of data in a data lake or data mesh to analytical use cases.
Define your project – and value story – carefully
With many business and scientific questions and a lot of data, as well as much hype around AI, it is tempting to jump to action too quickly. This can turn into a Sisyphean data curation effort or insights that are not actionable, so it is essential to carefully define the problem you are aiming to solve. Make sure you engage the right stakeholders (including users!) in the discussion, from senior leaders to technical experts in the relevant functions. Engage them early and continue engaging them systematically throughout the project.
This is a highly iterative process where additional data, new analytic methods, or cutting-edge tools can enable entirely new solutions. Once you have the project defined, capture your ambitions succinctly. Objectives & Key Results (OKRs) are a pragmatic approach that helps you be both aspirational and specific about deliverables (see John Doerr’s excellent book “Measure What Matters”). OKRs should include key results for value generation, but also for the specific steps by which you turn data into evidence and evidence into value. And remember that you may have limited control over the actual value generation, as your users will use – or even generate – the evidence and turn it into value. As you review your overall progress, you may need to go back to the drawing board, re-invent the project, and pivot.
Side note on our program name: in Douglas Adams’ comedy science fiction franchise “The Hitchhiker’s Guide to the Galaxy”, supercomputer Deep Thought is tasked with answering the "ultimate question of life, the universe, and everything". The answer is perplexing: 42!? As the computer wisely adds: if you don’t understand the answer, you probably didn’t understand the question in the first place. This holds true for use cases requiring a clear scientific question, as well as for the project overall which needs a clear project definition.
Understand your audience and obsess about your users
The use of your data will be multi-faceted. Users may range from scientists formulating hypotheses to data scientists implementing analytics to senior executives using evidence for decision making. Therefore, it is essential to identify your key user groups and avoid building everything for all people. Prioritize users based on the likelihood of them adopting your platform and leveraging your data and evidence to generate value. Understanding how these (key) users work, use data, and collaborate with others is essential for a successful project.
In data42, we created a dedicated Customer Success team. The team ensured user centricity and user research, engaged and onboarded users across the organization, managed feedback in the program, and helped users accomplish their goals on our platform, from training to problem solving to data and analytic services to community engagement.
Be agile as you curate data and develop data, tools, and platform
Innovating with personal health data isn’t easy. Data and analytic approaches can be complex, the scale of data can be staggering, and trying to implement a final, fully scaled solution immediately is likely to fail. Implementing data curation and products in an incremental, iterative fashion is more likely to help you develop the best solution.
As you build your project in an agile fashion, make sure you communicate your successes and value stories systematically to users and other stakeholders. This will encourage your project team(s) and users to push the envelope further, and it will help you secure additional funding to take your project to the next level.
Collaborate, collaborate, collaborate
Unlocking value from complex data is a team sport.
It was terrific to be part of the leadership team bringing an ambitious idea to fruition as an internal startup at Novartis. In entrepreneurial fashion, we pivoted several times as we balanced democratization of health data across Novartis (going broad) with comprehensively enabling complex research agendas in selected disease areas (going deep). I took on different roles as the program expanded and matured. We collaborated with world-class experts from science to data to analytics to technology across Novartis and beyond. But most importantly, we tackled these challenges together with a highly skilled, cross-functional and close-knit team passionate about improving health by accelerating pharma R&D with data and analytics.
In that spirit, let’s collaborate on improving health with data. I spent the last four years unlocking value from clinical trial (and related) data as Head of Products and Customer Success with our innovative data42 internal startup team at Novartis. I spent seven years enabling insights with health data from over 200 countries working as Chief Data & Technology Officer with the incredible team at IHME on the Global Burden of Disease study. And I am always happy to share insights and learn from you. I look forward to your comments, and feel free to ping me for a virtual or in-person discussion.
Global Head of IT
4 months ago
Great article, thanks for sharing! In our search for the 'Great Machine,' data science is driving pharma research with unprecedented ambition and speed towards faster new treatments.
Data Strategy Leader | Bioinformatics Expert | Driving Insights in Plant, Microbe & Infectious Disease Research | 20+ Years Experience
4 months ago
Thanks Peter for this very insightful article! We will be taking these lessons to heart as we build out our data strategy.
Data Scientist and Analyst | BI architect | Customer insight researcher | Power BI developer | Computational psychometrics expert | 10+ years AI Implementation | PhD
8 months ago
data42 is an impressive project and I was glad to hear about it. It’s great that you are overcoming both the organizational and legal hurdles as well as the inconsistency of the RWD. I once tried to put together a data lake for a large medical organization, but everything drowned in bureaucracy. It was also difficult to trust medical records because sometimes you have to “read between the lines.” It seems to me that today, when an LLM extracts structured data from the RWD, we can use this data with great confidence. And the models will be more robust. Good luck to data42!
Strategic Account Executive EMEA @ Dotmatics | Lab Informatics Specialist
1 year ago
Very insightful Peter! It's been a while since I caught up on Data42; I previously had some conversations about the initiative with several teams at Novartis back in 2020. You mention identifying 'data gaps' as a key piece of the puzzle. We found that only 12% of R&D scientist observations get captured straight in ELN - with 88% of info going on paper or staying in someone's head. Obviously, that can lead to a huge data gap which could make any subsequent AI/ML/Advanced Analytics pretty sub-optimal. Do you see this challenge at Novartis? How do you address this? Happy to connect and share some thoughts and ideas if you are interested.