How I Used Python and OpenAlex to Track the Evolution of Research Database Usage: Key Insights for Researchers and Decision-Makers.

In today's world, the amount of research output being generated is mind-blowing. With every year, more studies, patents, and academic works flood the landscape, making it harder and harder to track the impact of research. But there’s something crucial happening behind the scenes—a shift in how we use research databases.

For years, giants like Web of Science and Scopus have dominated the world of bibliometrics and scientometrics, helping researchers track citations and measure impact. But things are changing. New players are entering the field, bringing with them innovative tools and open-access solutions that are shaking things up. I wanted to understand exactly how database usage has evolved over the years, so I did what any data nerd would do—I dug in!


The Project: Leveraging OpenAlex and Python

Curious about how the usage of research databases has evolved over the past 25 years, I turned to OpenAlex, an open-access platform that has quickly become a treasure trove for researchers. Using OpenAlex's API, I downloaded data on publications related to bibliometric and scientometric studies, focusing specifically on which databases were mentioned most frequently in the research abstracts.
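For readers who want to try the download step, here is a minimal sketch of how a request to the OpenAlex works endpoint could be built and how an abstract could be turned back into plain text. OpenAlex returns abstracts as an inverted index (word to positions), so they need to be reassembled before any text search. The helper names `build_works_url` and `rebuild_abstract`, the search term, and the page size are my own illustrative choices here, not necessarily what the script on GitHub does.

```python
from urllib.parse import urlencode

OPENALEX_WORKS = "https://api.openalex.org/works"


def build_works_url(search, year_from, year_to, cursor="*", per_page=200, mailto=None):
    """Build a cursor-paginated OpenAlex works query (hypothetical helper)."""
    params = {
        "search": search,
        # Date filters bound the publication window.
        "filter": (
            f"from_publication_date:{year_from}-01-01,"
            f"to_publication_date:{year_to}-12-31"
        ),
        "per-page": per_page,
        # "*" starts cursor paging; later pages use the next_cursor value from meta.
        "cursor": cursor,
    }
    if mailto:
        # Optional contact address for the OpenAlex "polite pool".
        params["mailto"] = mailto
    return f"{OPENALEX_WORKS}?{urlencode(params)}"


def rebuild_abstract(inverted_index):
    """Turn an abstract_inverted_index mapping (word -> positions) into text."""
    if not inverted_index:
        return ""
    positions = {}
    for word, indexes in inverted_index.items():
        for i in indexes:
            positions[i] = word
    return " ".join(positions[i] for i in sorted(positions))


if __name__ == "__main__":
    print(build_works_url("bibliometric", 1995, 2024))
    print(rebuild_abstract({"Scopus": [2], "We": [0], "searched": [1]}))
```

Fetching each URL (for example with `urllib.request` or `requests`) and following the cursor returned in the response metadata until it is empty would then page through the full result set.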

But the raw data alone isn’t enough. So, I wrote a Python script to preprocess, clean, and extract the databases from the abstracts. My code also generates some awesome visualizations to show how the use of these databases has evolved from 1995 to 2024, broken down into 5-year increments. (Spoiler alert: The trends are fascinating!)
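To make the extraction step concrete, here is a rough sketch of how database names could be detected in abstracts and grouped into 5-year periods. The pattern list, the `period_label` and `count_mentions` helpers, and the choice to count each abstract at most once per database are assumptions for illustration; the actual script may match more databases and count differently.

```python
import re
from collections import Counter

# Illustrative patterns only; a real run would need a longer, tuned list.
DATABASE_PATTERNS = {
    "Web of Science": r"\bweb of science\b|\bwos\b",
    "Scopus": r"\bscopus\b",
    "PubMed/MEDLINE": r"\bpubmed\b|\bmedline\b",
    "Google Scholar": r"\bgoogle scholar\b",
    "OpenAlex": r"\bopenalex\b",
}
COMPILED = {
    name: re.compile(pattern, re.IGNORECASE)
    for name, pattern in DATABASE_PATTERNS.items()
}


def period_label(year, start=1995, width=5):
    """Map a year to its 5-year bucket, e.g. 1998 -> '1995-1999'."""
    low = start + ((year - start) // width) * width
    return f"{low}-{low + width - 1}"


def count_mentions(records):
    """Count abstracts mentioning each database, keyed by (period, database).

    `records` is an iterable of (publication_year, abstract_text) pairs.
    Each abstract contributes at most one count per database.
    """
    counts = Counter()
    for year, abstract in records:
        text = abstract or ""  # some works have no abstract
        for name, pattern in COMPILED.items():
            if pattern.search(text):
                counts[(period_label(year), name)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        (1998, "We searched Web of Science and Scopus."),
        (2021, "Data came from OpenAlex and WoS."),
    ]
    for key, value in sorted(count_mentions(sample).items()):
        print(key, value)
```

Short or generic names are the tricky part: a pattern that is too loose will pick up unrelated uses of common words, so it is worth spot-checking matches by hand.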

You can find all the code and tools I used for this analysis on my GitHub. Feel free to explore, modify, or run your own analysis!


The Visuals: How Database Usage Has Evolved Over Time

After crunching the data, I created visualizations that track the mentions of different research databases from 1995 to 2024. Take a look at how the landscape has shifted:

[Figure: mentions of research databases from 1995 to 2024, in 5-year increments. By Mbogning (CognosX)]
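A chart of this kind could be produced roughly as follows. This is a sketch, assuming per-period counts are stored in a dictionary keyed by (period, database) and that matplotlib is installed; `to_series` and `plot_series` are hypothetical helper names, and the numbers in the demo are made-up placeholders, not results from my analysis.

```python
def to_series(counts, periods, databases):
    """Reshape {(period, database): n} into {database: [n per period]}."""
    return {
        db: [counts.get((period, db), 0) for period in periods]
        for db in databases
    }


def plot_series(series, periods, path="database_mentions.png"):
    """Draw one line per database across the periods and save it to `path`."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(8, 4.5))
    for db, values in series.items():
        ax.plot(periods, values, marker="o", label=db)
    ax.set_xlabel("Period")
    ax.set_ylabel("Abstracts mentioning the database")
    ax.set_title("Database mentions by 5-year period")
    ax.legend()
    fig.tight_layout()
    fig.savefig(path, dpi=150)
    plt.close(fig)


if __name__ == "__main__":
    periods = ["1995-1999", "2000-2004", "2005-2009"]
    # Placeholder counts for demonstration only.
    demo_counts = {
        ("1995-1999", "Database A"): 3,
        ("2000-2004", "Database A"): 5,
        ("2005-2009", "Database B"): 2,
    }
    series = to_series(demo_counts, periods, ["Database A", "Database B"])
    plot_series(series, periods)
```

Because the leading databases outnumber the newer ones by orders of magnitude, a logarithmic y-axis or separate panels may make the smaller series easier to read.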

The results are pretty clear: Web of Science and Scopus have been the go-to databases for decades. However, we’re seeing some exciting trends that indicate new players are starting to rise. Let me walk you through the most important insights from the data.


Key Insights: The Changing Landscape of Research Database Usage

1. Web of Science and Scopus: The Longstanding Leaders

If there’s one thing that remains constant, it’s that Web of Science and Scopus have been the undisputed leaders in the world of research databases. These two databases consistently show up in the majority of studies and remain critical tools for tracking citations, impact, and research collaborations.

  • Web of Science was mentioned 19,656 times across the publications I analyzed.
  • Scopus closely followed with 14,219 mentions.

These platforms offer extensive coverage, making them indispensable for researchers looking to measure academic impact. But while they remain strong, they aren’t the only players in town.

2. The Rise of Alternative and Open-Access Databases

Over the past 10 years, we’ve seen a significant shift towards open-access databases like The Lens and Dimensions. In fact, The Lens has grown from just 2 mentions in the 1990s to over 200 mentions in recent years. It’s a sign that researchers are increasingly turning to free, accessible tools that integrate patents, scholarly works, and other data sources.

Dimensions, another newer player, has also gained traction with 117 mentions. It’s become particularly valuable for researchers looking to explore how research connects to grants and funding trends.

3. Growing Diversity in Database Usage

One of the most interesting insights from the data is how diverse the range of databases has become. In the 1990s and early 2000s, only a few databases, such as Web of Science, PubMed, and Scopus, dominated the field. But in recent years, we’ve seen a much broader range of tools being used, reflecting the evolving needs of researchers.

Platforms like Google Scholar, Crossref, and the Directory of Open Access Journals (DOAJ) are seeing steady growth, as more researchers prioritize accessibility and open science.

4. PubMed/MEDLINE’s Consistent Role in Biomedical Research

For those in biomedical fields, PubMed/MEDLINE remains a staple. It’s one of the most consistently mentioned databases across all time periods, with 3,064 total mentions. While it may not have grown as rapidly as some other databases, it remains essential for life sciences research.

5. The Future is Open: OpenAlex and Open Science

Although still new, OpenAlex has the potential to shape the future of research database usage. With 34 mentions in its early days, it’s a signal that open science and open-access platforms are on the rise. OpenAlex provides researchers with free access to data on publications, citations, and authors, breaking down barriers to entry that have traditionally limited research to those with access to expensive subscriptions.


Why These Insights Matter: How Researchers and Decision-Makers Can Benefit

So, what does this all mean for researchers, academic institutions, and decision-makers?

  1. For Researchers: Understanding which databases are gaining traction can help you choose the right tools for tracking citations, collaborations, and emerging trends. While Web of Science and Scopus are still essential, exploring newer databases like Dimensions or OpenAlex can provide unique insights, particularly if you’re working with open science or need access to funding data.
  2. For Institutions: If you’re in charge of research strategy or funding decisions, it’s critical to stay ahead of these trends. Newer databases may offer more flexibility and insights beyond traditional citation metrics, especially as open-access resources continue to expand.
  3. For Policymakers: As open science continues to grow, databases like OpenAlex and The Lens could play a huge role in democratizing access to research data. This will help governments and institutions make better-informed decisions that benefit both academia and society as a whole.


Next Steps: What Does the Future Hold for Research Databases?

Looking ahead, I expect to see even more growth in open-access platforms. With the push towards open science, tools like OpenAlex are likely to become even more prominent. Meanwhile, established players like Web of Science and Scopus will continue to innovate, but they may face increasing competition from these new, more accessible platforms.


Explore the Data and Code Yourself!

If you’re interested in diving deeper into this analysis, I’ve made my Python code available on GitHub. You can see how I used OpenAlex’s API to download the data, clean it up, and generate these insights. Feel free to explore the code, try it out yourself, and maybe even contribute your own ideas!


Let’s Discuss: What’s Your Experience with Research Databases?

Which databases do you use the most in your research? How do you think the landscape will evolve over the next few years? I’d love to hear your thoughts—drop a comment below and let’s discuss how we can better navigate this ever-changing world of research data!
