Researchers dislike Researchfish. Should funders too?
In 2021, while I was a Research Manager at ARUK, 8 months before the #ResearchfishGate incident and a year before Elsevier acquired Interfolio, Researchfish's parent company, we decided to stop using the platform to monitor our research progress.
We made that decision after my analysis of our historical data showed that, despite what Researchfish claimed, the data was not of sufficient quality to reliably track our research outputs and impact.
In my new role I have had a couple of conversations with research funders over the last few weeks, and it has become clear that other funders are also questioning the accuracy of their own data. With that in mind, I thought I would do a deep dive on the data issues we found at ARUK, and the steps we took to track research impact more effectively.
I have spoken about this work at a couple of conferences, so if you are interested in this content in video format you can find one of those talks here.
This article will be the first in a series: today we will look at the data, and then I will publish a follow-on article touching on some of the reasons why I think this was happening, and how the new reporting process I developed tackled some of those issues.
Now, I really do not want this to be an unfair takedown of Researchfish, because many of the issues that lead to poor data quality are related not only to the platform itself, but also to the culture around metrics in research funding and to the ways funders communicate their use of Researchfish. Researchfish can be useful, if funders understand its limitations. And while some of what I describe below stems from poor reporting practices by a minority of researchers, I do not put the blame on them either, but on the set of incentives the research community operates under. More on this in the second part of this series.
With that out of the way, let’s now look at the data:
Ballooned numbers of publications
Around 60% of the publications Researchfish said ARUK had funded did not match what other scientific databases said we had funded.
So we went and manually checked a random selection of ~100 of those discordant publications. We found that roughly 20 of them were genuine ARUK-funded publications that other databases were not indexing accurately, while the other 80 or so were completely unrelated to the grants they were linked to. Some had absolutely nothing to do with the field of dementia research. I suppose it is still technically possible that some of those publications were enabled in part by ARUK funding and the authors simply forgot to acknowledge us in the manuscript. But still.
If that proportion held across the whole dataset, it meant that roughly half of the ARUK publications in Researchfish were likely not funded by ARUK at all, and were not related to our research portfolio.
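For anyone curious what that cross-check can look like in practice, here is a minimal sketch in Python. It assumes you have a Researchfish publications export and a DOI list from an external funder search (for example from OpenAlex or Dimensions); the file names, column names and normalisation rules are illustrative, not the exact pipeline we used.

```python
import pandas as pd

# Illustrative inputs: one export from Researchfish, one from an external
# bibliometric source. Column names are assumptions for this sketch.
researchfish = pd.read_csv("researchfish_publications.csv")   # columns: grant_id, doi
external = pd.read_csv("external_funder_search.csv")          # column: doi

def normalise(doi: str) -> str:
    """Strip URL prefixes and case so formatting does not create false mismatches."""
    return (
        str(doi)
        .strip()
        .lower()
        .replace("https://doi.org/", "")
        .replace("http://dx.doi.org/", "")
    )

rf_dois = set(researchfish["doi"].dropna().map(normalise))
ext_dois = set(external["doi"].dropna().map(normalise))

matched = rf_dois & ext_dois
discordant = rf_dois - ext_dois

print(f"Reported in Researchfish: {len(rf_dois)}")
print(f"Also found in the external source: {len(matched)}")
print(f"Discordant (manual check needed): {len(discordant)}")
```

The discordant set is what we then sampled and checked by hand.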
How could this happen?
Well, if you have ever reported through Researchfish you will know that, at least back in 2021, publications were the easiest output to report – all it took was uploading a csv with a list of your DOIs, and Researchfish did the rest to populate the data that funders would see in their reports. In contrast, every other output in the platform required filling in several drop-down menus and tick boxes, and answering the much-feared “What was the impact of this output?”.
I believe that some of our researchers were uploading the exact same csv with all their DOIs, likely across all the funders they had grants with. Some quick checks suggested that not many researchers did this, but the few who did skewed the data significantly, rendering the entire dataset almost unusable.
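The kind of quick check I mean can be as simple as the sketch below: flag DOIs that appear against unusually many separate grants, which is what you would expect if the same DOI list is being uploaded wholesale. The threshold of five grants and the column names are arbitrary assumptions for illustration, not the exact query we ran.

```python
import pandas as pd

# Illustrative Researchfish export: one row per (grant, publication) link.
pubs = pd.read_csv("researchfish_publications.csv")  # columns: grant_id, doi

# Count how many distinct grants each DOI has been reported against.
grants_per_doi = pubs.groupby("doi")["grant_id"].nunique()

# DOIs attached to many grants are candidates for "same csv uploaded everywhere".
suspicious = grants_per_doi[grants_per_doi >= 5].sort_values(ascending=False)

print(f"DOIs linked to 5+ different grants: {len(suspicious)}")
print(suspicious.head(20))
```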
If you want to roughly know how many publications you have funded, my experience says that Researchfish is not the way to go.
Infinite leveraged funding
Leveraged funding is conceptually challenging to calculate. Why should any one funder claim, as leveraged funding, grants won by a researcher who has also been funded by several other funders? What is the fair split? Should you dig into the science of the follow-on grants? Or can you argue that the papers you helped the researcher produce helped them win more grants, no matter the topic? Will the other funders also claim the same amount of leveraged funding? Have we found a loophole to report an infinite pot of research funding in this country?
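To make the double-counting problem concrete, here is a toy illustration with entirely hypothetical numbers: if three funders each claim the full value of the same £10m follow-on award, the sector collectively reports £30m of leverage that only exists once. A proportional split is one possible convention, sketched below, but it is only that – a convention.

```python
# Toy illustration of double-counted leveraged funding (hypothetical numbers).
follow_on_grant = 10_000_000  # a single £10m follow-on award

# Prior funding the researcher received from each funder (hypothetical values).
prior_funding = {"Funder A": 2_000_000, "Funder B": 1_000_000, "Funder C": 500_000}

# Naive attribution: every funder claims the whole award as "their" leverage.
naive_total = follow_on_grant * len(prior_funding)
print(f"Naively claimed leverage across funders: £{naive_total:,}")  # £30,000,000

# One possible convention: split the award in proportion to prior funding.
total_prior = sum(prior_funding.values())
for funder, amount in prior_funding.items():
    share = follow_on_grant * amount / total_prior
    print(f"{funder}: £{share:,.0f}")
```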
When I manually checked all the instances of leveraged funding reported against our grants, I was able to shave off £200m of research attributed to ARUK’s grants.
One person reported the entirety of an international grant that funded ~50 institutions worldwide, when they were the research lead at just one of those institutions. Someone else attributed to ARUK the entirety of the European Life Sciences funds allocated to one of the UK’s devolved nations, because they sat on a panel that helped allocate that funding. There are more examples, but you get the idea.
As amusing as this exercise was, it showed that this data required vast amounts of manual cleaning, and it has since made me squint every time I see other funders reporting their leveraged funding or their “for every £1 invested we leveraged…” stats using Researchfish data.
Almost non-existent collaboration networks
When I did a bibliometric analysis of ARUK’s publications, and of the dementia research field as a whole, I saw that there had been an upward trend in collaborations over the last couple of decades. Papers now tend to have more co-authors, from more organisations, and scientists tend to work more and more internationally.
Our Researchfish data told a completely different story.
The data showed spikes and downturns in collaborations year after year, despite growing numbers of grants and publications, depending on which researchers had active grants and how accurate they were in their reporting. Some grants had 5, 10, even dozens of publications reported against them, but only one or two collaborations. For other grants, despite knowing from the applications that they were significant collaborative efforts, zero collaborations were reported.
In short, this was another metric from Researchfish with no practical application, and one that, much like publications, we could calculate far more faithfully with bibliometric analyses, without putting the onus of reporting on our research community.
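As a sketch of what that bibliometric alternative can look like, the snippet below derives a simple collaboration trend from publication metadata: the average number of distinct institutions per paper, by year. The file layout (a CSV with a `year` column and a `;`-separated `institutions` column) is an assumption for illustration; in practice you would pull this from a source such as OpenAlex, Scopus or Dimensions.

```python
import pandas as pd

# Illustrative publications export: one row per paper, with affiliations.
papers = pd.read_csv("aruk_publications_with_affiliations.csv")

# Count distinct collaborating institutions on each paper.
papers["n_institutions"] = (
    papers["institutions"]
    .fillna("")
    .str.split(";")
    .apply(lambda insts: len({i.strip() for i in insts if i.strip()}))
)

# Average institutions per paper, by year, as a rough collaboration trend.
trend = papers.groupby("year")["n_institutions"].mean()
print(trend)
```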
Conclusion
I hope this post shows that when thoughtful analysis is applied to data coming from Researchfish, serious questions can arise about its accuracy and the value you can get from it. Now, I don’t think it has no value at all – when carefully cleaned and presented, I believe it can help show non-expert audiences how research works, and the importance of producing and caring about diverse outputs.
In the second post of this series I will dive into some of the reasons why I think this happens: why would some researchers over-report publications and under-report their collaborations, beyond the clunky user interface of the platform? How are funding culture and practice still contributing to “impact reporting” issues? What do funders do with all those reports, anyway? I will also walk through how we decided to rebuild our research reporting process from scratch, and how things went in the first two years after its implementation.
https://zenodo.org/records/7310872#.Y3Zf93bP3IU