AAV Data Hub Relaunch
Addgene co-founder and CSO Melina Fan, PhD, writes about the process of developing and expanding Addgene's data-sharing platform, the AAV Data Hub.
Data sharing is essential for scientific progress. This need is partially addressed via publications, but publications leave out valuable data and important details about how the experiments were conducted. It is also difficult to compare similar experiments across publications to extract meaningful trends. The NIH and other funding agencies require data sharing and encourage deposition to data repositories for this reason. In fact, the NIH recently launched a new set of data sharing requirements aimed at making data even more accessible.
In response to this growing need for increased data sharing, Addgene has created a data sharing platform for Adeno-Associated Virus (AAV) data with plans to expand to include antibody data. AAV are versatile vehicles for gene delivery used clinically for gene therapy and in basic science for a variety of disciplines, especially by neuroscientists, to deliver tools, such as biosensors and optogenetic actuators, in vivo. Despite their widespread use, basic information about which AAV to use in specific cell types and what parameters to use for optimal expression is lacking. In line with Addgene’s commitment to sharing and speeding science, the AAV Data Hub addresses this need in the scientific community.
Addgene is uniquely well suited to host a data sharing platform for AAV because of its experience with database and web design and because of its connection to a wide variety of scientists working with AAV. However, building and populating a data sharing platform poses several challenges, which will be described here and will hopefully be illustrative for others embarking on similar projects.
Step 1: Designing and Building the Platform
Our initial AAV Data Hub, launched in 2019, was a very simple pilot primarily used to gauge community interest. Drawing from this experience, and generalizing our approach with plans to go beyond AAV materials, we designed and built a scalable software platform for sharing experimental reports.
To achieve this flexibility, our platform has two conceptual layers: (1) a data-type-specific layer that allows curators and domain experts to specify the structure of a report about a particular material and application (i.e. what data and how it is reported) and store it as a report schema; and (2) a data-type-agnostic layer that generates various user interfaces based on these report schemas. This layered approach allows for expansion and customization of the Data Hub to support different research materials and applications.
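The two layers can be illustrated with a minimal Python sketch. It is not Addgene's implementation: the class names, field names, and both example schemas are hypothetical, chosen only to show how a generic layer can build an interface from whatever report schema a curator defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One entry in a report schema (hypothetical structure)."""
    name: str
    label: str
    kind: str = "text"       # e.g. "text", "number", "image"
    required: bool = False


@dataclass(frozen=True)
class ReportSchema:
    """Layer 1: a curator-defined description of one report type."""
    material: str
    application: str
    fields: tuple


def render_form(schema):
    """Layer 2: build a plain-text form from any schema.

    This function knows nothing about AAV or antibodies; it only
    walks the fields that the schema declares.
    """
    lines = [f"{schema.material} report: {schema.application}"]
    for f in schema.fields:
        marker = "*" if f.required else " "   # "*" marks required fields
        lines.append(f"{marker} {f.label} [{f.kind}]")
    return lines


# Hypothetical schemas; the field choices are assumptions for illustration.
AAV_SCHEMA = ReportSchema(
    material="AAV",
    application="in vivo expression",
    fields=(
        Field("serotype", "Serotype", required=True),
        Field("species", "Species", required=True),
        Field("expression_site", "Expression site"),
        Field("image", "Representative image", kind="image", required=True),
    ),
)

ANTIBODY_SCHEMA = ReportSchema(
    material="Antibody",
    application="staining",
    fields=(
        Field("target", "Target", required=True),
        Field("dilution", "Dilution", kind="number"),
    ),
)

if __name__ == "__main__":
    for schema in (AAV_SCHEMA, ANTIBODY_SCHEMA):
        print("\n".join(render_form(schema)))
        print()
```

In this sketch, supporting a new material means writing a new schema, while render_form stays unchanged.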
Designing and building the Data Hub platform was a collaboration between our Product, Application Development and Platform Engineering teams. I am grateful for the forward thinking and expertise of Daniela Bourges, Sophia Cheng, Cayla Fauver, Matt Ghantous, Jeremy Lapine, Pranav Mujumdar, Caroline Schumacher, Esmer Smith, Jason Snair, Ketaki Udipi, Morgan Wahl, Heather Zirkle, and all who contributed to this project.
Step 2: Populating the Platform with Data
Two of the primary challenges that crowdsourced data sharing platforms face are how to incentivize submissions and how to control the quality of the submissions. Overcoming these challenges was a central consideration in the design and implementation of the Data Hub.
Incentivizing
While scientists often believe in open sharing, it takes time to locate and upload the data, which can make it easy to put off, especially when experiments and other responsibilities are also vying for attention. Incentives can encourage scientists to prioritize this kind of work.
To address these concerns, Addgene is implementing several types of incentives, including:
We are also considering additional incentives, including:
We are still testing these options to see what works best. In our experience, incentives work best when we listen to our community. We therefore encourage any scientist to reach out to us with ideas for powerful incentives.
Thus far, we have collected over 100 reports on AAV usage spanning six different species and dozens of different expression sites. If you are interested in contributing data, you can learn how to here.
Curation and Quality
Once a scientist enters their data, the next challenge is curating the data for quality. This is first addressed upfront with submission requirements: Addgene requires an image and a minimum set of information about how the experiment was conducted. During the data submission process, the data is standardized so that it can be read more easily alongside other entries at Addgene and from other repositories. For example, the names of brain anatomical regions are kept consistent with the NeuroNames database, in line with FAIR principles. Currently, each entry in the AAV Data Hub has been curated by a PhD scientist at Addgene, which is quite time-consuming. We are considering ways to scale this process as we expand the resource. This could come through a thoughtfully designed submission process that retains ease of use while allowing for standardized, detailed submissions, or through community support with curating data submissions.
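As a rough illustration of how upfront requirements and name standardization might be checked automatically before a curator reviews an entry, here is a small Python sketch. Everything in it is an assumption: the required field names are hypothetical, and the tiny synonym table is an illustrative stand-in, not data drawn from the NeuroNames database.

```python
# Hypothetical minimum set of fields; the real requirements may differ.
REQUIRED_FIELDS = ("image", "serotype", "species", "injection_site")

# Illustrative stand-in for a controlled vocabulary (NOT NeuroNames data).
# Keys are lowercase synonyms, values are the canonical names.
REGION_SYNONYMS = {
    "mpfc": "medial prefrontal cortex",
    "medial prefrontal cortex": "medial prefrontal cortex",
    "amygdala": "amygdala",
}


def normalize_region(name):
    """Return the canonical region name, or None if it is not recognized."""
    key = " ".join(name.lower().split())   # lowercase, collapse whitespace
    return REGION_SYNONYMS.get(key)


def check_submission(report):
    """Return (cleaned_report, problems) for one submitted report dict.

    Missing required fields and unrecognized region names are listed in
    problems so that a curator can follow up; nothing is silently dropped.
    """
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if not report.get(name)
    ]
    cleaned = dict(report)
    site = report.get("injection_site")
    if site:
        canonical = normalize_region(site)
        if canonical is None:
            problems.append(f"unrecognized region: {site!r}")
        else:
            cleaned["injection_site"] = canonical
    return cleaned, problems


if __name__ == "__main__":
    submission = {
        "image": "section_01.png",
        "serotype": "AAV9",
        "species": "mouse",
        "injection_site": " mPFC ",
    }
    cleaned, problems = check_submission(submission)
    print(cleaned["injection_site"], problems)
```

A check like this would not replace expert curation; it could only catch incomplete or non-standard entries early so that curator time goes to judging the data itself.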
It all came together when we were finally able to share data from scientists using the AAV tools on our platform. A recent submission from Eugene Dimitrov's lab, submitted with the updated process, shows that you can fill entire interneurons in the mPFC with fluorescence, using an AAV1 Cre-expresser in the amygdala, with pAAV-hDlx-Flex-dTomato-Fishell_7 (AAV9) filling dendritic spines. Another submission, from JJ Kim and Dan O'Connor, shows that direct injection of one of Addgene’s PHP.S viruses (pAAV-CAG-tdTomato), which are known for their use in systemic injections, in the mouse trigeminal ganglion lights up entire neurons selectively, with accompanying light sheet images. Both entries have DOIs for easy referencing.
Soliciting and curating data for the Data Hub required creativity and persistence. This portion of the project was spearheaded by Jason Nasse, Leila Haery, Angela Abitua, and Brook Pyhtila.
Join the Data Sharing Movement
The value of the Data Hub grows as more entries are added. If you are an AAV creator, we encourage you to deposit your plasmids to Addgene, and if you use AAV in your studies, we encourage you to submit data from your experiments.
We hope that you will find the AAV Data Hub to be helpful as you set up your own experiments. We welcome your feedback so that we can make this resource even better for the community. As the database grows, we hope to mine the data to extract patterns that we can share with you about promoters, serotypes, expression patterns, and more.
Addgene’s open science contributions began with sharing of physical materials and have expanded to sharing of information. The Data Hub is still in its early days and already receives approximately 1,200 views per month, which shows the value of data sharing. As we look ahead, however, one concern is the cost to build, solicit, curate, and maintain such a database. The question of how public repositories should be supported is important for the community to consider. We are grateful to the Chan Zuckerberg Initiative for supporting this project.
Please contact us at [email protected] if you have any questions or suggestions.
The AAV Data Hub is supported by a grant from the Chan Zuckerberg Initiative. The content of this article is solely the responsibility of the authors and does not necessarily represent the views of the Chan Zuckerberg Initiative.