Step-by-Step Guide to Download RNA-Seq Data from NCBI GEO
Hafiz Ahsan Ashfaq
Go to the NCBI GEO website:
Open the NCBI GEO website at https://www.ncbi.nlm.nih.gov/geo/.
Search for the dataset:
Use the search bar to enter keywords for the RNA-Seq dataset of interest. You can search by study name, by organism, or with a general keyword such as "RNA-Seq".
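The same kind of search can also be scripted. The sketch below only builds a query URL for the NCBI E-utilities esearch endpoint against the GEO DataSets database (db=gds); it does not send the request. The function name geo_search_url, the example keywords, and the choice to restrict results to Series records with gse[Entry Type] are illustrative assumptions, not part of the original guide.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint; GEO DataSets is exposed as db=gds.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def geo_search_url(keywords, organism=None, retmax=20):
    """Build (but do not send) a GEO DataSets search URL.

    Hypothetical helper: the field tags used here ([Entry Type],
    [Organism]) are assumed to behave as in the GEO web search.
    """
    terms = [keywords, "gse[Entry Type]"]  # keep only Series records
    if organism:
        terms.append(f'"{organism}"[Organism]')
    params = {
        "db": "gds",
        "term": " AND ".join(terms),
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{ESEARCH}?{urlencode(params)}"


print(geo_search_url("RNA-Seq", organism="Homo sapiens"))
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return the matching record IDs as JSON.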
Locate the appropriate dataset:
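Scroll through the results to find the study or dataset of interest. Keep in mind that GEO records are organized as "Series" (GSE accessions, which describe a whole study) and "Samples" (GSM accessions, which describe individual samples within a study); RNA-Seq studies are typically listed with the study type "Expression profiling by high throughput sequencing". Click on the title of the dataset.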
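When working through many search results, it can help to tell the record types apart by accession prefix. The following is a minimal sketch; the function name classify_geo_accession and the short descriptions are illustrative, and GPL (Platform) and GDS (curated DataSet) are included alongside the GSE and GSM prefixes mentioned above.

```python
import re

# Short, informal descriptions of the GEO record types by accession prefix.
GEO_PREFIXES = {
    "GSE": "Series (a whole study)",
    "GSM": "Sample (one sample within a study)",
    "GPL": "Platform (the array or sequencer used)",
    "GDS": "DataSet (a curated GEO dataset)",
}


def classify_geo_accession(accession):
    """Return a description of the GEO record type for an accession."""
    cleaned = accession.strip().upper()
    match = re.fullmatch(r"(GSE|GSM|GPL|GDS)\d+", cleaned)
    if match is None:
        raise ValueError(f"not a GEO accession: {accession!r}")
    return GEO_PREFIXES[match.group(1)]


# GSE12345 is an arbitrary placeholder accession used for illustration.
print(classify_geo_accession("GSE12345"))
```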
Access the dataset page:
The dataset landing page gives the complete description of the study, including the experimental design, the organism, and the sample information.
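If the accession is already known, the landing page can be reached directly by URL. The sketch below builds that URL; the helper name geo_record_url is hypothetical, and the optional targ, form, and view parameters are assumed to request a plain-text version of the record as described in the GEO documentation on constructing URLs.

```python
from urllib.parse import urlencode

# Base address of the GEO accession display page.
GEO_ACC = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"


def geo_record_url(accession, as_text=False):
    """Build the URL of a GEO record page for a GSE or GSM accession."""
    params = {"acc": accession}
    if as_text:
        # Assumed options: show the record itself as plain text.
        params.update({"targ": "self", "form": "text", "view": "quick"})
    return f"{GEO_ACC}?{urlencode(params)}"


# GSE12345 is an arbitrary placeholder accession used for illustration.
print(geo_record_url("GSE12345"))
print(geo_record_url("GSE12345", as_text=True))
```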
Download individual sample files:
On this page, under the heading Download Options, you will find links to the SRA (Sequence Read Archive), which holds the raw RNA-Seq data. From there you may be able to download the data for individual samples in FASTQ format.
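To move from a study to its individual runs, one common route is to export the metadata table from the SRA Run Selector and read the run accessions out of it. The sketch below assumes a comma-separated table with a column named Run; the function name runs_from_table, the other column names, and the accession values in the example are made up for illustration.

```python
import csv
import io


def runs_from_table(table_text, run_column="Run"):
    """Extract unique run accessions from a comma-separated metadata table."""
    reader = csv.DictReader(io.StringIO(table_text))
    if reader.fieldnames is None or run_column not in reader.fieldnames:
        raise ValueError(f"no {run_column!r} column in table")
    runs = []
    for row in reader:
        run = (row[run_column] or "").strip()
        if run and run not in runs:  # skip blanks and duplicates
            runs.append(run)
    return runs


# Made-up example table; real exports contain many more columns.
example = (
    "Run,LibraryLayout,Organism\n"
    "SRR0000001,PAIRED,Homo sapiens\n"
    "SRR0000002,PAIRED,Homo sapiens\n"
)
print(runs_from_table(example))
```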
Use the NCBI SRA Toolkit for bulk downloads:
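First download the run with the prefetch command, where SRRxxxxxxx stands for the run accession: prefetch SRRxxxxxxx.
Then convert the downloaded run to FASTQ, for instance: fastq-dump --split-files SRRxxxxxxx. The --split-files option writes the forward and reverse reads of paired-end data to separate files.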
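For more than a handful of runs, the two commands can be wrapped in a small script. This is a minimal sketch, assuming the SRA Toolkit is installed and on the PATH; the function names sra_commands and download_runs are hypothetical, and the script defaults to a dry run that only prints the commands. Only the flags given in this guide are used.

```python
import re
import subprocess

# Run accessions look like SRR, ERR, or DRR followed by digits.
RUN_PATTERN = re.compile(r"[SED]RR\d+")


def sra_commands(run_accession):
    """Return the prefetch and fastq-dump commands for one run."""
    if not RUN_PATTERN.fullmatch(run_accession):
        raise ValueError(f"not a run accession: {run_accession!r}")
    return [
        ["prefetch", run_accession],
        ["fastq-dump", "--split-files", run_accession],
    ]


def download_runs(run_accessions, dry_run=True):
    """Print the commands for each run, or execute them if dry_run is False."""
    for run in run_accessions:
        for command in sra_commands(run):
            if dry_run:
                print(" ".join(command))
            else:
                # check=True stops at the first command that fails
                subprocess.run(command, check=True)


# Made-up accessions for illustration; replace with real run accessions.
download_runs(["SRR0000001", "SRR0000002"])
```

Calling download_runs(..., dry_run=False) would actually run the toolkit. The literal placeholder SRRxxxxxxx is rejected by the accession check, which guards against running the commands before substituting a real accession.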
Explore processed data:
If you are interested in already processed RNA-Seq data, such as read counts or expression levels, you can download it directly from the GEO dataset page, in a format such as TXT or CSV, from the "Data table" or "Supplementary files" sections.
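Supplementary files for a Series are also served from the GEO download site, which makes them easy to locate from a script. The sketch below builds the directory URL for a Series accession, assuming the usual GEO layout in which the last three digits of the accession number are replaced by "nnn" in the parent directory name. The helper name series_suppl_url is hypothetical, and whether a given study has supplementary files depends on the submitter.

```python
import re

# Root of the Series area on the GEO download site.
GEO_FTP = "https://ftp.ncbi.nlm.nih.gov/geo/series"


def series_suppl_url(accession):
    """Build the supplementary-file directory URL for a GSE accession."""
    match = re.fullmatch(r"GSE(\d+)", accession)
    if match is None:
        raise ValueError(f"not a Series accession: {accession!r}")
    digits = match.group(1)
    # Parent directory: drop the last three digits and append "nnn",
    # e.g. GSE12345 -> GSE12nnn, GSE123 -> GSEnnn.
    stub = f"GSE{digits[:-3]}nnn"
    return f"{GEO_FTP}/{stub}/{accession}/suppl/"


# GSE12345 is an arbitrary placeholder accession used for illustration.
print(series_suppl_url("GSE12345"))
```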
Conclusion
RNA-Seq data can be downloaded from NCBI GEO either manually through its website or in bulk with the SRA Toolkit. Depending on what the downstream analysis requires, you can download either raw or processed data.