GPT-4 for Cell Type Annotation in scRNA-seq, Beyond Normalization: Gene Expression Analysis, Ampliseq and metatdenovo: nf-core Pipelines

Bioinformer Weekly Roundup

Stay Updated with the Latest in Bioinformatics!

Issue: 31 | Date: 5 April 2024

Welcome to the Bioinformer Weekly Roundup!

In this newsletter, we curate and bring you the most captivating stories, developments, and breakthroughs from the world of bioinformatics. Whether you're a seasoned researcher, a student, or simply curious about the intersection of biology and data science, we've got you covered. Subscribe now to stay ahead in the exciting realm of bioinformatics!

In The Spotlight

Featured Research

Integrated analysis of single-cell and bulk RNA-sequencing reveals a novel signature based on NK cell marker genes to predict prognosis and immunotherapy response in gastric cancer | bioRxiv

This study establishes an NK cell signature for assessing prognosis and immunotherapy response in gastric cancer (GC). Using single-cell RNA-sequencing data, the authors identified 377 marker genes, from which they built a 12-gene NK cell-associated signature (NKCAS). The NKCAS stratified patients into low- and high-risk groups, and its predictive power was validated across multiple cohorts. It also served as an independent prognostic factor and was integrated into a nomogram for survival prediction.

Towards a unified medical microbiome ecology of the OMU for metagenomes and the OTU for microbes | BMC Bioinformatics

Metagenomic sequencing has revolutionized microbiology, enabling studies like the human microbiome project. This shift to microbiomes indicates our ability to characterize microbial communities akin to macrobiomes. Unlike traditional studies, metagenomics relies on DNA sequencing, yielding OTU (operational taxonomic unit) tables for microorganisms and OMUs (operational metagenomic units) for gene clusters.

Multiple Alzheimer’s disease progression pathways inferred from transcriptome data of the dorsolateral prefrontal cortex | bioRxiv

Multi-omics single-cell data offer insights into biological complexity by combining transcriptomic and epigenomic analyses. This study presents a bioinformatic workflow that integrates existing methods to analyze these datasets jointly, enhancing our understanding of cellular heterogeneity.

Exploring The Distribution of Single Nucleotide Polymorphisms Across Human Exons And Introns | bioRxiv

The study examines single nucleotide polymorphism (SNP) counts within specific exons and introns of the human genome, using data from 1,222 individuals of Polish descent. With a total of 41,836,187 SNPs analyzed, chromosomes 1 and 22 are highlighted due to their differing DNA molecule lengths. The findings reveal that outer exons and first introns exhibit notably higher SNP counts, indicative of their distinct functional significance within the genome.

Exploring the impact of sequence context on errors in SNP genotype calling with Whole Genome Sequencing data using AI-based autoencoder approach

The study explores variant calling accuracy in whole genome sequencing (WGS) data, a crucial step prone to errors. Using data from Holstein-Friesian cows, comparisons were made between WGS-derived SNPs and genotyping microarray data. An autoencoder model identified systematic errors, notably linked to nucleotide context and fluorescence patterns. These findings underscore the need for meticulous variant calling protocols to enhance WGS data accuracy.

Latest Tools

Assessing GPT-4 for cell type annotation in single-cell RNA-seq analysis | Nature Methods

The study assesses GPT-4, a large language model, for automated cell type annotation in single-cell RNA sequencing. GPT-4 performs well across diverse tissue types, offering a streamlined alternative to manual annotation. The authors also provide GPTCelltype, an R package that simplifies cell type annotation with GPT-4.

ampliseq: Amplicon sequencing analysis workflow using DADA2 and QIIME2 | nf-core Pipelines

The nf-core/ampliseq pipeline is a versatile tool for amplicon sequencing analysis. It supports denoising of various amplicons and multiple taxonomic databases, covering markers such as 16S, ITS, CO1, and 18S. It also facilitates phylogenetic placement and the analysis of multi-region data such as 5R. It is compatible with Illumina (paired-end and single-end), PacBio, and IonTorrent data, and its default setup is optimized for paired-end Illumina sequences targeting 16S rRNA gene amplicons.

metatdenovo: Assembly and annotation of metatranscriptomic data, both prokaryotic and eukaryotic | nf-core Pipelines

metatdenovo is a bioinformatics pipeline for the assembly and annotation of metatranscriptomic data from both prokaryotic and eukaryotic organisms. It is built with Nextflow and uses Docker/Singularity containers for portability and reproducibility. Each process runs in its own container, which simplifies maintenance. Continuous integration tests on AWS validate its performance on real-world datasets.

Beyond Normalization: Incorporating Scale Uncertainty in Microbiome and Gene Expression Analysis | bioRxiv

This article proposes scale models as an alternative to traditional statistical normalizations for sequence count data. Current normalizations make implicit assumptions about biological scale, which can increase both false positives and false negatives. Scale models, as implemented in ALDEx2, mitigate these errors, enhancing reproducibility and reducing false discovery rates.

TPMA: A two pointers meta-alignment tool to ensemble different multiple nucleic acid sequence alignments | PLOS Computational Biology

A new tool called TPMA is introduced for integrating nucleic acid sequence alignments, showing promising results in efficiency compared to existing methods. TPMA utilizes a two-pointer approach to optimize alignment by selecting high-scoring blocks from initial alignments. Experimental findings highlight TPMA's superior performance and propose efficient strategies for integrating diverse datasets.

TPMA is available here.
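To make the two-pointer idea concrete, here is a minimal Python sketch. It is not the TPMA implementation: the function names, the simple sum-of-pairs scoring (+1 match, -1 mismatch or gap), and the tie-breaking rule are all assumptions chosen for illustration. Two pointers walk the column boundaries of two alignments of the same sequences; wherever both alignments have consumed the same number of residues from every sequence, the higher-scoring block since the last such point is kept.

```python
from itertools import combinations

def prefix_counts(aln):
    """Residues consumed per sequence at every column boundary."""
    counts = [tuple(0 for _ in aln)]
    for col in zip(*aln):
        counts.append(tuple(p + (c != "-") for p, c in zip(counts[-1], col)))
    return counts

def sp_score(block):
    """Toy sum-of-pairs score (assumed scheme): +1 match, -1 mismatch or gap."""
    score = 0
    for col in zip(*block):
        for x, y in combinations(col, 2):
            if x == "-" and y == "-":
                continue  # gap-gap pairs are ignored
            score += 1 if x == y else -1
    return score

def merge_alignments(aln_a, aln_b):
    """Merge two alignments of the same sequences block by block.

    Assumes neither alignment contains an all-gap column, so the total
    number of consumed residues strictly increases at each boundary.
    """
    pa, pb = prefix_counts(aln_a), prefix_counts(aln_b)
    i = j = 0            # the two pointers (column boundaries)
    last_i = last_j = 0  # previous synchronisation point
    merged = ["" for _ in aln_a]
    while i < len(pa) and j < len(pb):
        sa, sb = sum(pa[i]), sum(pb[j])
        if sa < sb:
            i += 1
        elif sa > sb:
            j += 1
        else:
            if pa[i] == pb[j] and i > last_i:
                block_a = [s[last_i:i] for s in aln_a]
                block_b = [s[last_j:j] for s in aln_b]
                # Ties go to the first alignment (arbitrary choice).
                best = block_a if sp_score(block_a) >= sp_score(block_b) else block_b
                merged = [m + b for m, b in zip(merged, best)]
                last_i, last_j = i, j
            i += 1
            j += 1
    return merged

if __name__ == "__main__":
    a = ["AGCTAGC", "A-CTAC-"]
    b = ["AGCTAGC", "AC-TA-C"]
    print("\n".join(merge_alignments(a, b)))
```

In the example, each input alignment handles one region well and the other poorly, and the merged result takes the better block from each.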

CITEViz: interactively classify cell populations in CITE-Seq via a flow cytometry-like gating workflow using R-Shiny | BMC Bioinformatics

Advancements in genomic sequencing have led to multi-omic single-cell assays like CITE-Seq, which capture RNA transcriptomes and surface protein expression simultaneously. However, existing tools still lack support for multi-omic datasets, forcing users to write redundant code. To address this, CITEViz enables interactive cell gating in Seurat-processed CITE-Seq data, streamlining the process and providing quality control metric visualizations for comprehensive data evaluation.

CAT-DTI: cross-attention and Transformer network with domain adaptation for drug-target interaction prediction | BMC Bioinformatics

CAT-DTI, a model for predicting drug-target interactions (DTI), addresses challenges in feature representation and model generalization by combining cross-attention and Transformer techniques with domain adaptation. It captures interactions between drugs and targets and improves prediction performance across diverse scenarios.

Curare and GenExVis: a versatile toolkit for analyzing and visualizing RNA-Seq data | BMC Bioinformatics

Curare is a user-friendly workflow builder for high-throughput RNA-Seq data analysis, focusing on differential gene expression. It offers customizable analysis stages to ensure reproducibility and is complemented by GenExVis, facilitating swift and effortless visualization of differential gene expression results without data uploads or software installations. Together, they provide a comprehensive software solution for simplifying RNA-Seq data analysis and interpretation.

Feature-specific quantile normalization and feature-specific mean–variance normalization deliver robust bi-directional classification and feature selection performance between microarray and RNAseq data | BMC Bioinformatics

FSQN and FSMVN, two normalization methods, showed clinically equivalent performance for cross-platform data in colon CMS and breast PAM50 classification. Both effectively removed batch effects, with balanced accuracy matching within-platform data. Under optimal conditions, they performed similarly even with fewer selected genes. While effective for generating machine learning classifiers, subtle differences may exist, warranting caution with cross-platform data usage.
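For readers who want to see what "feature-specific" means here, the following NumPy sketch normalizes each gene (column) of a source platform against the same gene in a target platform. It is an illustration under assumptions, not the authors' code: the function names `fsqn` and `fsmvn`, the mid-rank quantile mapping, and the arbitrary handling of ties are all choices made for this example.

```python
import numpy as np

def fsmvn(source, target):
    """Feature-specific mean-variance normalization (sketch).

    Each column of `source` is standardized, then rescaled to the mean
    and standard deviation of the matching column in `target`.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    src_sd = source.std(axis=0)
    src_sd[src_sd == 0] = 1.0  # avoid division by zero for constant features
    z = (source - source.mean(axis=0)) / src_sd
    return z * target.std(axis=0) + target.mean(axis=0)

def fsqn(source, target):
    """Feature-specific quantile normalization (sketch).

    Each source value is replaced by the target value at the same
    within-feature quantile. Ties are broken arbitrarily by argsort.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    n = source.shape[0]
    out = np.empty_like(source)
    for g in range(source.shape[1]):
        ranks = source[:, g].argsort().argsort()  # 0-based rank of each sample
        quantiles = (ranks + 0.5) / n             # mid-rank quantile in (0, 1)
        out[:, g] = np.quantile(target[:, g], quantiles)
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rnaseq = rng.normal(5.0, 2.0, size=(20, 3))      # hypothetical source platform
    microarray = rng.normal(8.0, 1.0, size=(50, 3))  # hypothetical target platform
    print(fsmvn(rnaseq, microarray).mean(axis=0))
    print(fsqn(rnaseq, microarray).mean(axis=0))
```

Both functions preserve the sample ordering within each gene; they differ in whether only the first two moments or the whole distribution is matched to the target.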

CTEC: a cross-tabulation ensemble clustering approach for single-cell RNA sequencing data analysis | Oxford Academic

CTEC is a method for single-cell RNA-seq data clustering that combines distribution and outlier-based re-clustering strategies through cross-tabulation. Benchmarking on five datasets shows CTEC's significant improvement over individual methods. Specifically, CTEC-DB outperforms state-of-the-art ensemble methods, with 45.4% and 17.1% improvements over SAFE and SAME, respectively, on the two-method ensemble test.

The source code is available here.
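The cross-tabulation step that gives CTEC its name can be illustrated with a few lines of NumPy. This sketch only builds the contingency table between two clustering results and flags cells that fall outside the dominant pairing for their cluster; the re-clustering strategies described in the paper are not reproduced, and the function names are hypothetical.

```python
import numpy as np

def cross_tab(labels_a, labels_b):
    """Contingency table of two clusterings of the same cells."""
    a_names, a_idx = np.unique(np.asarray(labels_a).ravel(), return_inverse=True)
    b_names, b_idx = np.unique(np.asarray(labels_b).ravel(), return_inverse=True)
    a_idx, b_idx = a_idx.ravel(), b_idx.ravel()
    table = np.zeros((len(a_names), len(b_names)), dtype=int)
    np.add.at(table, (a_idx, b_idx), 1)  # count cells per (cluster A, cluster B) pair
    return a_names, b_names, table, a_idx, b_idx

def off_dominant_cells(labels_a, labels_b):
    """Indices of cells whose B label is not the most common B label
    within their A cluster (candidates for a second look)."""
    _, _, table, a_idx, b_idx = cross_tab(labels_a, labels_b)
    dominant_b = table.argmax(axis=1)  # dominant B cluster for each A cluster
    return np.flatnonzero(b_idx != dominant_b[a_idx])

if __name__ == "__main__":
    method_1 = [0, 0, 0, 1, 1, 1]
    method_2 = ["x", "x", "y", "y", "y", "y"]
    print(cross_tab(method_1, method_2)[2])
    print(off_dominant_cells(method_1, method_2))
```

Cells where the two methods disagree show up as off-diagonal mass in the table, which is the information an ensemble approach can then act on.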

Community News

A single-cell RNA atlas reveals a potential target for breast cancer prevention | RNA-SEQ BLOG

Cambridge researchers identified immune cell dysfunction in healthy BRCA mutation carriers, signaling early cancer risk. This marks a first in non-cancerous breast tissue and proposes using immunotherapy drugs preventively. With Cancer Research UK support, mouse trials are planned, paving the way for clinical trials in mutation carriers.

Genetic analysis reveals hidden causes of chronic kidney disease in adults | News Medical Lifesciences

Chronic kidney disease (CKD) affects millions globally, often necessitating dialysis or transplant. While lifestyle factors and other diseases cause most cases, the genetic causes of some cases remain elusive. Tokyo Medical and Dental University researchers studied 90 patients with CKD of unknown origin, excluding those with apparent causes. Their findings, published in Kidney International Reports, aim to uncover latent genetic conditions underlying CKD.

Upcoming Events

Curating proteins involved in Antimicrobial Resistance (AMR) in UniProt | EMBL-EBI

This webinar introduces UniProt's AMR-related resources, focusing on protein classes like beta-lactamases and efflux pumps. Attendees will learn to navigate UniProt's website for AMR information. Aimed at students and early-career scientists, it offers a general overview of UniProt's role in AMR research.

Educational Corner

Proteomics bioinformatics | EMBL-EBI

This course offers hands-on training in mass spectrometry (MS) and proteomics bioinformatics, covering search engines, post-processing software, and quantitative approaches. Participants will learn to analyze raw proteomics data, navigate MS data repositories, and perform functional annotation of proteins. Aimed at research scientists, the course equips participants with practical bioinformatics skills for proteomics data analysis.

A Practical Guide to Data Normalization in R | R Bloggers

Data normalization is vital for standardizing numeric features so that they are treated comparably regardless of scale. This tutorial demonstrates data normalization in R through practical examples and detailed steps.

A starting guide on multi-omic single-cell data joint analysis: basic practices and results | bioRxiv

Multi-omics single-cell data offer insights into biological complexity by integrating different omics pools. However, leveraging this data poses challenges in consistent integration and analysis. This study presents a bioinformatic workflow combining existing methods to analyze transcriptomic and epigenomic single-cell data, advancing our understanding of cellular heterogeneity.

Connect with Us

Stay connected and engage with us on social media for daily updates, discussions, and more!

Subscribe

Don't miss an issue! Subscribe to the Bioinformer Weekly Roundup and receive the latest insights directly in your inbox.

Subscribe Now

We hope you enjoyed this week's edition of the Bioinformer Weekly Roundup. Feel free to share it with your colleagues and friends who share your passion for bioinformatics!


Disclaimer: The information provided in this newsletter is for educational and informational purposes only and does not constitute professional advice.

Contact: [email protected]

Copyright © 2024, Bioinformer Weekly Roundup. All rights reserved.

