WDL 1.2.0: Latest Release, CellBin: Spatial Transcriptomics Pipeline, ADMET-AI: Evaluating Large-scale Chemical Libraries using ML

Bioinformer Weekly Roundup

Stay Updated with the Latest in Bioinformatics!

Issue: 43 | Date: 28 June 2024

Welcome to the Bioinformer Weekly Roundup!

In this newsletter, we curate and bring you the most captivating stories, developments, and breakthroughs from the world of bioinformatics. Whether you're a seasoned researcher, a student, or simply curious about the intersection of biology and data science, we've got you covered. Subscribe now to stay ahead in the exciting realm of bioinformatics!

Featured Research

Cell type-specific expression of angiotensin receptors in the human lung with implications for health, aging, and chronic disease | bioRxiv

The study examines the role of the renin-angiotensin system in lung disorders using RNA-sequencing datasets. It identifies distinct localization patterns for two angiotensin receptors, AGTR1 and AGTR2, in the human lung. The study also finds an association between AGTR2 and lung phenotype, providing insights into this pathway’s role in lung homeostasis.

ETMR-14. THE SINGLE-CELL LANDSCAPE OF PINEOBLASTOMA IDENTIFIES AN ONCOGENIC PHOTORECEPTOR PROGRAM AS A TUMORIGENIC VULNERABILITY UNIFYING DISTINCT EMBRYONAL CNS TUMOR ENTITIES | PMC

Pineoblastoma (PB) is a rare childhood brain tumour. Using a single-nucleus RNA-sequencing cohort of primary PB tumours, the study maps the origins of PB subgroups to the mouse pineal gland’s developmental stages. It identifies CRX, OTX2, and NEUROD1 as master regulators of an oncogenic photoreceptor program across PB subgroups. Interestingly, this program is also prevalent in retinoblastoma and Group3 medulloblastoma, suggesting a shared oncogenic dependency among these distinct CNS tumours.

Prediction of Alzheimer's disease progression within 6 years using speech: A novel approach leveraging language models | Alzheimer's & Dementia

The study presents an automated method for predicting Alzheimer’s disease (AD) progression within 6 years in individuals with mild cognitive impairment (MCI). It uses natural language processing and machine learning techniques on speech data, along with age, sex, and education level. The models achieved an accuracy of 78.5% and a sensitivity of 81.1%. The study highlights the potential of AI-powered pipelines in facilitating remote and cost-effective screening and prognosis for Alzheimer’s disease.
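For readers less familiar with the two metrics quoted above, here is a minimal sketch of how accuracy and sensitivity are computed from a confusion matrix. The counts below are invented purely for illustration and are not the study's data; the function names are hypothetical as well.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true progressors that the model flagged (recall)."""
    return tp / (tp + fn)


# Hypothetical counts, NOT taken from the paper:
# 50 people who progressed to AD, 50 who did not.
tp, fn = 40, 10   # progressors caught / missed
tn, fp = 35, 15   # non-progressors cleared / falsely flagged

acc = accuracy(tp, tn, fp, fn)
sens = sensitivity(tp, fn)
print(f"accuracy = {acc:.1%}, sensitivity = {sens:.1%}")
```

Sensitivity matters for screening because it measures how many people who go on to develop the disease would be missed, which accuracy alone does not show.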

Can large language models understand molecules? | BMC Bioinformatics

The research explores the use of Large Language Models (LLMs) like GPT and LLaMA in cheminformatics, specifically in understanding SMILES, a method for representing chemical structures. The study observes that SMILES embeddings generated using LLaMA achieve better results than those from GPT in molecular property and drug-drug interaction prediction tasks. The findings suggest potential for further exploration of LLMs in molecular embedding.

Source code is available here.

Deep learning-based localization algorithms on fluorescence human brain 3D reconstruction: a comparative study using stereology as a reference | Scientific Reports

The study evaluates three deep learning techniques (StarDist, CellPose, and BCFind-v2) for 3D reconstruction of human brain volumes. It focuses on Broca’s area and compares the methods on predicted density, localization, computational efficiency, and human annotation effort. The results suggest that these techniques are effective in providing each cell’s 3D location and offer results comparable to the adopted stereological design.

Latest Tools

WDL 1.2.0: Enhancing Workflow Description Language for Bioinformatics | InfoQ

The Workflow Description Language (WDL) team has released WDL 1.2.0, a major update enhancing the flexibility and usability of workflow descriptions in bioinformatics. This version introduces features and enhancements to streamline workflow management and execution, simplifying the implementation and management of complex bioinformatics workflows for developers and researchers.

CellBin: a highly accurate single-cell gene expression processing pipeline for high-resolution spatial transcriptomics | bioRxiv

CellBin is a one-stop pipeline for high-resolution spatial transcriptomic data from Stereo-seq. It offers a comprehensive platform for generating high-confidence single-cell spatial gene expression profiles, covering image stitching, image registration, tissue segmentation, nuclei segmentation, and molecule labelling. The study highlights that CellBin is user-friendly and improves the signal-to-noise ratio of single-cell gene expression data. It has been shown to obtain accurate single-cell spatial data using mouse brain tissue.

DDN3.0: determining significant rewiring of biological network structure with differential dependency networks | Oxford Academic Bioinformatics

DDN3.0 is a tool for differential network analysis, which is crucial for understanding complex diseases. DDN3.0 enhances the framework with three efficient algorithms for unbiased model estimation, multiple acceleration strategies, and data-driven determination of hyperparameters. The tool is designed to jointly learn common and rewired network structures under different conditions, and it can help identify a network of significantly rewired molecular players potentially responsible for phenotypic transitions.

PredGCN: A Pruning-enabled Gene-Cell Net for Automatic Cell Annotation of Single Cell Transcriptome Data | Oxford Academic Bioinformatics

Pruning-enabled Gene-Cell Net (PredGCN) is a tool designed to address limitations in automatic cell type annotation from single-cell transcriptomics. PredGCN incorporates a Coupled Gene-Cell Net (CGCN) and integrates a Gene-Splicing Net (GSN) and a Cell Stratification Net (CSN) with a pruning operation. It leverages multiple feature extraction methods and region demarcation principles for precise cell identification.

Source code is available here.

Entourage: all-in-one sequence analysis software for genome assembly, virus detection, virus discovery, and intrasample variation profiling | BMC Bioinformatics

Entourage is a tool designed to address the challenges in pan-virus detection and virome investigation. Entourage enables short-read sequence assembly, viral sequence search, and intrasample sequence variation quantification. It offers end-to-end virus sequence detection analysis through a single command line. The tool’s utility is demonstrated through its application on HeLa cell culture samples and a preassembled Tara Oceans dataset.

PxBLAT: an efficient python binding library for BLAT | BMC Bioinformatics

PxBLAT is a Python-based framework designed to enhance the BLAT sequence alignment tool. PxBLAT offers significant improvements in execution speed and data handling, as demonstrated by benchmarks conducted across various sample groups. It also introduces user-friendly features such as improved server management, data conversion utilities, and shell completion.

Source code is available here.

ADMET-AI: A machine learning ADMET platform for evaluation of large-scale chemical libraries | Oxford Academic Bioinformatics

ADMET-AI, a machine learning platform, has been developed to provide quick and accurate predictions of Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET) properties. It is available both as a web-based ADMET predictor and as a Python package for local execution, which can generate predictions for one million molecules in just 3.1 hours.

Source code is available here.
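To put the ADMET-AI runtime in perspective, the quoted figure works out to roughly 90 molecules per second. The short calculation below uses only the two numbers from the summary; the variable names are ours.

```python
# Figures quoted in the summary above.
molecules = 1_000_000
hours = 3.1

seconds = hours * 3600
molecules_per_second = molecules / seconds
ms_per_molecule = 1000 * seconds / molecules

print(f"{molecules_per_second:.1f} molecules/s")   # 89.6 molecules/s
print(f"{ms_per_molecule:.2f} ms per molecule")    # 11.16 ms per molecule
```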

GENTANGLE: integrated computational design of gene entanglements | Oxford Academic Bioinformatics

GENTANGLE is a pipeline developed for the computational design of two overlapping genes in different reading frames of a microbial genome. This technique enhances the reliability of control mechanisms in engineered organisms. The software can be used to design and test gene entanglements for microbial engineering projects.

Source code is available here.
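As a rough illustration of what "two overlapping genes in different reading frames" means in the GENTANGLE summary, the sketch below translates one DNA string in two frames using the standard genetic code. This is not GENTANGLE's code or API; the `translate` helper and the toy sequence are made up for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate `dna` starting at offset `frame` (0, 1, or 2).

    Trailing bases that do not fill a complete codon are ignored.
    """
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# Toy sequence (hypothetical): the same bases read in two frames
# yield two different peptides.
seq = "ATGGCATGCAAA"
frame0 = translate(seq, 0)  # ATG GCA TGC AAA
frame1 = translate(seq, 1)  # TGG CAT GCA
print(frame0, frame1)
```

Because both peptides share the same nucleotides, a mutation that changes one will usually change the other, which is the coupling that entanglement designs rely on.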

Protein interaction explorer (PIE): a comprehensive platform for navigating Protein-Protein interactions and ligand binding pockets | Oxford Academic Bioinformatics

Protein Interaction Explorer (PIE) is a tool integrated with the iPPI-DB database, designed to support structure-based drug discovery initiatives focused on protein-protein interactions. It provides a comprehensive suite of tools to aid in decision-making in PPI drug discovery, including identifying and characterizing crucial factors such as binding pockets and predicting hot spots.

Source code is available here.

Hictk: blazing fast toolkit to work with .hic and .cool files | Oxford Academic Bioinformatics

hictk, a new toolkit, has been developed to operate on .hic and .cool files used in Hi-C data processing. It outperforms existing tools and provides the flexibility of working natively with both file formats. The toolkit includes a C++ library with Python and R bindings and CLI tools for common operations, including format conversion.

Source code is available here.

Community News

Diabetes decoded: How your gut microbiome influences disease risk | News Medical Life Sciences

A recent study in Nature Medicine explored the role of microbial features in type 2 diabetes. Scientists analysed over 8,000 shotgun metagenomic sequences from individuals with varying glycaemic statuses to understand how specific subspecies and strains contribute to the disease’s pathological mechanisms. The study found that gut microbiome dysbiosis plays a role in mechanisms such as glucose metabolism and butyrate fermentation, along with other findings that provide insights into the gut microbiome's association with type 2 diabetes.

New Computational Tool Elucidates How Deep Neural Networks Interpret Genomic Data | GEN Genetic Engineering Biotechnology News

A new computational tool, Surrogate Quantitative Interpretability for Deepnets (SQUID), has been developed by the Simons Center for Quantitative Biology at Cold Spring Harbor Laboratory. The tool uses surrogate models to interpret how deep neural networks analyse genomic data, bringing us closer to understanding the inner workings of AI in genomics.

Educational Corner

How to add boxplots or density plots side-by-side a scatterplot: a single cell case study | DNA Confesses Data Speaks

The blog post highlights the ggside R package as a powerful tool for visualizing data, particularly in the context of single-cell RNA sequencing (scRNA-seq) datasets. By leveraging the flexibility of ggplot2, ggside enables users to create side-by-side plots that simultaneously display multiple variables such as gene expression, cell types, and experimental conditions.

Connect with Us

Stay connected and engage with us on social media for daily updates, discussions, and more!

Subscribe

Don't miss an issue! Subscribe to the Bioinformer Weekly Roundup and receive the latest insights directly in your inbox.

Subscribe Now

We hope you enjoyed this week's edition of the Bioinformer Weekly Roundup. Feel free to share it with your colleagues and friends who share your passion for bioinformatics!


Disclaimer: The information provided in this newsletter is for educational and informational purposes only and does not constitute professional advice.

Contact: [email protected]

Copyright © 2024, Bioinformer Weekly Roundup. All rights reserved.


