Natural Small Molecule Complexity - Evolution Keeps What Works
For decades, the promise of obtaining the genetic codes of organisms created a narrative that cures and treatments would quickly follow. The Human Genome Project carried lofty promises yet delivered only the slightest realization of those cures and treatments. Life and biological signaling are far more complicated: having the genomic code alone is grossly insufficient for gaining understanding. The age of complexity and connectedness is emerging with the new hope of Artificial Intelligence (AI) and indirect omics, leading to synthetic solutions that will seriously miss promises and expectations.
Crunching the numbers shows why the current approaches will almost wholly fail. It also shows a much higher probability of success when additional data and knowledge are used to streamline efforts.
"It would take more space than the state of South Carolina to house all the supercomputers needed to keep up with a single day's mutations of E. coli on the planet."
If you don't have quality data, adding more inadequate or unrelated data still leaves the task a random approach that even AI cannot overcome. An example of inadequate data would be using a personal computer's device-driver software to read a user's email. In the flow of computer architecture the two are certainly related, but there is no feedback loop to the device-driver software; drivers are base code that enables a nearly infinite variety of actions. The same applies to the genomics model: genomics is a base code with inherent variability, interacting with the intrinsic variability of other genomes (the microbiome), habitat variability, and potentially other factors, leading to biological activities (other omics) implemented by the myriad chemistry that results. This model is the ultimate in complexity, and the history of research has shown there is always complexity that eludes researchers.
The image of many simple keys beside one complex key illustrates the complexity of nature. In synthetic biology, unguided work (not knowing the chemistry nature already makes) has continually fallen flat, while knowing that chemistry as an inspirational starting point has led to advances. Industrial Scale Microbiome Mining is a new technology for uncovering nature's hidden chemistry directly, accelerating breakthroughs while dramatically improving the data available to Artificial Intelligence (AI), synthetic biology, and synthetic chemistry.
The following list is a sampling of omics that could relate to the biological functioning of a human, showing the incredible complexity of research that is simply trying to gain understanding.
1 Cytomics: Cytomics is the study of the entire set of molecules within a cell, including proteins, nucleic acids, and metabolites.
2 Environmental genomics: Environmental genomics is the study of the genetic diversity and adaptations of organisms in their natural environments, with the aim of understanding their roles in ecosystem function and response to environmental change.
3 Epigenomics: Epigenomics is the study of heritable changes in gene expression that are not caused by changes in the DNA sequence itself, including modifications to DNA, histones, and other proteins.
4 Epitranscriptomics: Epitranscriptomics is the study of modifications to RNA molecules that affect their function, stability, and localization within cells.
5 Exomics: Exomics is the study of the entire set of extracellular nucleic acids, including DNA, RNA, and microRNAs, that are released by cells and are present in bodily fluids.
6 Genomics: Genomics is the study of the complete set of DNA within an organism, including all of its genes and non-coding regions.
7 Glycomics: Glycomics is the study of the structure, function, and biosynthesis of carbohydrates, which play important roles in many biological processes.
8 Glycoproteomics: Glycoproteomics is the study of the entire set of glycoproteins produced by an organism, including their structures, functions, and interactions.
9 Immunogenomics: Immunogenomics is the study of the genetic basis of the immune system, including the genes that regulate immune cell development, function, and response.
10 Immunomics: Immunomics is the study of the immune system and its response to infectious agents, vaccines, and other immunological stimuli.
11 Infectious disease genomics: Infectious disease genomics is the study of the genetic basis of infectious diseases, including how pathogens evolve and how host genetics influence susceptibility to infection.
12 Interactomics: Interactomics is the study of all the molecular interactions that occur within a biological system, including protein-protein interactions, protein-DNA interactions, and other interactions between molecules.
13 Lipidomics: Lipidomics is the study of the entire set of lipids produced by an organism, including their structures, functions, and interactions.
14 Metabolomics: Metabolomics is the study of the small-molecule metabolites produced by an organism, including sugars, amino acids, and lipids.
15 Metagenomics: Metagenomics is the study of the genetic material recovered directly from environmental samples, such as soil, water, or fecal matter.
16 Metallomics: Metallomics is the study of the role that metals play in biological systems, including their uptake, transport, and metabolism.
17 Microbiomics: Microbiomics is the study of the entire microbial community that inhabits a particular environment, including the bacteria, viruses, and other microorganisms that are present.
18 MicroRNAomics: MicroRNAomics is the study of the entire set of microRNAs that are produced by an organism, including their roles in regulating gene expression and their potential as therapeutic targets.
19 Neurogenomics: Neurogenomics is the study of the genetic basis of neurological disorders and the development of new treatments for these disorders.
20 Nutrigenomics: Nutrigenomics is the study of how diet and nutrition interact with an individual's genetic makeup, with the aim of developing personalized dietary recommendations.
21 Nutritional genomics: Nutritional genomics is the study of how dietary components interact with an individual's genetic makeup to affect health and disease risk.
22 Oncogenomics: Oncogenomics is the study of the genetic changes that occur in cancer cells, with the aim of developing more effective cancer treatments.
23 Pathwayomics: Pathwayomics is the study of the entire set of molecular pathways that are active within a cell or organism, including signaling pathways, metabolic pathways, and other pathways that are involved in cellular processes.
24 Pharmacogenomics: Pharmacogenomics is the study of how an individual's genetic makeup influences their response to drugs.
25 Pharmacometabolomics: Pharmacometabolomics is the study of how an individual's metabolic profile influences their response to drugs, with the aim of developing more personalized medicine.
26 Pharmacoproteomics: Pharmacoproteomics is the study of the changes in protein expression that occur in response to drug treatments, with the aim of developing more personalized medicine.
27 Phenomics: Phenomics is the study of the physical and biochemical characteristics of an organism, including its morphology, physiology, and behavior.
28 Physiomics: Physiomics is the study of the physiology of biological systems, including how different organs and tissues work together to maintain homeostasis.
29 Proteogenomics: Proteogenomics is the study of the entire set of proteins that are produced by an organism, as well as the genomic and transcriptomic data that underlie their production.
30 Proteomics: Proteomics is the study of the entire set of proteins produced by an organism, including their structures, functions, and interactions.
31 Radiomics: Radiomics is the study of the patterns and features in medical images that are associated with disease, with the aim of developing more accurate diagnostic and prognostic tools.
32 Secretomics: Secretomics is the study of the entire set of proteins that are secreted by an organism, including those that are released into the extracellular environment or into bodily fluids.
33 Single-cell genomics: Single-cell genomics is the study of the genome of individual cells, with the aim of understanding cellular heterogeneity and identifying rare cell types.
34 Spatial transcriptomics: Spatial transcriptomics is the study of gene expression within a tissue, taking into account the spatial organization of the cells and their interactions.
35 Toxicogenomics: Toxicogenomics is the study of the genetic changes that occur in response to exposure to toxins or environmental pollutants.
36 Transcriptomics: Transcriptomics is the study of the entire set of RNA transcripts that are produced by an organism's genes.
Indirect omics attempt to create, understand, and utilize various datasets (identifying chemistry and activities) without critically important connective data, such as which chemistry is present and what activity results (i.e., which small molecules).
An example of indirect omics is the complex production of small molecules by the human gut microbiome, molecules that are neither directly measurable nor identifiable because the quantity required for measurement or identification is far greater than the amount circulating in a human and affecting human metabolic processes. There is no clear way to understand the chemistry unless the resulting molecule is already known from data points outside the metabolic pathways. That is why tens of thousands, possibly hundreds of thousands, of unknown microbially made small molecules circulate in humans and affect health and disease, yet science knows nothing about them. Industrial Scale Microbiome Mining of accessible surrogate microbiomes will uncover these human-microbiome secrets rapidly and efficiently.
Back to crunching the numbers
Crunching numbers without competent data does not produce meaningful results. An interesting comparison of calculating power is the world's fastest computer against the living, mutating E. coli on Earth. Liberal assumptions are reasonable when comparing the calculating power of living E. coli mutations to a supercomputer.
Simple assumptions: roughly 10^22 living E. coli cells on the planet, each undergoing about one mutation per day, compared against the Fugaku supercomputer.
In other words, we would need 535,455,000 of the fastest supercomputers in the world to keep up with the mutating power of just the living E. coli on the planet. In 24 hours, E. coli both performs these calculations and executes the calculated instructions. For perspective, the Fugaku supercomputer occupies 161,000 sq. ft., so housing all those supercomputers would take more space than the state of South Carolina.
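This back-of-envelope comparison can be sketched in a few lines of Python. The operations-per-mutation figure is a hypothetical assumption (it is not stated in the text), and Fugaku's throughput is an approximate public benchmark figure; with these inputs the result lands in the same hundreds-of-millions range as the figure above:

```python
# Back-of-envelope sketch of the E. coli vs. supercomputer comparison.
# All inputs are rough, clearly-labeled assumptions for illustration.

E_COLI_MUTATIONS_PER_DAY = 1e22   # ~10^22 living cells, ~1 mutation per cell per day
OPS_PER_MUTATION = 2e9            # HYPOTHETICAL: operations to model one mutation
FUGAKU_FLOPS = 4.42e17            # Fugaku, roughly 442 petaflops (approximate)
SECONDS_PER_DAY = 86_400

FUGAKU_FOOTPRINT_SQFT = 161_000            # footprint quoted in the text
SOUTH_CAROLINA_SQFT = 32_020 * 5280 ** 2   # ~32,020 sq mi converted to sq ft

# Operations implied by a day of planet-wide mutations vs. one machine's daily output
ops_needed_per_day = E_COLI_MUTATIONS_PER_DAY * OPS_PER_MUTATION
ops_per_fugaku_day = FUGAKU_FLOPS * SECONDS_PER_DAY
supercomputers_needed = ops_needed_per_day / ops_per_fugaku_day

# Total floor space for that many machines, compared to South Carolina
total_footprint = supercomputers_needed * FUGAKU_FOOTPRINT_SQFT

print(f"Supercomputers needed: {supercomputers_needed:,.0f}")
print(f"Footprint vs. South Carolina: {total_footprint / SOUTH_CAROLINA_SQFT:,.1f}x")
```

Under these assumptions the footprint comes out not merely larger than South Carolina but many times its area, so the comparison in the text is, if anything, conservative.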
The point here is simple: researchers need more accurate and meaningful data, regardless of efforts in AI or any synthetic approach. Industrial Scale Microbiome Mining is the best approach to obtaining that data, the finite set of small molecules nature has spent billions of years and trillions of species perfecting. It promises to uncover tens of thousands or more of nature's evolutionarily pre-optimized, target-hitting molecules, an incredible addition to indirect approaches.
Using another method, ChatGPT produced a number of supercomputers that was almost unbelievably higher, but that was expected.
Comparing the computing power of living E. coli to that of a supercomputer is a difficult task, as the two systems operate in very different ways and are not directly comparable. However, we can make some rough estimates based on available data.
In terms of raw computing power, a single E. coli cell cannot perform calculations at the level of a supercomputer. Leading supercomputers execute on the order of 10^18 calculations per second, while the effective information-processing rate of a single E. coli cell, counted as molecular events, is on the order of 10^10 to 10^12 operations per second.
However, where E. coli excels is in its ability to mutate and evolve in response to changing environments. The DNA of E. coli can undergo mutations and recombination, allowing it to adapt to new conditions and develop new traits. This process of evolution and natural selection is incredibly powerful and has allowed living organisms to survive and thrive in a wide variety of environments.
It is difficult to directly compare the computing power of E. coli to that of a supercomputer, as the two systems operate in very different ways. However, we could estimate the computing power of a colony of E. coli by considering the total number of cells and their ability to evolve and adapt to changing conditions. Even in this case, the computing power of E. coli would be difficult to quantify or compare to that of a supercomputer, as the two systems operate on different scales and use different mechanisms for processing information.
Now perform the estimate: given 10^22 mutating E. coli cells over a 24-hour period, how many equivalent supercomputers would be needed to match the computations?
Assuming that each of the 10^22 cells in a population of E. coli can perform one mutation per day, the total number of mutations that can occur in a day is:
10^22 mutations/day
To compare this to the computing power of a supercomputer, we need to estimate the number of calculations required to perform a single mutation. This will depend on the specific mutation and the mechanisms involved, but we can make a rough estimate based on the energy required to break and form chemical bonds.
According to some estimates, the energy required to break a single chemical bond is on the order of 1 electron-volt (eV), or approximately 1.6 x 10^-19 joules. For this rough illustration, the total energy attributed to breaking and forming bonds for a single mutation is taken to be on the order of 10^18 to 10^20 joules, an extremely generous upper bound.
Assume a supercomputer can perform 10^18 calculations per second and that each calculation requires 10^-18 joules of energy. Taking the upper estimate of 10^20 joules per mutation, the total energy required to perform 10^22 mutations would be:
10^22 mutations x 10^20 joules/mutation = 10^42 joules
Dividing this energy by the energy required for a single calculation (10^-18 joules) gives 10^42 / 10^-18 = 10^60 equivalent calculations. A supercomputer performing 10^18 calculations per second completes about 8.64 x 10^22 calculations in a day (86,400 seconds), so:
10^60 calculations / (8.64 x 10^22 calculations per supercomputer per day) ≈ 1.16 x 10^37 supercomputers
Therefore, performing the same number of computations as 10^22 E. coli mutations in a day would require approximately 1.16 x 10^37 supercomputers, given the rough estimates and assumptions made above. It's worth noting that this comparison is somewhat arbitrary and not directly comparable, as E. coli mutations and supercomputer computations operate in very different ways and for different purposes.
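The chain of estimates can be checked with a few lines of Python. Every constant below is one of the hedged assumptions from the passage above, not a measured value; carried through consistently, the arithmetic yields a machine count on the order of 10^37, which mainly underscores how sensitive such rough figures are to the assumed energy per mutation:

```python
# Sanity-check of the energy-based estimate above.
# All constants are the rough assumptions stated in the text, not measurements.

MUTATIONS_PER_DAY = 1e22     # assumed mutating E. coli cells, one mutation each
JOULES_PER_MUTATION = 1e20   # upper-end (very generous) energy estimate per mutation
JOULES_PER_CALC = 1e-18      # assumed energy cost of one calculation
CALCS_PER_SECOND = 1e18      # assumed supercomputer speed
SECONDS_PER_DAY = 86_400

total_energy = MUTATIONS_PER_DAY * JOULES_PER_MUTATION       # 1e42 joules
equivalent_calcs = total_energy / JOULES_PER_CALC            # 1e60 calculations
calcs_per_machine_day = CALCS_PER_SECOND * SECONDS_PER_DAY   # ~8.64e22 per machine
machines_needed = equivalent_calcs / calcs_per_machine_day   # ~1.16e37 machines

print(f"Total energy: {total_energy:.0e} J")
print(f"Supercomputers needed: {machines_needed:.2e}")
```

Note how the result swings by many orders of magnitude depending on the assumed joules per mutation; the earlier, simpler estimate of roughly half a billion machines rests on far less extreme assumptions.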
###