Dredd Blog: 2022-04-10

Friday, April 15, 2022

Quantum Biology - 5

I. Background

This series deals with something a lot of biologists, including microbiologists, don't like to consider (Quantum Biology, 2, 3, 4).

More specifically, this series of Dredd Blog posts deals with the DNA and RNA nomenclature used in GenBank and other genomic repositories, which fail to properly distinguish between RNA and DNA, to the detriment of our understanding (It's In The GenBank, 2, 3, 4, 5).

For example, in the first two minutes of the first video at the bottom of the page, the professor pooh-poohs the difference between the two (the rest of the video is worthwhile).

Today I want to criticize that "parts is parts" approach to DNA and RNA molecules, just as I would if someone told me that carbon-12, carbon-13, and carbon-14 aren't different enough to be a big deal (DOE).

Atoms matter!

II. Appendices

The appendices to today's post show mutant codons (Appendix One) and the differences in atom counts (Appendix Two) in one nucleotide of DNA (T, thymine) compared to the relevant nucleotide of RNA (U, uracil).

In Appendix Two, each DNA codon line is followed on the next line by the line for the corresponding mRNA codon (mRNA codons have "U" instead of "T").

This (Appendix Two) shows that the atom counts in DNA 'T' do not match the atom counts of RNA 'U' ('T' parts are not 'U' parts).

Considerable skill is required to move atoms out of or into a molecule, yet that happens during DNA -> mRNA transcription (see videos below).

In effect, one carbon atom and two hydrogen atoms are removed from 'T' (C₅H₆N₂O₂), which results in 'U' (C₄H₄N₂O₂), during codon transcription processing.
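
The atom arithmetic behind that statement can be checked with a minimal sketch. It only compares the two base formulas listed in the appendix; the helper name `parse_formula` is an assumption made for illustration and is not part of the software described in this series.

```python
import re
from collections import Counter

def parse_formula(formula):
    """Turn a simple formula such as 'C5H6N2O2' into element counts."""
    counts = Counter()
    for element, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(digits) if digits else 1
    return counts

THYMINE = parse_formula("C5H6N2O2")  # 'T' as listed in the appendix
URACIL = parse_formula("C4H4N2O2")   # 'U' as listed in the appendix

# Counter subtraction keeps only the elements where T has more atoms than U.
difference = THYMINE - URACIL
print(dict(difference))  # {'C': 1, 'H': 2}
```

The nitrogen and oxygen counts cancel out, so only one carbon and two hydrogens remain in the difference.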

As you can see in those appendices, this is done in the cells of eukaryotes millions and millions of times daily.

III. Closing Comments

The common practice of some scientists these days is to play dolls with microbes (The Doll As Metaphor, 2, 3, 4).

I say that because way too many scientists support the overuse of antibiotics and other chemicals which kill so many beneficial microbes that they pollute the microbiome with chimeric fogs:

"Chimeras arise when a cell undergoes mutation. This mutation may be spontaneous or it may be induced by irradiation or treatment with chemical mutagens. If the cell which mutates is located near the crest of the apical dome, then all other cells which are produced by division from it will also be the mutated type. The result will be cells of different genotypes growing adjacent in a plant tissue, the definition of a chimera."

(Origin of Chimeras, cf. Soil Chemical and Microbiological Properties Are Changed, Some Of My Best Friends Are Germs - 2, It's In The GenBank - 4). It is not something to toy around with.

The next post in this series is here, the previous post in this series is here.

Appendix QB5 DNARNA

This is an appendix to: Quantum Biology - 5

Valid (DNA/RNA) Codon-Molecule Atom Counts
Codon	Base 1	Base 2	Base 3	Combined Atom Count
ATT	C₅H₅N₅	C₅H₆N₂O₂	C₅H₆N₂O₂	C₁₅H₁₇N₉O₄
AUU	C₅H₅N₅	C₄H₄N₂O₂	C₄H₄N₂O₂	C₁₃H₁₃N₉O₄
ATC	C₅H₅N₅	C₅H₆N₂O₂	C₄H₅N₃O₁	C₁₄H₁₆N₁₀O₃
AUC	C₅H₅N₅	C₄H₄N₂O₂	C₄H₅N₃O₁	C₁₃H₁₄N₁₀O₃
ATA	C₅H₅N₅	C₅H₆N₂O₂	C₅H₅N₅	C₁₅H₁₆N₁₂O₂
AUA	C₅H₅N₅	C₄H₄N₂O₂	C₅H₅N₅	C₁₄H₁₄N₁₂O₂
CTT	C₄H₅N₃O₁	C₅H₆N₂O₂	C₅H₆N₂O₂	C₁₄H₁₇N₇O₅
CUU	C₄H₅N₃O₁	C₄H₄N₂O₂	C₄H₄N₂O₂	C₁₂H₁₃N₇O₅
CTC	C₄H₅N₃O₁	C₅H₆N₂O₂	C₄H₅N₃O₁	C₁₃H₁₆N₈O₄
CUC	C₄H₅N₃O₁	C₄H₄N₂O₂	C₄H₅N₃O₁	C₁₂H₁₄N₈O₄
CTA	C₄H₅N₃O₁	C₅H₆N₂O₂	C₅H₅N₅	C₁₄H₁₆N₁₀O₃
CUA	C₄H₅N₃O₁	C₄H₄N₂O₂	C₅H₅N₅	C₁₃H₁₄N₁₀O₃
CTG	C₄H₅N₃O₁	C₅H₆N₂O₂	C₅H₅N₅O₁	C₁₄H₁₆N₁₀O₄
CUG	C₄H₅N₃O₁	C₄H₄N₂O₂	C₅H₅N₅O₁	C₁₃H₁₄N₁₀O₄
TTA	C₅H₆N₂O₂	C₅H₆N₂O₂	C₅H₅N₅	C₁₅H₁₇N₉O₄
UUA	C₄H₄N₂O₂	C₄H₄N₂O₂	C₅H₅N₅	C₁₃H₁₃N₉O₄
TTG	C₅H₆N₂O₂	C₅H₆N₂O₂	C₅H₅N₅O₁	C₁₅H₁₇N₉O₅
UUG	C₄H₄N₂O₂	C₄H₄N₂O₂	C₅H₅N₅O₁	C₁₃H₁₃N₉O₅
GTT	C₅H₅N₅O₁	C₅H₆N₂O₂	C₅H₆N₂O₂	C₁₅H₁₇N₉O₅
GUU	C₅H₅N₅O₁	C₄H₄N₂O₂	C₄H₄N₂O₂	C₁₃H₁₃N₉O₅
GTC	C₅H₅N₅O₁	C₅H₆N₂O₂	C₄H₅N₃O₁	C₁₄H₁₆N₁₀O₄
GUC	C₅H₅N₅O₁	C₄H₄N₂O₂	C₄H₅N₃O₁	C₁₃H₁₄N₁₀O₄
GTA	C₅H₅N₅O₁	C₅H₆N₂O₂	C₅H₅N₅	C₁₅H₁₆N₁₂O₃
GUA	C₅H₅N₅O₁	C₄H₄N₂O₂	C₅H₅N₅	C₁₄H₁₄N₁₂O₃
GTG	C₅H₅N₅O₁	C₅H₆N₂O₂	C₅H₅N₅O₁	C₁₅H₁₆N₁₂O₄
GUG	C₅H₅N₅O₁	C₄H₄N₂O₂	C₅H₅N₅O₁	C₁₄H₁₄N₁₂O₄
TTT	C₅H₆N₂O₂	C₅H₆N₂O₂	C₅H₆N₂O₂	C₁₅H₁₈N₆O₆
UUU	C₄H₄N₂O₂	C₄H₄N₂O₂	C₄H₄N₂O₂	C₁₂H₁₂N₆O₆
TTC	C₅H₆N₂O₂	C₅H₆N₂O₂	C₄H₅N₃O₁	C₁₄H₁₇N₇O₅
UUC	C₄H₄N₂O₂	C₄H₄N₂O₂	C₄H₅N₃O₁	C₁₂H₁₃N₇O₅
ATG	C₅H₅N₅	C₅H₆N₂O₂	C₅H₅N₅O₁	C₁₅H₁₆N₁₂O₃
AUG	C₅H₅N₅	C₄H₄N₂O₂	C₅H₅N₅O₁	C₁₄H₁₄N₁₂O₃
TGT	C₅H₆N₂O₂	C₅H₅N₅O₁	C₅H₆N₂O₂	C₁₅H₁₇N₉O₅
UGU	C₄H₄N₂O₂	C₅H₅N₅O₁	C₄H₄N₂O₂	C₁₃H₁₃N₉O₅
TGC	C₅H₆N₂O₂	C₅H₅N₅O₁	C₄H₅N₃O₁	C₁₄H₁₆N₁₀O₄
UGC	C₄H₄N₂O₂	C₅H₅N₅O₁	C₄H₅N₃O₁	C₁₃H₁₄N₁₀O₄
GCT	C₅H₅N₅O₁	C₄H₅N₃O₁	C₅H₆N₂O₂	C₁₄H₁₆N₁₀O₄
GCU	C₅H₅N₅O₁	C₄H₅N₃O₁	C₄H₄N₂O₂	C₁₃H₁₄N₁₀O₄
GCC	C₅H₅N₅O₁	C₄H₅N₃O₁	C₄H₅N₃O₁	C₁₃H₁₅N₁₁O₃
GCC	C₅H₅N₅O₁	C₄H₅N₃O₁	C₄H₅N₃O₁	C₁₃H₁₅N₁₁O₃
GCA	C₅H₅N₅O₁	C₄H₅N₃O₁	C₅H₅N₅	C₁₄H₁₅N₁₃O₂
GCA	C₅H₅N₅O₁	C₄H₅N₃O₁	C₅H₅N₅	C₁₄H₁₅N₁₃O₂
GCG	C₅H₅N₅O₁	C₄H₅N₃O₁	C₅H₅N₅O₁	C₁₄H₁₅N₁₃O₃
GCG	C₅H₅N₅O₁	C₄H₅N₃O₁	C₅H₅N₅O₁	C₁₄H₁₅N₁₃O₃
GGT	C₅H₅N₅O₁	C₅H₅N₅O₁	C₅H₆N₂O₂	C₁₅H₁₆N₁₂O₄
GGU	C₅H₅N₅O₁	C₅H₅N₅O₁	C₄H₄N₂O₂	C₁₄H₁₄N₁₂O₄
GGC	C₅H₅N₅O₁	C₅H₅N₅O₁	C₄H₅N₃O₁	C₁₄H₁₅N₁₃O₃
GGC	C₅H₅N₅O₁	C₅H₅N₅O₁	C₄H₅N₃O₁	C₁₄H₁₅N₁₃O₃
GGA	C₅H₅N₅O₁	C₅H₅N₅O₁	C₅H₅N₅	C₁₅H₁₅N₁₅O₂
GGA	C₅H₅N₅O₁	C₅H₅N₅O₁	C₅H₅N₅	C₁₅H₁₅N₁₅O₂
GGG	C₅H₅N₅O₁	C₅H₅N₅O₁	C₅H₅N₅O₁	C₁₅H₁₅N₁₅O₃
GGG	C₅H₅N₅O₁	C₅H₅N₅O₁	C₅H₅N₅O₁	C₁₅H₁₅N₁₅O₃
CCT	C₄H₅N₃O₁	C₄H₅N₃O₁	C₅H₆N₂O₂	C₁₃H₁₆N₈O₄
CCU	C₄H₅N₃O₁	C₄H₅N₃O₁	C₄H₄N₂O₂	C₁₂H₁₄N₈O₄
CCC	C₄H₅N₃O₁	C₄H₅N₃O₁	C₄H₅N₃O₁	C₁₂H₁₅N₉O₃
CCC	C₄H₅N₃O₁	C₄H₅N₃O₁	C₄H₅N₃O₁	C₁₂H₁₅N₉O₃
CCA	C₄H₅N₃O₁	C₄H₅N₃O₁	C₅H₅N₅	C₁₃H₁₅N₁₁O₂
CCA	C₄H₅N₃O₁	C₄H₅N₃O₁	C₅H₅N₅	C₁₃H₁₅N₁₁O₂
CCG	C₄H₅N₃O₁	C₄H₅N₃O₁	C₅H₅N₅O₁	C₁₃H₁₅N₁₁O₃
CCG	C₄H₅N₃O₁	C₄H₅N₃O₁	C₅H₅N₅O₁	C₁₃H₁₅N₁₁O₃
ACT	C₅H₅N₅	C₄H₅N₃O₁	C₅H₆N₂O₂	C₁₄H₁₆N₁₀O₃
ACU	C₅H₅N₅	C₄H₅N₃O₁	C₄H₄N₂O₂	C₁₃H₁₄N₁₀O₃
ACC	C₅H₅N₅	C₄H₅N₃O₁	C₄H₅N₃O₁	C₁₃H₁₅N₁₁O₂
ACC	C₅H₅N₅	C₄H₅N₃O₁	C₄H₅N₃O₁	C₁₃H₁₅N₁₁O₂
ACA	C₅H₅N₅	C₄H₅N₃O₁	C₅H₅N₅	C₁₄H₁₅N₁₃O₁
ACA	C₅H₅N₅	C₄H₅N₃O₁	C₅H₅N₅	C₁₄H₁₅N₁₃O₁
ACG	C₅H₅N₅	C₄H₅N₃O₁	C₅H₅N₅O₁	C₁₄H₁₅N₁₃O₂
ACG	C₅H₅N₅	C₄H₅N₃O₁	C₅H₅N₅O₁	C₁₄H₁₅N₁₃O₂
TCT	C₅H₆N₂O₂	C₄H₅N₃O₁	C₅H₆N₂O₂	C₁₄H₁₇N₇O₅
UCU	C₄H₄N₂O₂	C₄H₅N₃O₁	C₄H₄N₂O₂	C₁₂H₁₃N₇O₅
TCC	C₅H₆N₂O₂	C₄H₅N₃O₁	C₄H₅N₃O₁	C₁₃H₁₆N₈O₄
UCC	C₄H₄N₂O₂	C₄H₅N₃O₁	C₄H₅N₃O₁	C₁₂H₁₄N₈O₄
TCA	C₅H₆N₂O₂	C₄H₅N₃O₁	C₅H₅N₅	C₁₄H₁₆N₁₀O₃
UCA	C₄H₄N₂O₂	C₄H₅N₃O₁	C₅H₅N₅	C₁₃H₁₄N₁₀O₃
TCG	C₅H₆N₂O₂	C₄H₅N₃O₁	C₅H₅N₅O₁	C₁₄H₁₆N₁₀O₄
UCG	C₄H₄N₂O₂	C₄H₅N₃O₁	C₅H₅N₅O₁	C₁₃H₁₄N₁₀O₄
AGT	C₅H₅N₅	C₅H₅N₅O₁	C₅H₆N₂O₂	C₁₅H₁₆N₁₂O₃
AGU	C₅H₅N₅	C₅H₅N₅O₁	C₄H₄N₂O₂	C₁₄H₁₄N₁₂O₃
AGC	C₅H₅N₅	C₅H₅N₅O₁	C₄H₅N₃O₁	C₁₄H₁₅N₁₃O₂
AGC	C₅H₅N₅	C₅H₅N₅O₁	C₄H₅N₃O₁	C₁₄H₁₅N₁₃O₂
TAT	C₅H₆N₂O₂	C₅H₅N₅	C₅H₆N₂O₂	C₁₅H₁₇N₉O₄
UAU	C₄H₄N₂O₂	C₅H₅N₅	C₄H₄N₂O₂	C₁₃H₁₃N₉O₄
TAC	C₅H₆N₂O₂	C₅H₅N₅	C₄H₅N₃O₁	C₁₄H₁₆N₁₀O₃
UAC	C₄H₄N₂O₂	C₅H₅N₅	C₄H₅N₃O₁	C₁₃H₁₄N₁₀O₃
TGG	C₅H₆N₂O₂	C₅H₅N₅O₁	C₅H₅N₅O₁	C₁₅H₁₆N₁₂O₄
UGG	C₄H₄N₂O₂	C₅H₅N₅O₁	C₅H₅N₅O₁	C₁₄H₁₄N₁₂O₄
CAA	C₄H₅N₃O₁	C₅H₅N₅	C₅H₅N₅	C₁₄H₁₅N₁₃O₁
CAA	C₄H₅N₃O₁	C₅H₅N₅	C₅H₅N₅	C₁₄H₁₅N₁₃O₁
CAG	C₄H₅N₃O₁	C₅H₅N₅	C₅H₅N₅O₁	C₁₄H₁₅N₁₃O₂
CAG	C₄H₅N₃O₁	C₅H₅N₅	C₅H₅N₅O₁	C₁₄H₁₅N₁₃O₂
AAT	C₅H₅N₅	C₅H₅N₅	C₅H₆N₂O₂	C₁₅H₁₆N₁₂O₂
AAU	C₅H₅N₅	C₅H₅N₅	C₄H₄N₂O₂	C₁₄H₁₄N₁₂O₂
AAC	C₅H₅N₅	C₅H₅N₅	C₄H₅N₃O₁	C₁₄H₁₅N₁₃O₁
AAC	C₅H₅N₅	C₅H₅N₅	C₄H₅N₃O₁	C₁₄H₁₅N₁₃O₁
CAT	C₄H₅N₃O₁	C₅H₅N₅	C₅H₆N₂O₂	C₁₄H₁₆N₁₀O₃
CAU	C₄H₅N₃O₁	C₅H₅N₅	C₄H₄N₂O₂	C₁₃H₁₄N₁₀O₃
CAC	C₄H₅N₃O₁	C₅H₅N₅	C₄H₅N₃O₁	C₁₃H₁₅N₁₁O₂
CAC	C₄H₅N₃O₁	C₅H₅N₅	C₄H₅N₃O₁	C₁₃H₁₅N₁₁O₂
GAA	C₅H₅N₅O₁	C₅H₅N₅	C₅H₅N₅	C₁₅H₁₅N₁₅O₁
GAA	C₅H₅N₅O₁	C₅H₅N₅	C₅H₅N₅	C₁₅H₁₅N₁₅O₁
GAG	C₅H₅N₅O₁	C₅H₅N₅	C₅H₅N₅O₁	C₁₅H₁₅N₁₅O₂
GAG	C₅H₅N₅O₁	C₅H₅N₅	C₅H₅N₅O₁	C₁₅H₁₅N₁₅O₂
GAT	C₅H₅N₅O₁	C₅H₅N₅	C₅H₆N₂O₂	C₁₅H₁₆N₁₂O₃
GAU	C₅H₅N₅O₁	C₅H₅N₅	C₄H₄N₂O₂	C₁₄H₁₄N₁₂O₃
GAC	C₅H₅N₅O₁	C₅H₅N₅	C₄H₅N₃O₁	C₁₄H₁₅N₁₃O₂
GAC	C₅H₅N₅O₁	C₅H₅N₅	C₄H₅N₃O₁	C₁₄H₁₅N₁₃O₂
AAA	C₅H₅N₅	C₅H₅N₅	C₅H₅N₅	C₁₅H₁₅N₁₅O₀
AAA	C₅H₅N₅	C₅H₅N₅	C₅H₅N₅	C₁₅H₁₅N₁₅O₀
AAG	C₅H₅N₅	C₅H₅N₅	C₅H₅N₅O₁	C₁₅H₁₅N₁₅O₁
AAG	C₅H₅N₅	C₅H₅N₅	C₅H₅N₅O₁	C₁₅H₁₅N₁₅O₁
CGT	C₄H₅N₃O₁	C₅H₅N₅O₁	C₅H₆N₂O₂	C₁₄H₁₆N₁₀O₄
CGU	C₄H₅N₃O₁	C₅H₅N₅O₁	C₄H₄N₂O₂	C₁₃H₁₄N₁₀O₄
CGC	C₄H₅N₃O₁	C₅H₅N₅O₁	C₄H₅N₃O₁	C₁₃H₁₅N₁₁O₃
CGC	C₄H₅N₃O₁	C₅H₅N₅O₁	C₄H₅N₃O₁	C₁₃H₁₅N₁₁O₃
CGA	C₄H₅N₃O₁	C₅H₅N₅O₁	C₅H₅N₅	C₁₄H₁₅N₁₃O₂
CGA	C₄H₅N₃O₁	C₅H₅N₅O₁	C₅H₅N₅	C₁₄H₁₅N₁₃O₂
CGG	C₄H₅N₃O₁	C₅H₅N₅O₁	C₅H₅N₅O₁	C₁₄H₁₅N₁₃O₃
CGG	C₄H₅N₃O₁	C₅H₅N₅O₁	C₅H₅N₅O₁	C₁₄H₁₅N₁₃O₃
AGA	C₅H₅N₅	C₅H₅N₅O₁	C₅H₅N₅	C₁₅H₁₅N₁₅O₁
AGA	C₅H₅N₅	C₅H₅N₅O₁	C₅H₅N₅	C₁₅H₁₅N₁₅O₁
AGG	C₅H₅N₅	C₅H₅N₅O₁	C₅H₅N₅O₁	C₁₅H₁₅N₁₅O₂
AGG	C₅H₅N₅	C₅H₅N₅O₁	C₅H₅N₅O₁	C₁₅H₁₅N₁₅O₂
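
The combined column in the table above is the element-by-element sum of the three per-letter formulas. The following is a hedged sketch of that sum, not the software that produced the table; the names `BASE_ATOMS`, `codon_atom_count`, and `format_formula` are assumptions for illustration, and the formulas are copied from the table rows.

```python
from collections import Counter

# Per-letter formulas exactly as they appear in the table columns.
BASE_ATOMS = {
    "A": Counter(C=5, H=5, N=5),
    "C": Counter(C=4, H=5, N=3, O=1),
    "G": Counter(C=5, H=5, N=5, O=1),
    "T": Counter(C=5, H=6, N=2, O=2),
    "U": Counter(C=4, H=4, N=2, O=2),
}

SUBSCRIPTS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")

def codon_atom_count(codon):
    """Add up the atoms of the three letters in a codon."""
    total = Counter()
    for base in codon:
        total.update(BASE_ATOMS[base])
    return total

def format_formula(counts):
    """Write the counts in the table's style, always showing C, H, N and O."""
    parts = [f"{element}{counts.get(element, 0)}" for element in "CHNO"]
    return "".join(parts).translate(SUBSCRIPTS)

for codon in ("ATT", "AUU", "AAA"):
    print(codon, format_formula(codon_atom_count(codon)))
# ATT C₁₅H₁₇N₉O₄
# AUU C₁₃H₁₃N₉O₄
# AAA C₁₅H₁₅N₁₅O₀
```

Always printing all four elements is a choice made here so that the output matches rows such as AAA, where the table shows an explicit O₀.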

Appendix QB5 Mut

This is an appendix to: Quantum Biology - 5

Mutant Codon-Molecule Atom Counts
(CM000663.2, Homo sapiens, Chromosome 1)
Codon	Base 1	Base 2	Base 3	Combined Atom Count
GGU	C₅H₅N₅O₁	C₅H₅N₅O₁	C₄H₄N₂O₂	C₁₄H₁₄N₁₂O₄
CCC	C₄H₅N₃O₁	C₄H₅N₃O₁	C₄H₅N₃O₁	C₁₂H₁₅N₉O₃
CAG	C₄H₅N₃O₁	C₅H₅N₅	C₅H₅N₅O₁	C₁₄H₁₅N₁₃O₂
GCC	C₅H₅N₅O₁	C₄H₅N₃O₁	C₄H₅N₃O₁	C₁₃H₁₅N₁₁O₃
GGC	C₅H₅N₅O₁	C₅H₅N₅O₁	C₄H₅N₃O₁	C₁₄H₁₅N₁₃O₃
CAA	C₄H₅N₃O₁	C₅H₅N₅	C₅H₅N₅	C₁₄H₁₅N₁₃O₁
GAA	C₅H₅N₅O₁	C₅H₅N₅	C₅H₅N₅	C₁₅H₁₅N₁₅O₁
UGG	C₄H₄N₂O₂	C₅H₅N₅O₁	C₅H₅N₅O₁	C₁₄H₁₄N₁₂O₄
ACC	C₅H₅N₅	C₄H₅N₃O₁	C₄H₅N₃O₁	C₁₃H₁₅N₁₁O₂
UGU	C₄H₄N₂O₂	C₅H₅N₅O₁	C₄H₄N₂O₂	C₁₃H₁₃N₉O₅
GCA	C₅H₅N₅O₁	C₄H₅N₃O₁	C₅H₅N₅	C₁₄H₁₅N₁₃O₂
CCA	C₄H₅N₃O₁	C₄H₅N₃O₁	C₅H₅N₅	C₁₃H₁₅N₁₁O₂
GCU	C₅H₅N₅O₁	C₄H₅N₃O₁	C₄H₄N₂O₂	C₁₃H₁₄N₁₀O₄
GUC	C₅H₅N₅O₁	C₄H₄N₂O₂	C₄H₅N₃O₁	C₁₃H₁₄N₁₀O₄
GAC	C₅H₅N₅O₁	C₅H₅N₅	C₄H₅N₃O₁	C₁₄H₁₅N₁₃O₂
GGG	C₅H₅N₅O₁	C₅H₅N₅O₁	C₅H₅N₅O₁	C₁₅H₁₅N₁₅O₃
GAG	C₅H₅N₅O₁	C₅H₅N₅	C₅H₅N₅O₁	C₁₅H₁₅N₁₅O₂
AAG	C₅H₅N₅	C₅H₅N₅	C₅H₅N₅O₁	C₁₅H₁₅N₁₅O₁
GGA	C₅H₅N₅O₁	C₅H₅N₅O₁	C₅H₅N₅	C₁₅H₁₅N₁₅O₂
UGC	C₄H₄N₂O₂	C₅H₅N₅O₁	C₄H₅N₃O₁	C₁₃H₁₄N₁₀O₄
CCU	C₄H₅N₃O₁	C₄H₅N₃O₁	C₄H₄N₂O₂	C₁₂H₁₄N₈O₄
GAU	C₅H₅N₅O₁	C₅H₅N₅	C₄H₄N₂O₂	C₁₄H₁₄N₁₂O₃
ACA	C₅H₅N₅	C₄H₅N₃O₁	C₅H₅N₅	C₁₄H₁₅N₁₃O₁
GUG	C₅H₅N₅O₁	C₄H₄N₂O₂	C₅H₅N₅O₁	C₁₄H₁₄N₁₂O₄
AAA	C₅H₅N₅	C₅H₅N₅	C₅H₅N₅	C₁₅H₁₅N₁₅O₀
AUU	C₅H₅N₅	C₄H₄N₂O₂	C₄H₄N₂O₂	C₁₃H₁₃N₉O₄
CCG	C₄H₅N₃O₁	C₄H₅N₃O₁	C₅H₅N₅O₁	C₁₃H₁₅N₁₁O₃
AUC	C₅H₅N₅	C₄H₄N₂O₂	C₄H₅N₃O₁	C₁₃H₁₄N₁₀O₃
UUC	C₄H₄N₂O₂	C₄H₄N₂O₂	C₄H₅N₃O₁	C₁₂H₁₃N₇O₅
GCG	C₅H₅N₅O₁	C₄H₅N₃O₁	C₅H₅N₅O₁	C₁₄H₁₅N₁₃O₃
ACU	C₅H₅N₅	C₄H₅N₃O₁	C₄H₄N₂O₂	C₁₃H₁₄N₁₀O₃
ACG	C₅H₅N₅	C₄H₅N₃O₁	C₅H₅N₅O₁	C₁₄H₁₅N₁₃O₂
UUU	C₄H₄N₂O₂	C₄H₄N₂O₂	C₄H₄N₂O₂	C₁₂H₁₂N₆O₆

Tuesday, April 12, 2022

It's In The GenBank - 5

Fig. 1 The Real RNA

I have been wondering why GenBank's "FASTA" format is preferred over the GBFF format.

I recently decided to use FASTA as the main format for translating DNA segments into mRNA, codons, and amino acids.

Among other things, the GBFF format has errors in its references to genes (e.g. "gene 687..3158"), mRNA, and CDS.

Many of those indicators don't match up with the results when using the technique in the video I displayed below.

That very informative video displayed at the end of this post indicates how translation is done by hand.

Translating by hand is OK for a few lines of DNA, but there are an incredible number of lines in human chromosomes.

It would take an inordinate amount of time to translate them by hand (for example, the GenBank FASTA data for 22 human chromosomes is 2.8 gigabytes in size; the GBFF is even larger).
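
For files of that size, reading line by line is the usual approach. The sketch below is a minimal FASTA reader under the common convention that a record starts with a ">" header line followed by sequence lines. The function name `read_fasta` and the sample record are hypothetical; this is not the software discussed in this post.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)

# Hypothetical sample record, used only to show the call.
sample = [
    ">example_record hypothetical header",
    "ATGCATCTT",
    "cactcc",
]
records = list(read_fasta(sample))
print(records)
# [('example_record hypothetical header', 'ATGCATCTTCACTCC')]
```

With a real file, an open file handle can be passed in place of `sample`, since a file object is also an iterable of lines. Note that this sketch still holds one whole record in memory at a time.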

So, I engineered some software to prepare for "the ribosome job":

"Within all cells, the translation machinery resides within a specialized organelle called the ribosome. In eukaryotes, mature mRNA molecules must leave the nucleus and travel to the cytoplasm, where the ribosomes are located. On the other hand, in prokaryotic organisms, ribosomes can attach to mRNA while it is still being transcribed. In this situation, translation begins at the 5' end of the mRNA while the 3' end is still attached to DNA.

In all types of cells, the ribosome is composed of two subunits: the large (50S) subunit and the small (30S) subunit (S, for svedberg unit, is a measure of sedimentation velocity and, therefore, mass). Each subunit exists separately in the cytoplasm, but the two join together on the mRNA molecule. The ribosomal subunits contain proteins and specialized RNA molecules—specifically, ribosomal RNA (rRNA) and transfer RNA (tRNA). The tRNA molecules are adaptor molecules—they have one end that can read the triplet code in the mRNA through complementary base-pairing, and another end that attaches to a specific amino acid (Chapeville et al., 1962; Grunberger et al., 1969). The idea that tRNA was an adaptor molecule was first proposed by Francis Crick, co-discoverer of DNA structure, who did much of the key work in deciphering the genetic code (Crick, 1958).

Fig. 2 Genetic Code Perspective

Within the ribosome, the mRNA and aminoacyl-tRNA complexes are held together closely, which facilitates base-pairing. The rRNA catalyzes the attachment of each new amino acid to the growing chain."

(Nature, Translation: DNA to mRNA to Protein). I will point out one atomic aspect of the process later on.

It has to do with thymine being converted to uracil during the process:

"Like DNA, RNA is a linear polymer made of four different types of nucleotide subunits linked together by phosphodiester bonds (Figure 6-4). It differs from DNA chemically in two respects: (1) the nucleotides in RNA are ribonucleotides—that is, they contain the sugar ribose (hence the name ribonucleic acid) rather than deoxyribose; (2) although, like DNA, RNA contains the bases adenine (A), guanine (G), and cytosine (C), it contains the base uracil (U) instead of the thymine (T) in DNA. Since U, like T, can base-pair by hydrogen-bonding with A (Figure 6-5), the complementary base-pairing properties described for DNA in Chapters 4 and 5 apply also to RNA (in RNA, G pairs with C, and A pairs with U). It is not uncommon, however, to find other types of base pairs in RNA: for example, G pairing with U occasionally."

(Molecular Biology of the Cell. 4th edition). As I have written previously, the GenBank nomenclature does not distinguish between thymine ("T") and uracil ("U") (It's In The GenBank - 4).

I added a feature to the software I engineered: it outputs the atomic nomenclature of the codons.

Anyway, the next step is converting the 'five-prime' (5') and 'three-prime' (3') segments (strands) into mRNA-format codons ("U" for uracil, see Fig. 1 above).
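
A hedged sketch of one way such a conversion can be written follows. It assumes the textbook letter rules: the mRNA reads like the coding strand with "U" in place of "T", or equivalently like the base-paired complement of the template strand. The function names are assumptions for illustration, and the post does not say that its software works this way.

```python
# Pairing rule for reading an mRNA off a DNA template strand: A->U, T->A, G->C, C->G.
TEMPLATE_TO_MRNA = str.maketrans("ATGC", "UACG")

def mrna_from_coding(coding_strand):
    """Same letters as the coding strand (5'->3'), with U in place of T."""
    return coding_strand.upper().replace("T", "U")

def mrna_from_template(template_3_to_5):
    """Pair each template base (read 3'->5') with its RNA complement."""
    return template_3_to_5.upper().translate(TEMPLATE_TO_MRNA)

def split_codons(mrna):
    """Cut the mRNA into triplets, dropping an incomplete tail."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]

coding = "ATGCATCTTCACTCC"    # 5'->3', illustrative input
template = "TACGTAGAAGTGAGG"  # 3'->5', complement of the line above

print(split_codons(mrna_from_coding(coding)))
# ['AUG', 'CAU', 'CUU', 'CAC', 'UCC']
print(mrna_from_template(template) == mrna_from_coding(coding))
# True
```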

Those codons can then be used to determine the amino acid using the Genetic Code (see Fig. 2).
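
The lookup itself can be sketched with the standard genetic code written as a 64-letter string in UCAG order, with "*" marking a stop codon. The names `GENETIC_CODE` and `translate` are assumptions for illustration; the codons in the example are the first eleven from the chromosome 1 excerpt in this post.

```python
from itertools import product

BASES = "UCAG"
# One letter per codon, ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

GENETIC_CODE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO)
}

def translate(codons):
    """Map each mRNA codon to its one-letter amino acid ('*' for stop)."""
    return "".join(GENETIC_CODE[codon] for codon in codons)

codons = "AUG,CAU,CUU,CAC,UCC,CUC,AGA,AGG,AGG,CUC,AGA".split(",")
print(translate(codons))  # MHLHSLRRRLR
```

The result matches the first line of the "amino acids" section in the excerpt.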

As I said, I added the atomic nomenclature.

Here is an example from the software's analysis of chromosome 1:

output-files/chr1_fna.html

processing: >CM000663.2 Homo sapiens chromosome 1:
GRCh38 reference primary assembly

strand_53 ok, strand_35 ok, mRNAstrand ok

valid codons (codons in the Genetic Code):

AUG,CAU,CUU,CAC,UCC,CUC,AGA,AGG,AGG,CUC,
AGA,UGA,AGG,CUG,UCA,AGU,UAG,AGC,CUG,UCC,
AGC,AGA,CUG,AGA,AGA,UGA,CUG,CUC,CUA,CUG,
UUG,AGG,CUG,UAA,
AUG,CAU,CUU,CAC,UCC,CUC,AGA,AGG,AGG,CUC,

AGA,AGG,CUG,UCA,AGU,AGC,CUG,UCC,AGC,AGA,

CUG,AGA,AGA,CUG,CUC,CUA,CUG,UUG,AGG,CUG,
AUG,CUG,AGG,AGG,AGA,AGU,AGU,CGU,AGA,AAU,
AUG,CAU,UGA,CUU,AGG,CUC,UCU,CAU,AGG,AGG,
CUU,CUC,CUG,CAC,CUU,CUU,AGA,AGC,CAU,CUG,
CUA,CUG,CUA,UAA,
AUG,CUG,AGG,AGG,AGA,AGU,AGU,CGU,AGA,AAU,
AUG,CAU,CUU,AGG,CUC,UCU,CAU,AGG,AGG,CUU,

CUC,CUG,CAC,CUU,CUU,AGA,AGC,CAU,CUG,CUA,

CUG,CUA,
AUG,CUG,UUG,UCA,UCU,UCC,UGA,UUG,CUC,
AUG,CUG,UUG,UCA,UCU,UCC,UUG,CUC,

amino acids:

MHLHSLRRRLR
RLSS
SLSSRLRR
LLLLLRL
MLRRRSSRRNMH
LRLSHRRLLLHLLRSHLLLL
MLLSSS
LL

molecule content (no duplicates):

(MET) C₅H₁₁NO₂S, (HIS) C₆H₉N₃O₂, (SER) C₃H₇NO₃,

(ARG) C₆H₁₄N₄O₂, (ASN) C₄H₈N₂O₃, (LEU) C₆H₁₃NO₂,

(TYR) C₉H₁₁NO₃

codon mutations (no duplicates):

GGU,CCC,CAG,GCC,GGC,CAA,GAA,UGG,ACC,UGU,
GCA,CCA,GCU,GUC,GAC,GGG,GAG,AAG,GGA,UGC,
CCU,GAU,ACA,GUG,AAA,AUU,CCG,AUC,UUC,GCG,

ACU,ACG,UUU,

(Excerpts from chromosome 1). The molecular descriptions in the "molecule content" section are in red letters and digits; the abbreviations mean: MET="Methionine", HIS="Histidine", SER="Serine", ARG="Arginine", ASN="Asparagine", LEU="Leucine", and TYR="Tyrosine".
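
A "no duplicates" listing like the "molecule content" section can be sketched as follows. The table holds only the seven amino acids and formulas named in the excerpt; the names `AMINO_ACID_FORMULAS` and `molecule_content` are assumptions for illustration, and letters outside the table are simply skipped.

```python
# One-letter code -> (abbreviation, formula), limited to the excerpt's entries.
AMINO_ACID_FORMULAS = {
    "M": ("MET", "C₅H₁₁NO₂S"),
    "H": ("HIS", "C₆H₉N₃O₂"),
    "S": ("SER", "C₃H₇NO₃"),
    "R": ("ARG", "C₆H₁₄N₄O₂"),
    "N": ("ASN", "C₄H₈N₂O₃"),
    "L": ("LEU", "C₆H₁₃NO₂"),
    "Y": ("TYR", "C₉H₁₁NO₃"),
}

def molecule_content(peptide):
    """List each amino acid's formula once, in order of first appearance."""
    seen = {}
    for letter in peptide:
        if letter in AMINO_ACID_FORMULAS and letter not in seen:
            seen[letter] = AMINO_ACID_FORMULAS[letter]
    return [f"({name}) {formula}" for name, formula in seen.values()]

print(molecule_content("MHLHSLRRRLR"))
# ['(MET) C₅H₁₁NO₂S', '(HIS) C₆H₉N₃O₂', '(LEU) C₆H₁₃NO₂', '(SER) C₃H₇NO₃', '(ARG) C₆H₁₄N₄O₂']
```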

In the "valid codons" section the start and stop codons are in bold letters.

Notice that the software I engineered separates the mutant codons and places them into the "codon mutations" section of the analysis (only one copy of each mutant is displayed).
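
The separation step can be sketched in a few lines. The post does not state the rule the software uses to decide which codons go into which section, so the set passed in below is purely hypothetical, as are the names `partition_codons`, `accepted`, and `set_aside`. The sketch only shows the mechanics: keep one list as-is and reduce the other to one copy of each codon, in order of first appearance.

```python
def partition_codons(codons, accepted_set):
    """Split codons into those in accepted_set and a de-duplicated remainder."""
    accepted = [codon for codon in codons if codon in accepted_set]
    # dict.fromkeys keeps the first occurrence of each key, in order.
    set_aside = list(
        dict.fromkeys(codon for codon in codons if codon not in accepted_set)
    )
    return accepted, set_aside

# Hypothetical inputs, chosen only to show the call.
hypothetical_accepted = {"AUG", "CAU", "CUU"}
stream = ["AUG", "GGU", "CAU", "GGU", "CCC", "CUU"]

accepted, set_aside = partition_codons(stream, hypothetical_accepted)
print(accepted)   # ['AUG', 'CAU', 'CUU']
print(set_aside)  # ['GGU', 'CCC']
```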

Proton tunneling has been a suspect in mutant phenomena for some time, so the "codon mutations" section is not useless (The Doll As Metaphor - 4).

More discussion of that and other issues will be forthcoming in future posts in this Dredd Blog series.