Friday, April 15, 2022

Quantum Biology - 5

Dredd Blog
"Parts is Parts"

I. Background

This series deals with something a lot of biologists, including microbiologists, don't like to consider (Quantum Biology, 2, 3, 4).

More specifically, this series of Dredd Blog posts deals with the DNA and RNA nomenclature used in GenBank and other genomic repositories which fail to properly distinguish between RNA and DNA to the detriment of our understanding (It's In The GenBank, 2, 3, 4, 5).

For example, in the first two minutes of the first video at the bottom of the page, the professor pooh-poohs the difference between the two (the rest of the video is worthwhile).

Today I want to criticize that "parts is parts" approach to DNA and RNA molecules, as I would if someone told me that carbon-12, carbon-13, and carbon-14 aren't different enough to consider a big deal (DOE).

Atoms matter!

II. Appendices

The appendices to today's post show mutant codons (Appendix One) and the differences in atom counts (Appendix Two) between one nucleotide of DNA (T, thymine) and the corresponding nucleotide of RNA (U, uracil).

In Appendix Two, each DNA codon line is followed on the next line by the corresponding mRNA codon (mRNA codons have "U" instead of "T").

This (Appendix Two) shows that the atom counts in DNA 'T' do not match the atom counts of RNA 'U' ('T' parts are not 'U' parts).

Considerable skill is required to move atoms out of or into a molecule, yet that happens during DNA -> mRNA transcription (see videos below).

As it were, each 'U' in an mRNA codon has one carbon atom and two hydrogen atoms fewer than the 'T' it stands in for (C4H4N2O2 versus C5H6N2O2).
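The atom-count difference between 'T' and 'U' can be checked directly from the two formulas used in Appendix Two. The following is a minimal Python sketch written for illustration; the function names `parse_formula` and `atom_difference` are made up here and are not part of the software discussed in this series.

```python
import re
from collections import Counter

# Base formulas as they appear in Appendix Two of this post.
THYMINE = "C5H6N2O2"
URACIL = "C4H4N2O2"

def parse_formula(formula):
    """Turn a formula such as 'C5H6N2O2' into a Counter of atom counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # A missing number means a single atom of that element.
        counts[element] += int(number) if number else 1
    return counts

def atom_difference(first, second):
    """Atoms in `first` minus atoms in `second`, with zero entries dropped."""
    diff = parse_formula(first)
    diff.subtract(parse_formula(second))
    return {element: n for element, n in diff.items() if n != 0}

print(atom_difference(THYMINE, URACIL))  # {'C': 1, 'H': 2}
```

The nitrogen and oxygen counts cancel out, leaving one carbon and two hydrogens as the whole difference between the two formulas.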

As you can see in those appendices, this is done in the cells of eukaryotes millions and millions of times daily.

III. Closing Comments

The common practice of some scientists these days is to play dolls with microbes (The Doll As Metaphor, 2, 3, 4).

I say that because way too many scientists support the overuse of antibiotics and other chemicals which kill so many beneficial microbes that they pollute the microbiome with chimeric fogs:

"Chimeras arise when a cell undergoes mutation. This mutation may be spontaneous or it may be induced by irradiation or treatment with chemical mutagens. If the cell which mutates is located near the crest of the apical dome, then all other cells which are produced by division from it will also be the mutated type. The result will be cells of different genotypes growing adjacent in a plant tissue, the definition of a chimera."

(Origin of Chimeras, cf. Soil Chemical and Microbiological Properties Are Changed, Some Of My Best Friends Are Germs - 2, It's In The GenBank - 4). It is not something to toy around with.

The next post in this series is here; the previous post in this series is here.




Appendix QB5 DNARNA

This is an appendix to: Quantum Biology - 5



Valid (DNA/RNA) Codon-Molecule Atom Counts
Codon  Base 1  Base 2  Base 3  Combined Atom Count
ATT C5H5N5 C5H6N2O2 C5H6N2O2 C15H17N9O4
AUU C5H5N5 C4H4N2O2 C4H4N2O2 C13H13N9O4
ATC C5H5N5 C5H6N2O2 C4H5N3O1 C14H16N10O3
AUC C5H5N5 C4H4N2O2 C4H5N3O1 C13H14N10O3
ATA C5H5N5 C5H6N2O2 C5H5N5 C15H16N12O2
AUA C5H5N5 C4H4N2O2 C5H5N5 C14H14N12O2
CTT C4H5N3O1 C5H6N2O2 C5H6N2O2 C14H17N7O5
CUU C4H5N3O1 C4H4N2O2 C4H4N2O2 C12H13N7O5
CTC C4H5N3O1 C5H6N2O2 C4H5N3O1 C13H16N8O4
CUC C4H5N3O1 C4H4N2O2 C4H5N3O1 C12H14N8O4
CTA C4H5N3O1 C5H6N2O2 C5H5N5 C14H16N10O3
CUA C4H5N3O1 C4H4N2O2 C5H5N5 C13H14N10O3
CTG C4H5N3O1 C5H6N2O2 C5H5N5O1 C14H16N10O4
CUG C4H5N3O1 C4H4N2O2 C5H5N5O1 C13H14N10O4
TTA C5H6N2O2 C5H6N2O2 C5H5N5 C15H17N9O4
UUA C4H4N2O2 C4H4N2O2 C5H5N5 C13H13N9O4
TTG C5H6N2O2 C5H6N2O2 C5H5N5O1 C15H17N9O5
UUG C4H4N2O2 C4H4N2O2 C5H5N5O1 C13H13N9O5
GTT C5H5N5O1 C5H6N2O2 C5H6N2O2 C15H17N9O5
GUU C5H5N5O1 C4H4N2O2 C4H4N2O2 C13H13N9O5
GTC C5H5N5O1 C5H6N2O2 C4H5N3O1 C14H16N10O4
GUC C5H5N5O1 C4H4N2O2 C4H5N3O1 C13H14N10O4
GTA C5H5N5O1 C5H6N2O2 C5H5N5 C15H16N12O3
GUA C5H5N5O1 C4H4N2O2 C5H5N5 C14H14N12O3
GTG C5H5N5O1 C5H6N2O2 C5H5N5O1 C15H16N12O4
GUG C5H5N5O1 C4H4N2O2 C5H5N5O1 C14H14N12O4
TTT C5H6N2O2 C5H6N2O2 C5H6N2O2 C15H18N6O6
UUU C4H4N2O2 C4H4N2O2 C4H4N2O2 C12H12N6O6
TTC C5H6N2O2 C5H6N2O2 C4H5N3O1 C14H17N7O5
UUC C4H4N2O2 C4H4N2O2 C4H5N3O1 C12H13N7O5
ATG C5H5N5 C5H6N2O2 C5H5N5O1 C15H16N12O3
AUG C5H5N5 C4H4N2O2 C5H5N5O1 C14H14N12O3
TGT C5H6N2O2 C5H5N5O1 C5H6N2O2 C15H17N9O5
UGU C4H4N2O2 C5H5N5O1 C4H4N2O2 C13H13N9O5
TGC C5H6N2O2 C5H5N5O1 C4H5N3O1 C14H16N10O4
UGC C4H4N2O2 C5H5N5O1 C4H5N3O1 C13H14N10O4
GCT C5H5N5O1 C4H5N3O1 C5H6N2O2 C14H16N10O4
GCU C5H5N5O1 C4H5N3O1 C4H4N2O2 C13H14N10O4
GCC C5H5N5O1 C4H5N3O1 C4H5N3O1 C13H15N11O3
GCC C5H5N5O1 C4H5N3O1 C4H5N3O1 C13H15N11O3
GCA C5H5N5O1 C4H5N3O1 C5H5N5 C14H15N13O2
GCA C5H5N5O1 C4H5N3O1 C5H5N5 C14H15N13O2
GCG C5H5N5O1 C4H5N3O1 C5H5N5O1 C14H15N13O3
GCG C5H5N5O1 C4H5N3O1 C5H5N5O1 C14H15N13O3
GGT C5H5N5O1 C5H5N5O1 C5H6N2O2 C15H16N12O4
GGU C5H5N5O1 C5H5N5O1 C4H4N2O2 C14H14N12O4
GGC C5H5N5O1 C5H5N5O1 C4H5N3O1 C14H15N13O3
GGC C5H5N5O1 C5H5N5O1 C4H5N3O1 C14H15N13O3
GGA C5H5N5O1 C5H5N5O1 C5H5N5 C15H15N15O2
GGA C5H5N5O1 C5H5N5O1 C5H5N5 C15H15N15O2
GGG C5H5N5O1 C5H5N5O1 C5H5N5O1 C15H15N15O3
GGG C5H5N5O1 C5H5N5O1 C5H5N5O1 C15H15N15O3
CCT C4H5N3O1 C4H5N3O1 C5H6N2O2 C13H16N8O4
CCU C4H5N3O1 C4H5N3O1 C4H4N2O2 C12H14N8O4
CCC C4H5N3O1 C4H5N3O1 C4H5N3O1 C12H15N9O3
CCC C4H5N3O1 C4H5N3O1 C4H5N3O1 C12H15N9O3
CCA C4H5N3O1 C4H5N3O1 C5H5N5 C13H15N11O2
CCA C4H5N3O1 C4H5N3O1 C5H5N5 C13H15N11O2
CCG C4H5N3O1 C4H5N3O1 C5H5N5O1 C13H15N11O3
CCG C4H5N3O1 C4H5N3O1 C5H5N5O1 C13H15N11O3
ACT C5H5N5 C4H5N3O1 C5H6N2O2 C14H16N10O3
ACU C5H5N5 C4H5N3O1 C4H4N2O2 C13H14N10O3
ACC C5H5N5 C4H5N3O1 C4H5N3O1 C13H15N11O2
ACC C5H5N5 C4H5N3O1 C4H5N3O1 C13H15N11O2
ACA C5H5N5 C4H5N3O1 C5H5N5 C14H15N13O1
ACA C5H5N5 C4H5N3O1 C5H5N5 C14H15N13O1
ACG C5H5N5 C4H5N3O1 C5H5N5O1 C14H15N13O2
ACG C5H5N5 C4H5N3O1 C5H5N5O1 C14H15N13O2
TCT C5H6N2O2 C4H5N3O1 C5H6N2O2 C14H17N7O5
UCU C4H4N2O2 C4H5N3O1 C4H4N2O2 C12H13N7O5
TCC C5H6N2O2 C4H5N3O1 C4H5N3O1 C13H16N8O4
UCC C4H4N2O2 C4H5N3O1 C4H5N3O1 C12H14N8O4
TCA C5H6N2O2 C4H5N3O1 C5H5N5 C14H16N10O3
UCA C4H4N2O2 C4H5N3O1 C5H5N5 C13H14N10O3
TCG C5H6N2O2 C4H5N3O1 C5H5N5O1 C14H16N10O4
UCG C4H4N2O2 C4H5N3O1 C5H5N5O1 C13H14N10O4
AGT C5H5N5 C5H5N5O1 C5H6N2O2 C15H16N12O3
AGU C5H5N5 C5H5N5O1 C4H4N2O2 C14H14N12O3
AGC C5H5N5 C5H5N5O1 C4H5N3O1 C14H15N13O2
AGC C5H5N5 C5H5N5O1 C4H5N3O1 C14H15N13O2
TAT C5H6N2O2 C5H5N5 C5H6N2O2 C15H17N9O4
UAU C4H4N2O2 C5H5N5 C4H4N2O2 C13H13N9O4
TAC C5H6N2O2 C5H5N5 C4H5N3O1 C14H16N10O3
UAC C4H4N2O2 C5H5N5 C4H5N3O1 C13H14N10O3
TGG C5H6N2O2 C5H5N5O1 C5H5N5O1 C15H16N12O4
UGG C4H4N2O2 C5H5N5O1 C5H5N5O1 C14H14N12O4
CAA C4H5N3O1 C5H5N5 C5H5N5 C14H15N13O1
CAA C4H5N3O1 C5H5N5 C5H5N5 C14H15N13O1
CAG C4H5N3O1 C5H5N5 C5H5N5O1 C14H15N13O2
CAG C4H5N3O1 C5H5N5 C5H5N5O1 C14H15N13O2
AAT C5H5N5 C5H5N5 C5H6N2O2 C15H16N12O2
AAU C5H5N5 C5H5N5 C4H4N2O2 C14H14N12O2
AAC C5H5N5 C5H5N5 C4H5N3O1 C14H15N13O1
AAC C5H5N5 C5H5N5 C4H5N3O1 C14H15N13O1
CAT C4H5N3O1 C5H5N5 C5H6N2O2 C14H16N10O3
CAU C4H5N3O1 C5H5N5 C4H4N2O2 C13H14N10O3
CAC C4H5N3O1 C5H5N5 C4H5N3O1 C13H15N11O2
CAC C4H5N3O1 C5H5N5 C4H5N3O1 C13H15N11O2
GAA C5H5N5O1 C5H5N5 C5H5N5 C15H15N15O1
GAA C5H5N5O1 C5H5N5 C5H5N5 C15H15N15O1
GAG C5H5N5O1 C5H5N5 C5H5N5O1 C15H15N15O2
GAG C5H5N5O1 C5H5N5 C5H5N5O1 C15H15N15O2
GAT C5H5N5O1 C5H5N5 C5H6N2O2 C15H16N12O3
GAU C5H5N5O1 C5H5N5 C4H4N2O2 C14H14N12O3
GAC C5H5N5O1 C5H5N5 C4H5N3O1 C14H15N13O2
GAC C5H5N5O1 C5H5N5 C4H5N3O1 C14H15N13O2
AAA C5H5N5 C5H5N5 C5H5N5 C15H15N15O0
AAA C5H5N5 C5H5N5 C5H5N5 C15H15N15O0
AAG C5H5N5 C5H5N5 C5H5N5O1 C15H15N15O1
AAG C5H5N5 C5H5N5 C5H5N5O1 C15H15N15O1
CGT C4H5N3O1 C5H5N5O1 C5H6N2O2 C14H16N10O4
CGU C4H5N3O1 C5H5N5O1 C4H4N2O2 C13H14N10O4
CGC C4H5N3O1 C5H5N5O1 C4H5N3O1 C13H15N11O3
CGC C4H5N3O1 C5H5N5O1 C4H5N3O1 C13H15N11O3
CGA C4H5N3O1 C5H5N5O1 C5H5N5 C14H15N13O2
CGA C4H5N3O1 C5H5N5O1 C5H5N5 C14H15N13O2
CGG C4H5N3O1 C5H5N5O1 C5H5N5O1 C14H15N13O3
CGG C4H5N3O1 C5H5N5O1 C5H5N5O1 C14H15N13O3
AGA C5H5N5 C5H5N5O1 C5H5N5 C15H15N15O1
AGA C5H5N5 C5H5N5O1 C5H5N5 C15H15N15O1
AGG C5H5N5 C5H5N5O1 C5H5N5O1 C15H15N15O2
AGG C5H5N5 C5H5N5O1 C5H5N5O1 C15H15N15O2
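
Each "Combined Atom Count" entry above is the sum of the three base formulas on its line. A short Python sketch of that arithmetic follows; it is an illustration only, the names `BASE_FORMULAS` and `combined_formula` are invented for it, and the software behind the table may compute the totals differently.

```python
import re
from collections import Counter

# Base formulas copied from the table above.
BASE_FORMULAS = {
    "A": "C5H5N5",    # adenine
    "C": "C4H5N3O1",  # cytosine
    "G": "C5H5N5O1",  # guanine
    "T": "C5H6N2O2",  # thymine (DNA)
    "U": "C4H4N2O2",  # uracil (RNA)
}

def parse_formula(formula):
    """Turn a formula such as 'C5H6N2O2' into a Counter of atom counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts

def combined_formula(codon):
    """Add up the atoms of the three bases in a codon."""
    total = Counter()
    for base in codon:
        total.update(parse_formula(BASE_FORMULAS[base]))
    # Fixed C, H, N, O order with explicit counts, as in the table (e.g. O0).
    return "".join(f"{element}{total[element]}" for element in "CHNO")

print(combined_formula("ATT"))  # C15H17N9O4
print(combined_formula("AUU"))  # C13H13N9O4
```

Running it on any codon in the table should reproduce that line's last column, including the "O0" written for AAA.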

Appendix QB5 Mut

This is an appendix to: Quantum Biology - 5



Mutant Codon-Molecule Atom Counts
(CM000663.2, Homo sapiens, Chromosome 1)
Codon  Base 1  Base 2  Base 3  Combined Atom Count
GGU C5H5N5O1 C5H5N5O1 C4H4N2O2 C14H14N12O4
CCC C4H5N3O1 C4H5N3O1 C4H5N3O1 C12H15N9O3
CAG C4H5N3O1 C5H5N5 C5H5N5O1 C14H15N13O2
GCC C5H5N5O1 C4H5N3O1 C4H5N3O1 C13H15N11O3
GGC C5H5N5O1 C5H5N5O1 C4H5N3O1 C14H15N13O3
CAA C4H5N3O1 C5H5N5 C5H5N5 C14H15N13O1
GAA C5H5N5O1 C5H5N5 C5H5N5 C15H15N15O1
UGG C4H4N2O2 C5H5N5O1 C5H5N5O1 C14H14N12O4
ACC C5H5N5 C4H5N3O1 C4H5N3O1 C13H15N11O2
UGU C4H4N2O2 C5H5N5O1 C4H4N2O2 C13H13N9O5
GCA C5H5N5O1 C4H5N3O1 C5H5N5 C14H15N13O2
CCA C4H5N3O1 C4H5N3O1 C5H5N5 C13H15N11O2
GCU C5H5N5O1 C4H5N3O1 C4H4N2O2 C13H14N10O4
GUC C5H5N5O1 C4H4N2O2 C4H5N3O1 C13H14N10O4
GAC C5H5N5O1 C5H5N5 C4H5N3O1 C14H15N13O2
GGG C5H5N5O1 C5H5N5O1 C5H5N5O1 C15H15N15O3
GAG C5H5N5O1 C5H5N5 C5H5N5O1 C15H15N15O2
AAG C5H5N5 C5H5N5 C5H5N5O1 C15H15N15O1
GGA C5H5N5O1 C5H5N5O1 C5H5N5 C15H15N15O2
UGC C4H4N2O2 C5H5N5O1 C4H5N3O1 C13H14N10O4
CCU C4H5N3O1 C4H5N3O1 C4H4N2O2 C12H14N8O4
GAU C5H5N5O1 C5H5N5 C4H4N2O2 C14H14N12O3
ACA C5H5N5 C4H5N3O1 C5H5N5 C14H15N13O1
GUG C5H5N5O1 C4H4N2O2 C5H5N5O1 C14H14N12O4
AAA C5H5N5 C5H5N5 C5H5N5 C15H15N15O0
AUU C5H5N5 C4H4N2O2 C4H4N2O2 C13H13N9O4
CCG C4H5N3O1 C4H5N3O1 C5H5N5O1 C13H15N11O3
AUC C5H5N5 C4H4N2O2 C4H5N3O1 C13H14N10O3
UUC C4H4N2O2 C4H4N2O2 C4H5N3O1 C12H13N7O5
GCG C5H5N5O1 C4H5N3O1 C5H5N5O1 C14H15N13O3
ACU C5H5N5 C4H5N3O1 C4H4N2O2 C13H14N10O3
ACG C5H5N5 C4H5N3O1 C5H5N5O1 C14H15N13O2
UUU C4H4N2O2 C4H4N2O2 C4H4N2O2 C12H12N6O6

Tuesday, April 12, 2022

It's In The GenBank - 5

Fig. 1 The Real RNA
I have been wondering why GenBank's "FASTA" format is preferred over the GBFF format.

I recently decided to use FASTA as the main format for translating DNA segments into mRNA, codons, and amino acids.

Among other things, the GBFF format has errors in its references to genes (e.g. "gene 687..3158"), mRNA, and CDS.
 
Many of those indicators don't match up with the results obtained using the technique in the video displayed below.

That very informative video, displayed at the end of this post, shows how translation is done by hand.

Translating by hand is OK for a few lines of DNA, but there are an incredible number of lines in human chromosomes.

It would take an inordinate amount of time to translate them by hand (for example, the GenBank FASTA data for 22 human chromosomes is 2.8 gigabytes in size; the GBFF is even larger).
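
With files that large, the sequence has to be read by a program, not by eye. Here is a minimal Python sketch of a FASTA reader, assuming the usual layout of a ">" header line followed by lines of sequence letters. The function name `read_fasta` is made up for illustration, and it holds one whole record in memory at a time, which may or may not suit a full chromosome on a given machine.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file, one record at a time."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line.upper())
    # Emit the final record once the file ends.
    if header is not None:
        yield header, "".join(chunks)
```

It would be used as `for header, sequence in read_fasta(path): ...`, where `path` is whatever FASTA file was downloaded.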

So, I engineered some software to prepare for "the ribosome job":

"Within all cells, the translation machinery resides within a specialized organelle called the ribosome. In eukaryotes, mature mRNA molecules must leave the nucleus and travel to the cytoplasm, where the ribosomes are located. On the other hand, in prokaryotic organisms, ribosomes can attach to mRNA while it is still being transcribed. In this situation, translation begins at the 5' end of the mRNA while the 3' end is still attached to DNA.

In all types of cells, the ribosome is composed of two subunits: the large (50S) subunit and the small (30S) subunit (S, for svedberg unit, is a measure of sedimentation velocity and, therefore, mass). Each subunit exists separately in the cytoplasm, but the two join together on the mRNA molecule. The ribosomal subunits contain proteins and specialized RNA molecules—specifically, ribosomal RNA (rRNA) and transfer RNA (tRNA). The tRNA molecules are adaptor molecules—they have one end that can read the triplet code in the mRNA through complementary base-pairing, and another end that attaches to a specific amino acid (Chapeville et al., 1962; Grunberger et al., 1969). The idea that tRNA was an adaptor molecule was first proposed by Francis Crick, co-discoverer of DNA structure, who did much of the key work in deciphering the genetic code (Crick, 1958).

Fig. 2 Genetic Code Perspective

Within the ribosome, the mRNA and aminoacyl-tRNA complexes are held together closely, which facilitates base-pairing. The rRNA catalyzes the attachment of each new amino acid to the growing chain."

(Nature, Translation: DNA to mRNA to Protein). I will point out one atomic aspect of the process later on.

It has to do with thymine being converted to uracil during the process:

"Like DNA, RNA is a linear polymer made of four different types of nucleotide subunits linked together by phosphodiester bonds (Figure 6-4). It differs from DNA chemically in two respects: (1) the nucleotides in RNA are ribonucleotides—that is, they contain the sugar ribose (hence the name ribonucleic acid) rather than deoxyribose; (2) although, like DNA, RNA contains the bases adenine (A), guanine (G), and cytosine (C), it contains the base uracil (U) instead of the thymine (T) in DNA. Since U, like T, can base-pair by hydrogen-bonding with A (Figure 6-5), the base-pairing properties described for DNA in Chapters 4 and 5 apply also to RNA (in RNA, G pairs with C, and A pairs with U). It is not uncommon, however, to find other types of base pairs in RNA: for example, G pairing with U occasionally."

(Molecular Biology of the Cell. 4th edition). As I have written previously, the GenBank nomenclature does not distinguish between thymine ("T") and uracil ("U") (It's In The GenBank - 4). 

I added a feature to the software I engineered: it now reports the atomic nomenclature (the molecular formulas) that goes with the codons.

Anyway, the first step is to convert the 'five-prime' (5') and 'three-prime' (3') segments (strands) into mRNA format codons ("U" for uracil, see Fig. 1 above).
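
As a rough Python sketch of that conversion, assuming the mRNA is written like the DNA coding strand with "U" in place of "T" and then cut into triplets from the first letter. The function names and the input string are made up for illustration (the input is simply the first five codons of the chromosome 1 excerpt later in this post, spelled with "T"); the actual software may handle the two strands differently.

```python
def to_mrna(dna):
    """Rewrite a DNA coding-strand string in mRNA letters (T becomes U)."""
    return dna.upper().replace("T", "U")

def split_codons(sequence):
    """Cut a sequence into consecutive three-letter codons, dropping a short tail."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]

mrna = to_mrna("ATGCATCTTCACTCC")
print(split_codons(mrna))  # ['AUG', 'CAU', 'CUU', 'CAC', 'UCC']
```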

Those codons can then be used to determine the amino acid using the Genetic Code (see Fig. 2).
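
A compact way to sketch that lookup in Python is to build the standard genetic code from its usual U, C, A, G ordering and then read codons from the first AUG to the next stop codon. The start and stop handling is an assumption made for this illustration, and the names `GENETIC_CODE` and `translate` are invented here; the software described in this post may apply different rules.

```python
BASES = "UCAG"
# One-letter amino acids for all 64 codons in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

GENETIC_CODE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(codons):
    """Translate from the first AUG up to, but not including, the next stop codon."""
    protein = []
    started = False
    for codon in codons:
        if not started:
            if codon != "AUG":
                continue  # ignore everything before the start codon
            started = True
        amino = GENETIC_CODE[codon]
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)

codons = ["AUG", "CAU", "CUU", "CAC", "UCC", "CUC",
          "AGA", "AGG", "AGG", "CUC", "AGA", "UGA", "AGG"]
print(translate(codons))  # MHLHSLRRRLR
```

The sample codons are the opening of the chromosome 1 excerpt below, and the result agrees with the first line of its "amino acids" section.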

As I said, I added the atomic nomenclature.

Here is an example from the software's analysis of chromosome 1:

output-files/chr1_fna.html

processing: >CM000663.2 Homo sapiens chromosome 1:
GRCh38 reference primary assembly

strand_53 ok, strand_35 ok, mRNAstrand ok


valid codons (codons in the Genetic Code):

AUG,CAU,CUU,CAC,UCC,CUC,AGA,AGG,AGG,CUC,
AGA,UGA,AGG,CUG,UCA,AGU,UAG,AGC,CUG,UCC,
AGC,AGA,CUG,AGA,AGA,UGA,CUG,CUC,CUA,CUG,
UUG,AGG,CUG,UAA,
AUG,CAU,CUU,CAC,UCC,CUC,AGA,AGG,AGG,CUC,
AGA,AGG,CUG,UCA,AGU,AGC,CUG,UCC,AGC,AGA,
CUG,AGA,AGA,CUG,CUC,CUA,CUG,UUG,AGG,CUG,
AUG,CUG,AGG,AGG,AGA,AGU,AGU,CGU,AGA,AAU,
AUG,CAU,UGA,CUU,AGG,CUC,UCU,CAU,AGG,AGG,
CUU,CUC,CUG,CAC,CUU,CUU,AGA,AGC,CAU,CUG,
CUA,CUG,CUA,UAA,
AUG,CUG,AGG,AGG,AGA,AGU,AGU,CGU,AGA,AAU,
AUG,CAU,CUU,AGG,CUC,UCU,CAU,AGG,AGG,CUU,
CUC,CUG,CAC,CUU,CUU,AGA,AGC,CAU,CUG,CUA,
CUG,CUA,
AUG,CUG,UUG,UCA,UCU,UCC,UGA,UUG,CUC,
AUG,CUG,UUG,UCA,UCU,UCC,UUG,CUC,

amino acids:

MHLHSLRRRLR
RLSS
SLSSRLRR
LLLLLRL
MLRRRSSRRNMH
LRLSHRRLLLHLLRSHLLLL
MLLSSS
LL

molecule content (no duplicates):

(MET) C5H11NO2S, (HIS) C6H9N3O2, (SER) C3H7NO3
(ARG) C6H14N4O2, (ASN) C4H8N2O3, (LEU) C6H13NO2,
(TYR) C9H11NO3

codon mutations (no duplicates):

GGU,CCC,CAG,GCC,GGC,CAA,GAA,UGG,ACC,UGU,
GCA,CCA,GCU,GUC,GAC,GGG,GAG,AAG,GGA,UGC,
CCU,GAU,ACA,GUG,AAA,AUU,CCG,AUC,UUC,GCG,
ACU,ACG,UUU,

(Excerpts from chromosome 1). The molecular descriptions in the "molecule content" section are in red letters and digits; the abbreviations mean: (MET="Methionine", HIS="Histidine", SER="Serine", ARG="Arginine", ASN="Asparagine", LEU="Leucine", and TYR="Tyrosine").
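
A "molecule content (no duplicates)" list like the one above can be sketched in Python as a lookup from one-letter amino acid codes to the abbreviations and formulas shown in the excerpt. Only the seven amino acids listed there are included; the names `AMINO_ACID_INFO` and `molecule_content` are hypothetical, and letters outside the table are skipped as a simplification.

```python
# Abbreviations and formulas copied from the "molecule content" section above.
AMINO_ACID_INFO = {
    "M": ("MET", "C5H11NO2S"),
    "H": ("HIS", "C6H9N3O2"),
    "S": ("SER", "C3H7NO3"),
    "R": ("ARG", "C6H14N4O2"),
    "N": ("ASN", "C4H8N2O3"),
    "L": ("LEU", "C6H13NO2"),
    "Y": ("TYR", "C9H11NO3"),
}

def molecule_content(protein):
    """List each known amino acid once, in order of first appearance."""
    seen = dict.fromkeys(letter for letter in protein if letter in AMINO_ACID_INFO)
    return [f"({AMINO_ACID_INFO[letter][0]}) {AMINO_ACID_INFO[letter][1]}"
            for letter in seen]

for entry in molecule_content("MHLHSLRRRLR"):
    print(entry)
```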

In the "valid codons" section the start and stop codons are in bold letters.

Notice that the software I engineered separates the mutant codons and places them into the "codon mutations" section of the analysis (only one copy of each mutant is displayed).
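
Showing only one copy of each codon, in the order first seen, takes very little code. The Python sketch below covers just that de-duplication step; it does not try to decide which codons count as mutant, since that rule is not spelled out here. The function name and the sample list are made up for illustration.

```python
def unique_in_order(codons):
    """Keep the first occurrence of each codon and preserve the original order."""
    # dict keys are unique and remember insertion order.
    return list(dict.fromkeys(codons))

print(unique_in_order(["GGU", "CCC", "GGU", "CAG", "CCC"]))  # ['GGU', 'CCC', 'CAG']
```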

Proton tunneling has been a suspect in mutant phenomena for some time, so the "codon mutations" section is not useless (The Doll As Metaphor - 4).

More discussion of that and other issues will be forthcoming in future posts in this Dredd Blog series.

The next post in this series is here; the previous post in this series is here.


Translation and transcription video: