Saturday, November 11, 2023

On The Origin Of A Genetic Constant - 9

Ye olde Snail's Pace

If you haven't read the previous posts in this series (On The Origin Of A Genetic Constant, 1, 2, 3, 4, 5, 6, 7, 8), this post will serve as a primer.

To determine the genetic constants for DNA and RNA we begin with the basic Chemical Formula concept: 

"In chemistry, a chemical formula is a way of presenting information about the chemical proportions of atoms that constitute a particular chemical compound or molecule, using chemical element symbols, numbers ... [example glucose:] C6H12O6 (12 hydrogen atoms, six carbon and oxygen atoms)."

(Wikipedia, Chemical Formula). Now we apply it to the molecules of DNA and RNA:

"I. Where Did The ~(32/35/25/6) Originate?

That is a fair question, so this post will answer the question and will provide the updated exact values and where "~(32/35/25/6)" originated.

Let's start with the nucleotide, atom, and their quantities that are in BOTH DNA and RNA:

Nucleotides(ACG), atoms, and counts in both DNA and RNA:

A: (Adenine) 15 atoms (5 carbon, 5 hydrogen, 5 nitrogen, 0 oxygen)

C: (Cytosine) 13 atoms (4 carbon, 5 hydrogen, 3 nitrogen, 1 oxygen)

G: (Guanine) 16 atoms (5 carbon, 5 hydrogen, 5 nitrogen, 1 oxygen)

44 total atoms (14 carbon, 15 hydrogen, 13 nitrogen, 2 oxygen)

Percentages:
(carbon 31.8181, hydrogen 34.0909, nitrogen 29.5455, oxygen 4.5455)


Now let's add the missing ingredient ("T") needed to make the nucleotide group complete for DNA:

Additional nucleotide(T), atoms, and count (only in DNA):

T: (Thymine) 15 atoms, (5 carbon, 6 hydrogen, 2 nitrogen, 2 oxygen)
59 total atoms (19 carbon, 21 hydrogen, 15 nitrogen, 4 oxygen)

Percentages:
(carbon 32.2034, hydrogen 35.5932, nitrogen 25.4237, oxygen 6.7797)

That is the source for the DNA (ACGT) ~(32/35/25/6) genetic constant.

It is 19, 21, 15, and 4 divided by 59 (x 100.0) which determines those percentages of those atoms in the genomes of DNA."

(On The Origin Of A Genetic Constant - 5). To top it off, we note that RNA does not have "T" (thymine); instead it has "U" (uracil):

"Now let's add the missing ingredient ("U") needed to make the nucleotide group complete for RNA:

Additional nucleotide(U), atoms, and count (only in RNA):
U: (Uracil) 12 atoms, (4 carbon, 4 hydrogen, 2 nitrogen, 2 oxygen)
56 total atoms (18 carbon, 19 hydrogen, 15 nitrogen, 4 oxygen)

Percentages:
(carbon 32.1429, hydrogen 33.9286, nitrogen 26.7857, oxygen 7.1429)

That is the source for the RNA (ACGU) ~(32/33/26/7) genetic constant.

It is 18, 19, 15, and 4 divided by 56 (x 100.0) which determines those percentages of those atoms in the genomes of RNA."

(On The Origin Of A Genetic Constant - 6).
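The arithmetic in the two quoted passages can be checked with a few lines of Python. This is only a minimal sketch: the names BASE_ATOMS and atom_percentages are made up for illustration, and the atom counts are copied from the lists above.

```python
# Atom counts per base as (carbon, hydrogen, nitrogen, oxygen),
# copied from the lists quoted above.
BASE_ATOMS = {
    "A": (5, 5, 5, 0),  # Adenine, 15 atoms
    "C": (4, 5, 3, 1),  # Cytosine, 13 atoms
    "G": (5, 5, 5, 1),  # Guanine, 16 atoms
    "T": (5, 6, 2, 2),  # Thymine, 15 atoms (DNA only)
    "U": (4, 4, 2, 2),  # Uracil, 12 atoms (RNA only)
}


def atom_percentages(bases):
    """Return (C, H, N, O) percentages for one copy of each base in `bases`."""
    # Sum each atom type across the chosen bases.
    totals = [sum(BASE_ATOMS[b][i] for b in bases) for i in range(4)]
    all_atoms = sum(totals)
    # Divide each atom total by the grand total, times 100.
    return tuple(round(100.0 * t / all_atoms, 4) for t in totals)


print("DNA (ACGT):", atom_percentages("ACGT"))  # (32.2034, 35.5932, 25.4237, 6.7797)
print("RNA (ACGU):", atom_percentages("ACGU"))  # (32.1429, 33.9286, 26.7857, 7.1429)
```

Those are the same values as 19, 21, 15, and 4 divided by 59 for DNA, and 18, 19, 15, and 4 divided by 56 for RNA.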

You may be wondering where the projected constant "~(32/35/25/6)" of DNA nucleotide carbon, hydrogen, nitrogen, and oxygen atoms came from, so here are the relevant Chemical Formulas:

(Screenshots from the Wikipedia links above.)

The calculations for the atomic genetic constant require only simple arithmetic (no algebra, calculus, or other higher math is necessary).

To apply it to a GenBank single genome file (.gbff or .fasta): 

1) Load the ACGT nucleotides (the Origin section of a GenBank .gbff file, or the second line through the end of the file for a GenBank FASTA file).
2) Determine the total individual nucleotide counts of A, C, G, and T in that genome.
3) Multiply those nucleotide counts by the individual carbon ('C'), hydrogen ('H'), nitrogen ('N'), and oxygen ('O') atom counts specified in the Chemical Formula for each nucleotide type (ACGT); for example, the 'A' nucleotide count times its carbon count (5), the 'C' nucleotide count times its carbon count (4), etc.
4) Add up (sum) the 'C', 'H', 'N', and 'O' quantities for each individual atom type.
5) Add up (sum) all atoms.
6) Divide each atom category count ('C', 'H', 'N', and 'O') by the sum of all atoms, then multiply by 100 to derive the percentage of each atom type in that genome.

Example genome totals:

Atom        Atom Count    Percent
Carbon       1,071,062    32.3550
Hydrogen     1,178,966    35.6146
Nitrogen       847,036    25.5875
Oxygen         213,284     6.4429
Totals       3,310,348    ~100.00

For carbon: 1,071,062 carbon atoms divided by 3,310,348 total genome atoms = 0.32355 (times 100 = 32.3550%), and so forth for the other atom types.
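The six steps can be sketched in Python as follows. This is an illustrative sketch, not the program used for the appendices: the function names are hypothetical, the FASTA reader assumes a single-genome file whose header lines start with ">", and any character other than A, C, G, or T (such as the ambiguity code N) is simply ignored, which is an assumption on my part.

```python
from collections import Counter

ATOMS = ("C", "H", "N", "O")

# Atom counts per nucleotide type as (carbon, hydrogen, nitrogen, oxygen),
# taken from the chemical formulas earlier in this post.
BASE_ATOMS = {
    "A": (5, 5, 5, 0),
    "C": (4, 5, 3, 1),
    "G": (5, 5, 5, 1),
    "T": (5, 6, 2, 2),
}


def read_fasta_sequence(path):
    """Step 1: load the sequence lines of a FASTA file, skipping '>' headers."""
    with open(path) as fh:
        return "".join(line.strip() for line in fh if not line.startswith(">"))


def genome_atom_percentages(sequence):
    """Steps 2 to 6: return (atom totals, atom percentages) for a sequence."""
    # Step 2: count each nucleotide (assumption: other characters are ignored).
    counts = Counter(sequence.upper())
    # Steps 3 and 4: multiply nucleotide counts by atom counts, sum per atom type.
    atom_totals = {
        atom: sum(counts[base] * BASE_ATOMS[base][i] for base in BASE_ATOMS)
        for i, atom in enumerate(ATOMS)
    }
    # Step 5: sum all atoms.
    all_atoms = sum(atom_totals.values())
    if all_atoms == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    # Step 6: divide each atom total by the sum of all atoms, times 100.
    percents = {atom: 100.0 * n / all_atoms for atom, n in atom_totals.items()}
    return atom_totals, percents


# Toy usage on a four-letter "genome" with one of each nucleotide.
totals, percents = genome_atom_percentages("ACGT")
print(totals)  # {'C': 19, 'H': 21, 'N': 15, 'O': 4}
for atom in ATOMS:
    print(atom, f"{percents[atom]:.4f}")
```

For a real file, the sequence would come from something like read_fasta_sequence("genome.fasta"), where the filename is a placeholder.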

Today's appendix gives an example of how the numbers come about (Example Appendix).

If you like, enjoy this arithmetic applied to over a million genomes in the previous appendices to the previous post (Appendix Low, Appendix High, and Appendix Average).





1 comment:

  1. "DNA sample contamination is a serious problem in DNA sequencing studies and may result in systematic genotype misclassification and false positive associations. Although methods exist to detect and filter out cross-species contamination, few methods to detect within-species sample contamination are available." (Link).

    The ~(32/35/25/6) check should help.
