Tuesday, October 24, 2023

On The Origin Of A Genetic Constant - 3

DNA Atoms

In this Dredd Blog series, many of the appendices contain HTML tables showing examples of the ~(32/35/25/6) genetic constant, which describes the percentages of Carbon, Hydrogen, Nitrogen, and Oxygen atoms in DNA (On The Origin Of A Genetic Constant, 2).

I suspect some readers might ask "but is that a large enough sample to substantiate the ~(32/35/25/6) genetic constant hypothesis?"

Good question.

So, today I present a report that shows the results of analyzing 1,052,789 GenBank genomes instead of a measly 30,000 or so in some of the previous posts of this series.

Here is the report:

GenBank Flat Files Genome Analysis Report

...after processing 100,000 genomes:
variation count @ <1.0% = 368,476
variation count @ <2.0% = 30,784
variation count @ <3.0% = 714
variation count @ <4.0% = 9
variation count @ >=4.0% = 1

...after processing 200,000 genomes:
variation count @ <1.0% = 733,172
variation count @ <2.0% = 62,030
variation count @ <3.0% = 4,717
variation count @ <4.0% = 54
variation count @ >=4.0% = 2

...after processing 300,000 genomes:
variation count @ <1.0% = 1,107,232
variation count @ <2.0% = 86,680
variation count @ <3.0% = 5,956
variation count @ <4.0% = 94
variation count @ >=4.0% = 2

...after processing 400,000 genomes:
variation count @ <1.0% = 1,484,226
variation count @ <2.0% = 108,655
variation count @ <3.0% = 6,754
variation count @ <4.0% = 304
variation count @ >=4.0% = 15

...after processing 500,000 genomes:
variation count @ <1.0% = 1,853,186
variation count @ <2.0% = 139,569
variation count @ <3.0% = 6,862
variation count @ <4.0% = 315
variation count @ >=4.0% = 15

...after processing 600,000 genomes:
variation count @ <1.0% = 2,195,693
variation count @ <2.0% = 197,025
variation count @ <3.0% = 6,888
variation count @ <4.0% = 315
variation count @ >=4.0% = 18

...after processing 700,000 genomes:
variation count @ <1.0% = 2,541,536
variation count @ <2.0% = 248,706
variation count @ <3.0% = 9,229
variation count @ <4.0% = 426
variation count @ >=4.0% = 28

...after processing 800,000 genomes:
variation count @ <1.0% = 2,924,960
variation count @ <2.0% = 265,213
variation count @ <3.0% = 9,272
variation count @ <4.0% = 429
variation count @ >=4.0% = 31

...after processing 900,000 genomes:
variation count @ <1.0% = 3,304,749
variation count @ <2.0% = 285,295
variation count @ <3.0% = 9,387
variation count @ <4.0% = 430
variation count @ >=4.0% = 32

...after processing 1,000,000 genomes:
variation count @ <1.0% = 3,699,202
variation count @ <2.0% = 290,735
variation count @ <3.0% = 9,477
variation count @ <4.0% = 438
variation count @ >=4.0% = 36

Total processed: 1,052,789 genomes!
variation count @ <1.0% = 3,903,413 (92.6949%)
variation count @ <2.0% = 297,538 (7.0657%)
variation count @ <3.0% = 9,591 (0.2278%)
variation count @ <4.0% = 454 (0.0108%)
variation count @ >=4.0% = 37 (0.0009%)

As the report shows, the software worked through the "DNA_html_tables" on my SQL Server, analyzing the percentages of Carbon, Hydrogen, Nitrogen, and Oxygen atoms in DNA genomes that had been downloaded from the GenBank FTP site.
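For readers who want to try this kind of analysis on a single sequence, here is a minimal sketch. It is not the software used for the report. It assumes the atom counts come from the standard formulas of the four free bases (adenine C5H5N5, cytosine C4H5N3O, guanine C5H5N5O, thymine C5H6N2O2), and it assumes "variation" means the absolute difference, in percentage points, between a genome's atom percentage and the constant. The function names are hypothetical. Under those assumptions an equal mix of the four bases gives roughly 32.2/35.6/25.4/6.8, close to the ~(32/35/25/6) constant.

```python
from collections import Counter

# Atom counts (C, H, N, O) per free nucleobase, from the standard formulas.
# Assumption: sugar and phosphate atoms are not counted.
BASE_ATOMS = {
    "A": (5, 5, 5, 0),  # adenine,  C5H5N5
    "C": (4, 5, 3, 1),  # cytosine, C4H5N3O
    "G": (5, 5, 5, 1),  # guanine,  C5H5N5O
    "T": (5, 6, 2, 2),  # thymine,  C5H6N2O2
}

# The ~(32/35/25/6) constant from the post, in C, H, N, O order.
REFERENCE = (32.0, 35.0, 25.0, 6.0)


def atom_percentages(sequence):
    """Return the (C, H, N, O) atom percentages for a DNA sequence."""
    base_counts = Counter(sequence.upper())
    totals = [0, 0, 0, 0]
    for base, atoms in BASE_ATOMS.items():
        for i, per_base in enumerate(atoms):
            totals[i] += base_counts[base] * per_base
    grand_total = sum(totals)
    if grand_total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return tuple(100.0 * t / grand_total for t in totals)


def variation_bucket(variation):
    """Map a variation (in percentage points) to a report-style label."""
    for limit in (1.0, 2.0, 3.0, 4.0):
        if variation < limit:
            return f"<{limit:.1f}%"
    return ">=4.0%"


def classify(sequence):
    """Return one bucket label per atom type (assumed: four per genome)."""
    percentages = atom_percentages(sequence)
    return [variation_bucket(abs(p - r)) for p, r in zip(percentages, REFERENCE)]


if __name__ == "__main__":
    demo = "ACGT" * 10  # hypothetical toy sequence, not a real genome
    print(atom_percentages(demo))
    print(classify(demo))
```

Characters other than A, C, G, and T (such as N) are simply ignored in this sketch; a real pipeline would need to decide how to treat them.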

The vast majority of variations from the ~(32/35/25/6) genetic constant in the report are under 1% ("92.6949%"), and second place is under 2% ("7.0657%"). Together those two categories total 99.7606%, which, applied to the genome count (1,052,789 x 0.997606), corresponds to about 1,050,268 genomes.
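The arithmetic can be checked with a few lines of Python. This sketch only re-derives the percentages from the final counts in the report; the variable names are mine. One observation from the numbers: the five counts sum to 4,211,033, which is about four per genome. That would be consistent with one variation being tallied per atom type, though that reading is an inference rather than something the report states.

```python
# Final variation counts copied from the report above.
counts = {
    "<1.0%": 3_903_413,
    "<2.0%": 297_538,
    "<3.0%": 9_591,
    "<4.0%": 454,
    ">=4.0%": 37,
}
genomes = 1_052_789

total = sum(counts.values())  # 4,211,033 variation counts in all

# Share of each bucket as a percentage of all variation counts.
shares = {label: 100.0 * n / total for label, n in counts.items()}
for label, share in shares.items():
    print(f"variation count @ {label} = {counts[label]:,} ({share:.4f}%)")

# Combined share of the two lowest buckets, and the genome-count equivalent.
under_two = shares["<1.0%"] + shares["<2.0%"]
print(f"under 2% combined: {under_two:.4f}%")
print(f"genome equivalent: {genomes * under_two / 100:,.0f}")
print(f"counts per genome: {total / genomes:.4f}")
```

Because the last lines use unrounded shares, the genome equivalent may differ by one from a figure computed with the rounded 99.7606%.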

Thus, in 1,052,789 genomes there is very little variation in DNA percentages in terms of the ~(32/35/25/6) genetic constant.

I consider these results to provide added support to the ~(32/35/25/6) genetic constant hypothesis.

The next post in this series is here; the previous post in this series is here.


