Saturday, July 8, 2023

On The Origin of The Containment Entity - 12

Fig. 1 Graph of Fig. 2 Values

In the previous post of this series (On The Origin of The Containment Entity - 11), sections of the human chromosomes were analyzed.

That analysis covered only the sections between a promoter and a terminator, so it used fewer codons than today's post does.

Today's Appendix CE-12 contains full analysis tables using the same Benford's Law tabulation logic.

The totals from 25 human chromosome genome files are summed up in Fig. 2 below and in Appendix CE-12 to today's post.

Further, the graph at Fig. 1 shows how closely the Fig. 2 values (black line) match the graph line of Benford's Law (red line).

In other words, the hundreds of millions of values analyzed and summed up in the HTML tables in Appendix CE-12 are shown to follow the Benford's Law graph line quite closely.

But this result is a summation of all files.

If only one chromosome file, or a random few, is used, the results vary from file to file.

Benford's Law must therefore be applied to all of them to obtain a comprehensive view of the phenomenon.

The individual chromosomes featured in the appendix are detailed at NIH's Homo sapiens (human) page, with links to each chromosome (i.e. 1-22, X, Y, and MT).


Fig. 2

Benford's Law applied to All Human Chromosomes (1-22, X, Y, and MT)
(Files processed: 25)

First Digit   1st Digit Percent   Benford Percent   Diff
1             32.51               30.10             2.41
2             13.00               17.60             4.60
3             10.73               12.50             1.77
4              8.81                9.70             0.89
5              8.69                7.90             0.79
6              7.73                6.70             1.03
7              7.49                5.80             1.69
8              5.99                5.10             0.89
9              5.04                4.60             0.44
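The Diff column in Fig. 2 can be recomputed from the other two columns. The sketch below is not the code used to build the appendix; it only copies the printed Fig. 2 values and assumes that Diff is the absolute difference between the observed and Benford percentages (the row for digit 2 suggests this, since the observed value there is the lower one). The names `FIG2_ROWS`, `diff_column`, and `mean_abs_diff` are made up for this illustration.

```python
# Values copied from Fig. 2: (first digit, observed percent, Benford percent as printed).
FIG2_ROWS = [
    (1, 32.51, 30.10),
    (2, 13.00, 17.60),
    (3, 10.73, 12.50),
    (4, 8.81, 9.70),
    (5, 8.69, 7.90),
    (6, 7.73, 6.70),
    (7, 7.49, 5.80),
    (8, 5.99, 5.10),
    (9, 5.04, 4.60),
]


def diff_column(rows):
    """Absolute difference per digit, rounded to two decimals (assumed meaning of Diff)."""
    return [round(abs(observed - benford), 2) for _, observed, benford in rows]


diffs = diff_column(FIG2_ROWS)

# One simple summary number: the average of the nine absolute differences.
mean_abs_diff = sum(diffs) / len(diffs)

for (digit, observed, benford), diff in zip(FIG2_ROWS, diffs):
    print(f"{digit}  {observed:6.2f}  {benford:6.2f}  {diff:5.2f}")
print(f"mean absolute difference: {mean_abs_diff:.2f} percentage points")
```

The mean absolute difference is only one possible way to summarize how far the black line sits from the red line; it is not a figure taken from the appendix.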

It is important for readers to remember that Benford's Law is about the single digit that begins a number (so 10, 100, and 1000 all count as first digit 1, just as 20, 200, and 2000 all count as first digit 2).

It isn't about magnitude; it is about the digits 1, 2, 3, 4, 5, 6, 7, 8, and 9 as the first digit of a large group of numbers, and the ratio patterns of their occurrences.
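For readers who like to see it spelled out, here is a minimal sketch of both ideas: pulling the first digit out of a number, and the standard Benford's Law expectation for each digit d, which is log10(1 + 1/d). The function names `first_digit` and `benford_percent` are invented for this example and are not taken from the software behind the appendix.

```python
import math


def first_digit(n):
    """Leading decimal digit of a positive integer, e.g. 2000 -> 2."""
    return int(str(abs(n))[0])


def benford_percent(d):
    """Expected share (in percent) of numbers whose first digit is d, for d in 1..9."""
    return 100 * math.log10(1 + 1 / d)


# 10, 100, and 1000 all land in the same first-digit bucket.
print([first_digit(n) for n in (10, 100, 1000, 20, 200, 2000)])

# Expected percentages for digits 1 through 9; these sum to 100.
for d in range(1, 10):
    print(d, f"{benford_percent(d):.2f}")
```

Rounded to one decimal place, these expected values agree with the Benford Percent column of Fig. 2.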

So, how does that relate to codons in the human chromosomes?

In the application today, as in the last post of this series, it is about the first digit of the location where a codon is found in that genome.

Some of the chromosomes are ~300 million nucleotides long in the files used; divided by 3, that means some of the chromosomes are ~100 million codons long (but most are smaller than that).

There are 64 different 'official' codons, so any one codon (e.g. "AAA") could be located at many different positions within that large genome.

And all 64 codons are searched for within that ~100 million codon crowd (made up of ~300 million nucleotides, 'A', 'C', 'G', or 'T', in groups of three).
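The tallying idea described above can be sketched in a few lines. This is an illustration under stated assumptions, not the program that produced Appendix CE-12: it assumes non-overlapping groups of three read from the start of the sequence, it takes a codon's location to be the 1-based nucleotide position of its first letter, and it skips any triplet containing a letter other than A, C, G, or T. The appendix software may define locations and reading frames differently. The names `first_digit_tally` and `to_percent` are hypothetical.

```python
from collections import Counter


def first_digit_tally(sequence, codon_size=3):
    """Count codon locations by the first digit of their position.

    Assumptions (not confirmed by the post): non-overlapping triplets,
    1-based nucleotide position of the codon's first letter, and only
    triplets made entirely of A, C, G, T are counted.
    """
    tally = Counter()
    seq = sequence.upper()
    for start in range(0, len(seq) - codon_size + 1, codon_size):
        codon = seq[start:start + codon_size]
        if set(codon) <= set("ACGT"):
            position = start + 1  # 1-based location of the codon
            tally[int(str(position)[0])] += 1
    return tally


def to_percent(tally):
    """Convert a first-digit tally into percentages for digits 1..9."""
    total = sum(tally.values())
    if total == 0:
        return {}
    return {d: 100 * tally[d] / total for d in range(1, 10)}


# Tiny made-up sequence, just to show the mechanics.
demo = "AAACCCGGGTTTNNNACG"
demo_tally = first_digit_tally(demo)
print(dict(demo_tally))
print(to_percent(demo_tally))
```

On a real chromosome file the same loop would run over hundreds of millions of letters, and the resulting percentages could then be set beside the Benford values.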

The HTML table for chromosome 1 ('chr1') in Appendix CE-12 indicates that "219,353,678" codon locations were observed while parsing the genome for those 64 unique codons.

Of that "219,353,678" figure, "105,741,138" were at positions within the genome whose number began with the digit '1' (e.g. positions '1', '10', '100', '1,000,000', etc.).
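As a quick arithmetic check on those two chr1 figures, the share of locations beginning with '1' can be worked out directly. The sketch below uses only the two numbers quoted above; the variable names are invented here.

```python
import math

# The two chr1 figures quoted from the Appendix CE-12 table.
CHR1_TOTAL = 219_353_678
CHR1_DIGIT_1 = 105_741_138

# Share of chr1 codon locations whose position begins with the digit 1.
chr1_digit1_percent = 100 * CHR1_DIGIT_1 / CHR1_TOTAL

# Benford's Law expectation for first digit 1.
benford_digit1_percent = 100 * math.log10(2)

print(f"chr1 first-digit-1 share: {chr1_digit1_percent:.2f}%")
print(f"Benford expectation:      {benford_digit1_percent:.2f}%")
```

For chr1 alone the share comes to roughly 48 percent, well above the Benford value of about 30 percent, which illustrates why a single file can look quite different from the all-files totals in Fig. 2.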

The overall percentages shown in Fig. 2, and graphed in Fig. 1, indicate that the location patterns were surprisingly close to Benford's Law values.

I say surprising because the total number of nucleotides in the 25 files is about three billion, which means that the total number of codons is about one billion.

That the first digit of the locations of one billion codons within the genome matches the Benford Law pattern is surprising (Benford’s Law Explained with Examples).

The next post in this series is here; the previous post in this series is here.
