Dredd Blog: Chimera Detection

Monday, August 14, 2023

Chimera Detection - 3

I. Background

The atomic foundation of every genome, whether human, animal, or virus, is quite simple: Carbon (about 32%), Hydrogen (about 35%), Nitrogen (about 25%), and Oxygen (about 6%).

This surprise discovery was announced and briefly discussed in the previous posts, which were supposed to be only about chimera (Chimera Detection, 2).

It turns out that "chimera", as I call it, can also be called "mutations", "erroneous DNA collection", and/or "mis-identification", and even "damage done" to genomes by chemicals such as antibiotics, GMO practices, mass production of animals for food, and the like (cf. Mutagenesis, Mutagen):

"Chimeric Sequences

Submitters will be contacted regarding sequences identified as chimeric. Chimeras are artifact sequences formed by two or more biological sequences incorrectly joined together. This often occurs during PCR reactions using mixed templates (i.e., uncultured environmental samples). Incomplete extensions during PCR allow subsequent PCR cycles to use a partially extended strand to bind to the template of a different, but similar, sequence. This partially extended strand then acts as a primer to extend and form a chimeric sequence. Once created, the chimeric sequence is then further amplified in subsequent cycles. The end result is a PCR artifact that does not represent a sequence that exists in nature.

Studies have estimated that as many as 30% of the sequences from mixed template environmental samples may be chimeric. While chimera formation is most common in mixed template amplifications, in practice it is also seen at lower frequency in supposedly pure cultures."

(GenBank, emphasis added). Techniques need to be developed that isolate the cause of a chimeric situation.

II. Ancient Homo Sapiens Example

We are focusing on the atoms that compose the particular genome being analyzed, so the differences that indicate a potential chimeric event show up as small percentage changes in the Tatabox values.

In the following example the Full-Genome Table values (and the type of genome: DNA or RNA) are shown first, then the 'raw' Tatabox Table values (all nucleotides, whether 'in-frame' or not), followed by the Clean Tatabox Table values (where the out-of-frame nucleotides were changed to 'N').

The example is from a Russian 'mummy' (Nature):

Full-Genome Table (DNA)
Link and genome info: FN673705.1
Homo sp. Altai complete mitochondrial genome sequence from Denisova
Altai Russia

Nucleotide count: 16,570
Atom	Atom Count	Percent
Carbon	77,685	32.3153
Hydrogen	86,965	36.1756
Nitrogen	60,175	25.0315
Oxygen	15,572	6.4776
Totals	240,397	-----
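
The Percent column is simply each atom's count divided by the total atom count. A minimal Python sketch of that arithmetic, using the counts copied from the Full-Genome Table above (the function and variable names are my own, for illustration only):

```python
# Atom counts copied from the Full-Genome Table above.
full_genome_counts = {
    "Carbon": 77_685,
    "Hydrogen": 86_965,
    "Nitrogen": 60_175,
    "Oxygen": 15_572,
}


def atom_percentages(counts):
    """Return each atom's share of the total atom count, in percent."""
    total = sum(counts.values())
    return {atom: 100.0 * n / total for atom, n in counts.items()}


percentages = atom_percentages(full_genome_counts)

# Print the rows in the same layout as the table (tab-separated).
for atom, pct in percentages.items():
    print(f"{atom}\t{full_genome_counts[atom]:,}\t{pct:.4f}")
print(f"Totals\t{sum(full_genome_counts.values()):,}")
```

The same function can be applied to the counts in the Tatabox tables.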

Tatabox Table
Nucleotide count: 7,110
Atom	Atom Count	Percent
Carbon	33,193	32.3097
Hydrogen	37,358	36.3638
Nitrogen	25,412	24.7357
Oxygen	6,771	6.5908
Totals	102,734	-----

Clean Tatabox Table
Nucleotide count: 7,104
'N' Nucleotide Count: 6
Atom	Atom Count	Percent
Carbon	33,166	32.3098
Hydrogen	37,326	36.3624
Nitrogen	25,394	24.7384
Oxygen	6,764	6.5894
Totals	102,650	-----
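
To make the "small percentage changes" between the three tables easier to see, the Percent columns can be subtracted from one another. The following Python sketch does only that subtraction; the names are hypothetical, and no cutoff for what counts as a chimeric signal is applied, because none is given in this post:

```python
# Percent columns copied from the three tables above.
full_genome = {"Carbon": 32.3153, "Hydrogen": 36.1756,
               "Nitrogen": 25.0315, "Oxygen": 6.4776}
tatabox = {"Carbon": 32.3097, "Hydrogen": 36.3638,
           "Nitrogen": 24.7357, "Oxygen": 6.5908}
clean_tatabox = {"Carbon": 32.3098, "Hydrogen": 36.3624,
                 "Nitrogen": 24.7384, "Oxygen": 6.5894}


def percent_deltas(reference, other):
    """Per-atom difference in percentage points (other minus reference)."""
    return {atom: round(other[atom] - reference[atom], 4) for atom in reference}


raw_vs_full = percent_deltas(full_genome, tatabox)
clean_vs_raw = percent_deltas(tatabox, clean_tatabox)

for atom in full_genome:
    print(f"{atom}\t{raw_vs_full[atom]:+.4f}\t{clean_vs_raw[atom]:+.4f}")
```

For this genome the Tatabox values differ from the full-genome values by less than 0.3 percentage points for every atom, and the cleaning step moves them by less than 0.003.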

III. A Few Appendices

The appendices of today's post, like the ancient Russian example above, use the same simple formats: the "Full-Genome Table", the "Tatabox Table", and the "Clean Tatabox Table" (Appendix 1, 2, 3, 4; Updated: appendix 4 added, see this).

This is an improvement over the previous posts: from now on, all the table data for a given genome is presented together instead of being spread apart, so readers don't need to search in several places.

Remember also that there is a link to the GenBank data at each Full-Genome Table area where you can access the FASTA nucleotide format (click on the FASTA link at the top of the GBFF page that first appears when you follow the appendix's link).
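
For readers who want to tally atoms from a downloaded FASTA file themselves, here is a rough Python sketch. It rests on assumptions that are not spelled out in this post: that each nucleotide is counted as its free nucleobase (adenine C5H5N5, cytosine C4H5N3O, guanine C5H5N5O, thymine C5H6N2O2, uracil C4H4N2O2), and that 'N' or any other symbol contributes no atoms. The helper names and the toy record are made up for illustration.

```python
from collections import Counter

# ASSUMPTION: atoms are tallied per free nucleobase, not per full nucleotide.
BASE_ATOMS = {
    "A": {"Carbon": 5, "Hydrogen": 5, "Nitrogen": 5, "Oxygen": 0},  # adenine
    "C": {"Carbon": 4, "Hydrogen": 5, "Nitrogen": 3, "Oxygen": 1},  # cytosine
    "G": {"Carbon": 5, "Hydrogen": 5, "Nitrogen": 5, "Oxygen": 1},  # guanine
    "T": {"Carbon": 5, "Hydrogen": 6, "Nitrogen": 2, "Oxygen": 2},  # thymine
    "U": {"Carbon": 4, "Hydrogen": 4, "Nitrogen": 2, "Oxygen": 2},  # uracil
}


def read_fasta_sequence(text):
    """Join the sequence lines of a single-record FASTA string (skip '>' header)."""
    lines = (line.strip() for line in text.splitlines())
    return "".join(l for l in lines if l and not l.startswith(">")).upper()


def count_atoms(sequence):
    """Sum atoms over all bases; symbols not in BASE_ATOMS (e.g. 'N') add nothing."""
    totals = Counter()
    for base, n in Counter(sequence).items():
        for atom, per_base in BASE_ATOMS.get(base, {}).items():
            totals[atom] += per_base * n
    return totals


# Toy record, not real GenBank data.
toy_fasta = ">toy\nACGT\nN\n"
toy_counts = count_atoms(read_fasta_sequence(toy_fasta))
print(dict(toy_counts))
```

To use it on real data, read the text of the FASTA file saved from GenBank and pass it to `read_fasta_sequence` in place of `toy_fasta`.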

Furthermore, the explanation of a "Tatabox" promoter ("TATAAA") and Tatabox terminator ("TATCTC") is clearly shown in the first video below.

In that short video the presenter at MIT also explains how the mRNA is produced from DNA in the nucleus of the cell, but more importantly, how that DNA/mRNA can change substantially when only one out-of-frame nucleotide is inserted into the DNA strand.

This means that just a sprinkling of tiny atoms can have a major genetic impact on a genome.
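
The effect of a single inserted nucleotide on the reading frame can be illustrated with a few lines of Python. This is only a toy sketch: the sequence is made up, and the function name is my own.

```python
def codons(sequence):
    """Split a sequence into consecutive triplets, dropping any incomplete tail."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


# Made-up toy sequence, not taken from any genome in this post.
original = "ATGGCCTTTAAGGGC"

# Insert a single 'A' after the fourth nucleotide.
shifted = original[:4] + "A" + original[4:]

print(codons(original))
print(codons(shifted))

# Count how many triplets differ between the two readings.
changed = sum(a != b for a, b in zip(codons(original), codons(shifted)))
print(f"{changed} of {len(codons(original))} triplets changed")
```

Only the first triplet survives the insertion; every triplet after the insertion point is read differently.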

IV. What Does It All Mean Alfie?

The importance of this surprising similarity of atomic structure should not be underestimated.

Scientists are finding out that Alfie may not really know what Alfie has been telling us (Small Brains Considered - 7).

The previous post in this series is here.