They're Everywhere!
Today's appendices APNDX 1, APNDX 2, APNDX 3, and APNDX 4 contain 1,069 HTML tables.
Those tables contain numerical data gleaned during analysis of genomes of various organisms from the GenBank database.
The tables in those appendices feature a link to the GenBank data record which contains the actual nucleotide (ACGT) data and other research information.
Rarely, the list of nucleotides is not included on the initial GBFF link page. In such cases, just click the "FASTA" button at the top left corner of the GenBank page and the ACGTs will be loaded for your perusal.
The HTML tables in the appendices contain the count of each nucleotide (A, C, G, and T) individually, as well as the total of those counts.
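For readers who want to reproduce that kind of tally on their own, here is a minimal Python sketch. It is not the code used to build the appendices. It assumes the text copied from the GenBank "FASTA" view holds a single record, and the record shown is a made-up example, not a real GenBank entry.

```python
from collections import Counter


def parse_fasta(text: str) -> str:
    """Return the sequence from single-record FASTA text (header lines dropped)."""
    lines = [ln.strip() for ln in text.splitlines()]
    # Header lines start with '>'; everything else is sequence data.
    return "".join(ln for ln in lines if ln and not ln.startswith(">")).upper()


def count_nucleotides(sequence: str) -> dict:
    """Count A, C, G, and T individually, plus their total.

    Assumption: ambiguity codes such as 'N' are left out of the total.
    """
    counts = Counter(sequence)
    result = {base: counts.get(base, 0) for base in "ACGT"}
    result["total"] = sum(result[base] for base in "ACGT")
    return result


# Hypothetical record for illustration only.
EXAMPLE_FASTA = """>hypothetical_record made-up example
ACGTACGTAA
GGCCTTAANN
"""

if __name__ == "__main__":
    print(count_nucleotides(parse_fasta(EXAMPLE_FASTA)))
```

The two trailing `N` characters in the example are skipped, so the total reflects only the four named nucleotides.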
The atom counts (see Genetic Constants In DNA and RNA) are also included in those tables.
How well those counts match up to the genetic constants is highlighted.
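As a rough illustration of what an atom tally can look like, the Python sketch below sums atoms using the standard molecular formulas of the free bases (adenine C5H5N5, cytosine C4H5N3O, guanine C5H5N5O, thymine C5H6N2O2, uracil C4H4N2O2). That choice is an assumption made here for illustration. The linked Genetic Constants post may define its counts differently, and a base inside a real strand also carries a sugar and a phosphate that this sketch ignores.

```python
# Atoms per free nucleobase, from the standard molecular formulas.
# Assumption: free bases only, no sugar or phosphate backbone.
BASE_FORMULAS = {
    "A": {"C": 5, "H": 5, "N": 5, "O": 0},  # adenine  C5H5N5
    "C": {"C": 4, "H": 5, "N": 3, "O": 1},  # cytosine C4H5N3O
    "G": {"C": 5, "H": 5, "N": 5, "O": 1},  # guanine  C5H5N5O
    "T": {"C": 5, "H": 6, "N": 2, "O": 2},  # thymine  C5H6N2O2
    "U": {"C": 4, "H": 4, "N": 2, "O": 2},  # uracil   C4H4N2O2
}


def atom_totals(sequence: str) -> dict:
    """Sum carbon, hydrogen, nitrogen, and oxygen atoms over a sequence.

    Characters without a known formula (for example 'N') are skipped.
    """
    totals = {"C": 0, "H": 0, "N": 0, "O": 0}
    for base in sequence.upper():
        formula = BASE_FORMULAS.get(base)
        if formula is None:
            continue
        for element, count in formula.items():
            totals[element] += count
    totals["total"] = sum(totals[e] for e in "CHNO")
    return totals


if __name__ == "__main__":
    print(atom_totals("ACGT"))  # DNA-style letters
    print(atom_totals("ACGU"))  # RNA-style letters
```

Under this assumption the same four-letter stretch gives a different atom total depending on whether the last letter is read as thymine or uracil, which is one reason the 'T' versus 'U' distinction matters for counts like these.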
An added benefit to consider is that the importance of distinguishing DNA from RNA by proper nomenclature is emphasized (that nomenclature is so important that Dredd Blog criticized the inane practice of using a 'T' to represent both DNA thymine and RNA uracil).
That is, in previous series I have complained about bad nomenclature that does not adequately respect the difference between DNA and RNA:
"The most abundant entities in the universe could be said to be viruses (Are There 10^31 Virus Particles on Earth, or More, or Fewer?).
And most of them are RNA viruses, not DNA viruses.
One basic difference between RNA and DNA viruses is uracil vs thymine ('U' vs 'T'):
"Uracil (U) is one of four chemical bases that are part of RNA. The other three bases are adenine (A), cytosine (C), and guanine (G). In DNA, the base thymine (T) is used in place of uracil."
(NIH, uracil). What I am UGH!ing about is that in the nucleotide databases "the powers that be" do not use 'U' for uracil (example RNA virus nucleotide data) [go to 'ORIGIN' section @ bottom of the file].
Instead, they use 'T' for thymine in both RNA and DNA sequences even though "In RNA, uracil [U] binds to adenine via two hydrogen bonds. In DNA, the uracil nucleobase is replaced by thymine [T]" (see thymine & uracil links below).
I am not the only one to wonder why they do it that way (Is mRNA sequence on NCBI the actual mRNA with thymine in place of uracil?).
There is "a simple answer" why they tell us about "U" but always use "T" rather than "U": "some of their best friends are germs" (The Genetic Code).
UPDATE: Using 'U' in RNA nucleotide sequencing can be done (Direct RNA Sequencing of the Complete Influenza A Virus Genome; cf. here)."
(Some Of My Best Friends Are Germs). This, even though it is "genetics 101" to know that RNA nucleotides are ACGU and DNA nucleotides are ACGT because it matters.
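For anyone who wants to view an RNA virus record with 'U' in place of the database's 'T', here is a small Python sketch. It assumes the usual GenBank flat-file layout, in which the sequence sits between the 'ORIGIN' line and the closing '//' line as numbered rows of lowercase letters. The record text below is a made-up example, and the function names are hypothetical helpers, not part of any GenBank tool.

```python
def origin_sequence(gbff_text: str) -> str:
    """Pull the sequence letters out of the ORIGIN section of a flat-file record."""
    parts = []
    in_origin = False
    for line in gbff_text.splitlines():
        if line.startswith("ORIGIN"):
            in_origin = True
            continue
        if line.startswith("//"):
            in_origin = False
            continue
        if in_origin:
            # Drop the position numbers and spaces, keep only the letters.
            parts.append("".join(ch for ch in line if ch.isalpha()))
    return "".join(parts)


def to_rna_notation(sequence: str) -> str:
    """Rewrite 'T' as 'U' (and 't' as 'u'), leaving every other letter alone."""
    return sequence.translate(str.maketrans("Tt", "Uu"))


# Hypothetical record text for illustration only.
EXAMPLE_RECORD = """LOCUS       HYPOTHETICAL
ORIGIN
        1 acgtacgtac gtttaa
//
"""

if __name__ == "__main__":
    print(to_rna_notation(origin_sequence(EXAMPLE_RECORD)))
```

This changes only the notation. It should be applied only to records that are actually RNA, since a DNA record's 'T' really is thymine.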
Anyway, the tables in the appendices show some of the differences.
Cloudy Nomenclature, it's not just for the military anymore (Will The Military Become The Police?, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, Dredd Blog from 2010 to 2025)
The next post in this series is here; the previous post in this series is here.