All quiet on the western front
You've read or heard the phrase "don't blame the victim," I suppose.
So, let's not blame GenBank for what researchers put in it (It's In The GenBank, 2, 3, 4, 5, 6, 7, 8).
But, is there an exception in the case of Tmesipteris oblanceolata (U30836.1)?
That organism was mentioned here on Dredd Blog in "It's In The GenBank - 8" about a year ago on June 1, 2024.
That June 1st post featured a Nature article published on 31 May 2024, which indicated that for only $32.99 you could read about "Biggest genome ever found belongs to this odd little plant".
Wikipedia indicates this about Tmesipteris oblanceolata:
"On 31 May 2024, Tmesipteris oblanceolata was reported to have been found to contain the largest known eukaryotic genome, with 160 billion base pairs, by comparison more than 50 times larger than the human genome." (Wikipedia).
Today's appendices (Fasta, Gbff) contain the FASTA and the GBFF sequences that were and are in GenBank concerning Tmesipteris oblanceolata.
I opened the FASTA version in a text editor and ran searches, which produced the following counts:
t_count = 343
a_count = 321
c_count = 225
g_count = 297
t-a count = 83
a-t count = 102
c-g count = 33
g-c count = 48
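
For readers who would rather script the tally than search by hand, here is a minimal Python sketch of the same kind of count. It rests on assumptions that are mine, not the post's: that "t-a count" means occurrences of the adjacent two-letter string `ta` (and likewise for the other pairs), and that header lines beginning with `>` are skipped. The function name `count_bases_and_pairs` and the toy input are hypothetical.

```python
def count_bases_and_pairs(fasta_text):
    """Tally single bases and selected adjacent pairs in FASTA text."""
    # Drop header lines (they start with ">") and join the sequence lines,
    # so pairs that straddle a line break are counted too. A plain
    # text-editor search would normally miss those, so totals can differ.
    sequence = "".join(
        line.strip()
        for line in fasta_text.splitlines()
        if line.strip() and not line.startswith(">")
    ).lower()

    # Single-base counts, in the same order as the list above.
    counts = {base: sequence.count(base) for base in "tacg"}

    # None of these two-letter patterns can overlap itself, so str.count
    # (which counts non-overlapping matches) finds every occurrence.
    for pair in ("ta", "at", "cg", "gc"):
        counts[pair] = sequence.count(pair)
    return counts


# Toy input for illustration only; this is not the GenBank record.
toy_fasta = ">toy example\nATTA\nCGGC\n"
print(count_bases_and_pairs(toy_fasta))
```

Because the sketch joins the sequence lines before counting, its pair totals may come out slightly higher than a text-editor search that stops at each line break.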
As was stated in a related series, one can get a look and feel for the accuracy of hundreds of DNA sequences with simple-to-operate and well-known software like a word processor or text editor (Genetic Constants In DNA and RNA, 2, 3, 4, 5).
IMO the main fault with inaccurate DNA sequences lies with those who collect and process the samples, not with the curators at GenBank and other public databases.
The previous post in this series is here.