Fig. 1 A World of Measurements
I. Introduction
The National Oceanic and Atmospheric Administration (NOAA) website for the World Ocean Database (WOD) has a page where in situ measurements of ocean temperature at various depths (surface to bottom) can be downloaded en masse (Fig. 1).
As you can see, that WOD selection map contains 648 'Zones' covering the entire globe, which for the purposes of this post in this series are considered 'containment entities'.
Although those WOD Zones appear to be of uniform size, that is an optical illusion, because their boundaries are determined by 'imaginary' lines of latitude and longitude.
Fig. 2 Unseen but real containment entity curve balls
The curvature of the Earth means that those Zones enclose differing areas (square kilometers).
Ocean depth, surface to bottom, also varies from Zone to Zone.
Thus, the volume of seawater in those Zones is not uniform either.
Nevertheless, this imaginary containment entity is useful in many ways, even though its lines are man-made.
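To put rough numbers on how much Zone area varies, here is a minimal sketch. It assumes the Zones are 10° x 10° latitude/longitude squares (648 = 36 x 18) and treats the Earth as a sphere with a mean radius of 6371 km. Both are my assumptions for illustration, and the function name is hypothetical, not anything from the WOD itself.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; spherical approximation


def zone_area_km2(lat_south, lat_north, lon_width_deg=10.0):
    """Area of a latitude/longitude cell on a sphere, in square kilometers."""
    # Area = R^2 * (longitude width in radians) * (sin(north) - sin(south))
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    return EARTH_RADIUS_KM ** 2 * math.radians(lon_width_deg) * band


# One entry per 10-degree latitude band, keyed by the band's southern edge.
areas = {lat: zone_area_km2(lat, lat + 10) for lat in range(-90, 90, 10)}

for lat, area in areas.items():
    print(f"{lat:+4d} to {lat + 10:+4d}: {area:12,.0f} km^2")
```

Under those assumptions a Zone touching the equator covers about 1.23 million square kilometers, while one touching a pole covers about 0.11 million, a difference of more than ten times.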
Likewise, there are unseen containment entities that are not imaginary; they actually alter what we see or don't see, because space, time, and other physical properties evince some surprising containment phenomena (Fig. 2).
Sometimes we run across these ratios, numerical patterns, and similar portions of containment entities while looking for something else.
One example is "Benford's Law" (Benford's law, Project Euclid-Significant-Digit Law, Wikipedia, PDF), of which it is said "there hasn’t yet been a complete answer to the question of why it works." (Statistics).
Regular readers know that Dredd Blog posts have covered the WOD mentioned in Section I above (e.g. World Ocean Database Project, 2, 3; Databases Galore, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24).
Today, I thought I would use some WOD data and see what would happen if Benford's Law were applied.
Fig. 3, Fig. 4, and Fig. 5 show what happens when the formulas concerned with Benford's Law are applied.
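For readers who want the formula itself: Benford's Law says the expected share of values whose first significant digit is d equals log10(1 + 1/d). The sketch below is my own illustration (the function name is hypothetical), not the code used to produce the figures.

```python
import math


def benford_percent(digit):
    """Expected percentage of values whose first significant digit is `digit`."""
    return 100.0 * math.log10(1.0 + 1.0 / digit)


expected = {d: benford_percent(d) for d in range(1, 10)}

for d, pct in expected.items():
    print(f"{d} | {pct:.2f}")
```

The nine percentages sum to 100, and rounding them to one decimal place gives the familiar 30.1, 17.6, 12.5, 9.7, 7.9, 6.7, 5.8, 5.1, and 4.6.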
I used the World Ocean Database 2018 Updates page, at the "Updated Data (updated files are data changed or replaced since the release of WOD18)" section.
The data had to be decompressed and converted into .CSV files with the following structure: "year, latitude, longitude, temperature, salinity, depth".
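Tallying first digits from files with that structure might look something like the sketch below. It assumes the .CSV files have no header row, and the three sample rows are made up purely for illustration; the function names are hypothetical and this is not the program that produced the figures.

```python
import csv
import io
from collections import Counter

# Column order taken from the structure described above.
COLUMNS = ["year", "latitude", "longitude", "temperature", "salinity", "depth"]


def first_digit(text):
    """Return the first non-zero digit in `text`, or None if there is none."""
    for ch in text:
        if ch in "123456789":
            return int(ch)
    return None  # e.g. "0", "0.0", or an empty field


def tally_first_digits(lines, fields):
    """Count first significant digits over the chosen columns of CSV lines."""
    counts = Counter()
    reader = csv.DictReader(lines, fieldnames=COLUMNS, skipinitialspace=True)
    for row in reader:
        for name in fields:
            digit = first_digit(row[name] or "")
            if digit is not None:
                counts[digit] += 1
    return counts


# Made-up rows, for illustration only (not real WOD measurements).
SAMPLE = io.StringIO(
    "1995, -12.5, 140.25, 27.31, 35.1, 0\n"
    "1995, -12.5, 140.25, 18.02, 35.0, 150\n"
    "2003, 61.0, -20.5, 0.85, 34.9, 1200\n"
)

temperature_counts = tally_first_digits(SAMPLE, ["temperature"])
print(sorted(temperature_counts.items()))
```

Passing a real open file instead of `SAMPLE`, and a longer list of column names, would extend the tally to several fields at once. Note that a value of exactly zero has no leading digit, so this sketch skips it; how such values were handled in the figures is not something I can infer from the counts alone.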
In Fig. 3 only the raw in situ ocean temperature values (853,035,512 values) were exposed to the workings of Benford's law.
In Fig. 4 the latitude, longitude, temperature, and depth values (3,408,246,882 values) were exposed to the workings of Benford's law.
In Fig. 5 only the means of in situ temperature measurements (650,818 mean values) from my SQL database were exposed to the workings of Benford's law.
Fig. 3
First Digit | 1st Digit Count | 1st Digit Percent | Benford Percent | Difference |
1 | 242,003,505 | 28.37 | 30.10 | 1.73 |
2 | 193,046,802 | 22.63 | 17.60 | 5.03 |
3 | 107,882,543 | 12.65 | 12.50 | 0.15 |
4 | 86,610,003 | 10.15 | 9.70 | 0.45 |
5 | 62,159,395 | 7.29 | 7.90 | 0.61 |
6 | 48,053,194 | 5.63 | 6.70 | 1.07 |
7 | 41,661,659 | 4.88 | 5.80 | 0.92 |
8 | 37,271,293 | 4.37 | 5.10 | 0.73 |
9 | 34,347,118 | 4.03 | 4.60 | 0.57 |
Fig. 4
First Digit | 1st Digit Count | 1st Digit Percent | Benford Percent | Difference |
1 | 1,163,255,912 | 34.13 | 30.10 | 4.03 |
2 | 485,975,325 | 14.26 | 17.60 | 3.34 |
3 | 416,943,028 | 12.23 | 12.50 | 0.27 |
4 | 355,662,387 | 10.44 | 9.70 | 0.74 |
5 | 272,754,350 | 8.00 | 7.90 | 0.10 |
6 | 220,441,291 | 6.47 | 6.70 | 0.23 |
7 | 188,956,118 | 5.54 | 5.80 | 0.26 |
8 | 159,752,150 | 4.69 | 5.10 | 0.41 |
9 | 144,506,321 | 4.24 | 4.60 | 0.36 |
Fig. 5
First Digit | 1st Digit Count | 1st Digit Percent | Benford Percent | Difference |
1 | 220,087 | 33.82 | 30.10 | 3.72 |
2 | 144,821 | 22.25 | 17.60 | 4.65 |
3 | 71,517 | 10.99 | 12.50 | 1.51 |
4 | 57,274 | 8.80 | 9.70 | 0.90 |
5 | 40,966 | 6.29 | 7.90 | 1.61 |
6 | 33,176 | 5.10 | 6.70 | 1.60 |
7 | 29,325 | 4.51 | 5.80 | 1.29 |
8 | 28,386 | 4.36 | 5.10 | 0.74 |
9 | 25,266 | 3.88 | 4.60 | 0.72 |
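The percent and difference columns can be checked from the counts alone. The sketch below does that for the Fig. 5 counts. It assumes the "Benford Percent" column is the formula value rounded to one decimal place (which is how the table values appear) and that "Difference" is the absolute gap in percentage points; the function name is hypothetical.

```python
import math

# First-digit counts copied from the Fig. 5 table.
FIG5_COUNTS = {1: 220_087, 2: 144_821, 3: 71_517, 4: 57_274, 5: 40_966,
               6: 33_176, 7: 29_325, 8: 28_386, 9: 25_266}


def benford_table(counts):
    """Return (total, rows) with rows of (digit, count, percent, benford, diff)."""
    total = sum(counts.values())
    rows = []
    for digit in range(1, 10):
        observed = 100.0 * counts[digit] / total
        # Assumption: the table rounds the Benford percentage to one decimal.
        expected = round(100.0 * math.log10(1.0 + 1.0 / digit), 1)
        rows.append((digit, counts[digit], round(observed, 2), expected,
                     round(abs(observed - expected), 2)))
    return total, rows


total, rows = benford_table(FIG5_COUNTS)
print(f"total values: {total:,}")
for row in rows:
    print("{} | {:,} | {:.2f} | {:.2f} | {:.2f} |".format(*row))
```

The total comes to 650,818, matching the number of mean values stated for Fig. 5, and the same function can be fed the Fig. 3 or Fig. 4 counts.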
III. Closing Comments
The results in Fig. 3, Fig. 4, and Fig. 5 are not unlike those obtained by physicists and other professionals who exposed various types of random data to Benford's Law (see PDF).
Like the containment factors that follow the bends of space, which astronomers accept, Benford's Law has standing of its own: it can be used in court cases to detect bad numbers (fraud):
"Is Benford’s Law admissible in Court? The answer is yes. If applied correctly, it should be admissible under Daubert and Rule 702. See United States v. Channon (Matthew), No. 16-2254 (10th Cir. 2018).
If you are so inclined, you can convince yourself the law works. Pick up a random book or magazine and sort the numbers – about 30% of the numbers collected will start with the number 1. The result is always the same: 30% start with the number 1. Using this fact, a fraud auditor has a simple tool to examine large sets of numbers for irregular results.
The answer to the above question, “could Benford’s Law have been used to detect the Theranos fabricated blood test results early on and saved investors billions,” is yes. A fraud investigator could have used Benford’s Law to detect the occurrence of fraudulent lab results at Theranos. The Law is not proof of fraud, but it will highlight irregularities for further investigation."
(Theranos, Elizabeth Holmes and Benford’s Law). Perhaps oceanographers and meteorologists could use Benford's Law in the same way to detect maladjusted ocean and atmospheric temperature data, and other flaws in databases, much as auditors use it to detect fraudulent accounting data.
The next post in this series is here, the previous post in this series is here.