COPYRIGHT TOM LEONARD
5. Improbable Probabilities
Inter-connectedness has wide-ranging practical implications concerning applications and misapplications of the multiplication laws of probability and conditional probability. See Khan Academy [1], Breath Math [1].
Geneticists and forensic scientists beware! Grievous numerical errors can be made whenever a population is inaccurately assumed to be in Hardy-Weinberg equilibrium (this is usually ‘justified’ by an imperfect ‘random mating’ assumption), in which case the erroneous multiplication of unconditional probabilities rather than conditional probabilities, e.g. pertaining to a number of inter-connected and not actually independent DNA probes, can exaggerate the forensic evidence against an alleged criminal in extreme fashion, e.g. as in the 1996 Adams Rape Case reported by Donnelly (2005).
As a simple example, suppose that 15 events each have probability 1 in 4 of occurring.
If the 15 events are assumed independent and unconnected then the probability that they all occur is 0.25 to the power 15, which is 0.9313 times one in a billion. However, if the 15 events are connected and not independent, then the probability of their intersection could be anything from zero (if some of the events cannot occur together) up to 0.25 (if they always occur together). For positively associated events, such as matches on inter-connected DNA probes, it lies between this minuscule number and 0.25, depending upon how the complex of conditional probabilities is specified.
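As a rough numerical check of these figures, the following Python sketch computes the product under independence together with the standard Fréchet bounds on the probability that all 15 events occur. The function name `intersection_bounds` is an illustrative assumption, not something taken from the text, and the bounds assume that nothing is known beyond the 15 marginal probabilities.

```python
from fractions import Fraction

def intersection_bounds(probs):
    """Frechet bounds on P(all events occur), given only the marginal probabilities."""
    lower = max(Fraction(0), sum(probs) - (len(probs) - 1))
    upper = min(probs)
    return lower, upper

# 15 events, each with probability 1 in 4 (as in the example above)
probs = [Fraction(1, 4)] * 15

# Product of the marginals: valid only if the 15 events are independent
independent = Fraction(1)
for p in probs:
    independent *= p

lower, upper = intersection_bounds(probs)

print(float(independent))  # about 9.313e-10, i.e. 0.9313 times one in a billion
print(lower, upper)        # 0 1/4
```

Exact fractions are used so that the comparison with 1 in 2 to the power 30 is free of rounding error. Where the value falls between the bounds depends entirely on the conditional probabilities, which this sketch does not attempt to model.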
In particular, each event might correspond to achieving a so-called ‘perfect match’ on a different DNA probe. In many murder and rape cases, a number of DNA probes, e.g. 15, are performed, each comparing the DNA from the defendant with DNA discovered at the crime scene. The ‘perfect matches’ are subject to measurement error and are by no means perfect.
The preceding calculations raise serious ethical issues in murder and rape cases. Courts should of course (α) try to find as many guilty defendants as possible to be guilty, (β) declare as many innocent defendants as possible to be innocent, (γ) prosecute all rape cases to the full while protecting the victims against unfair or demeaning cross-examination, and (δ) encourage the police to prosecute as many rape cases as possible, without demeaning the victims.
However, in many cases, a DNA test has led to an (incorrectly) calculated minuscule probability of a perfect match on all 15 (incorrectly assumed independent) probes. In such cases, I feel that substantial further, real-life, non-genetic evidence should be considered before any determination of guilt is made. In the Adams rape case, the defendant claimed to have a perfect alibi, and the victim insisted that Adams bore no resemblance to the rapist. That, when taken with the faulty probability calculations on behalf of the prosecution, may well have been enough for the case to be dismissed out of hand.
Moreover, the archaic Essen-Möller formula and the totally farcical concept of the ‘random man’ (Essen-Möller, 1938) are still frequently employed by trustworthy, but much too trusting, prosecutors around the world.
Let N denote the UK adult male population size, and suppose that there is evidence that the defendant and the (sole) guilty person are both members of the UK adult male population. Then, in the absence of further evidence, the probability that the defendant is the guilty person is p = 1/N, which accords with Laplace’s Principle of Insufficient Reason (Laplace, 1812, p. 177).
However, Essen-Möller’s random man has probability p of being any particular member of the population. Let’s consider Essen-Möller’s usual assumption that both the random man and the defendant have ‘prior’ probability 0.5 of guilt.
This mathematical trick implies that the actual probability of guilt for the defendant is 0.5(1+p), which inflates the correct probability p by a factor of (N+1)/2. And that’s before a possibly grossly inflated forensic evidence likelihood-ratio statistic R, based upon the results of the (assumed independent) DNA probes, is used to increase, possibly seriously, the alleged ‘prior’ probability of guilt to an incorrectly calculated ‘posterior’ probability of guilt of q=R/(R+1).
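The arithmetic behind these two steps can be sketched in Python as follows. The function names, the population size N and the likelihood ratio R below are hypothetical values chosen purely for illustration; none of them come from the text or from any real case. The sketch checks that 0.5(1+p) exceeds p = 1/N by the factor (N+1)/2, that q = R/(R+1) is exactly what Bayes' theorem gives when the prior is 0.5, and how different the answer is when the prior 1/N is used instead.

```python
from fractions import Fraction

def inflation_factor(n):
    """Ratio of the probability 0.5 * (1 + p) to the uniform prior p = 1/n."""
    p = Fraction(1, n)
    return (Fraction(1, 2) * (1 + p)) / p

def essen_moller_posterior(r):
    """q = R / (R + 1), the posterior implied by a prior of 0.5 and likelihood ratio R."""
    return Fraction(r) / (Fraction(r) + 1)

def bayes_posterior(prior, r):
    """Bayes' theorem in odds form: posterior odds = R * prior odds."""
    odds = Fraction(r) * prior / (1 - prior)
    return odds / (1 + odds)

N = 25_000_000  # hypothetical population size, for illustration only
R = 1_000_000   # hypothetical likelihood ratio, for illustration only

print(inflation_factor(N) == Fraction(N + 1, 2))   # True
print(float(essen_moller_posterior(R)))            # very close to 1
print(float(bayes_posterior(Fraction(1, N), R)))   # far smaller, since the prior is 1/N
```

With these made-up inputs the same likelihood ratio yields a posterior close to 1 under the 0.5 prior but only a few per cent under the prior 1/N, and that is before any allowance is made for R itself being inflated by a mistaken independence assumption.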
In districts where the police force are prejudiced against queer people or people of colour, or where, say, there is a large group of interrelated Filipinos, there are a myriad of problems for the courts to untangle. I hope that this exposition will help the courts to do this. In particular, they should try not to let any murderers or rapists get away with it. I am not a fan of the alt right.
The Courts could try asking an applied statistician with knowledge of practical genetics to consider all the measurements concerning the DNA probes, and to provide a professional, non-probabilistic appraisal of the data. The sorts of difficulties the applied statistician might face, e.g. when there are groupings of ethnic minorities in the population, are discussed by Various Authors (1990).
Erik Essen-Möller was a psychiatrist and notorious eugenicist who worked with the Nazi sterilizer and mass murderer Ernst Rüdin in Munich, the notorious Franz Kallmann of New York, and the eugenics-friendly Eliot Slater in London, who was said to be a pioneer in the genetics of mental health. See Roelcke (2019) and Benbassat (2016). Whether Essen-Möller’s motives were eugenic, when he proposed his unholy desecration q=R/(R+1) of Bayes’ theorem in the context of parentage testing, is open to debate.
Assume next that, owing to prevailing conditions during AD 3001 in the Baltic Sea, the probability that a female dolphin Carla who is swimming in the Gulf of Finland has the ‘gay’ phenotype Ω is 0.5, and that the conditional probability that any particular daughter has phenotype Ω, given that Carla, the mother, has the phenotype Ω, is equal to 0.5. Then, according to Khan Academy [1], the probability that both Carla and her daughter Zena have phenotype Ω is 0.5 x 0.5 = 0.25.
Suppose however that Carla has 14 daughters and consider the event A that Carla and all 14 of her daughters have phenotype Ω. It would be a mistake to obtain prob (A) by multiplying 0.5 by 0.25 to the power 14, giving prob (A)=1.8626 times one in a billion. Since the 15 constituent events are clearly interconnected and not independent, we are not permitted to multiply the unconditional probabilities together.
In order to proceed, we make the further (in itself highly tenuous) assumption that, conditionally on Carla having phenotype Ω, the 14 events that her 14 daughters have gay phenotype Ω are mutually independent. Under this conditional independence assumption, prob (A) = 0.5 x (0.5 to the power 14) = 0.00003052. This multiplies the previous miscalculated number by a factor of 2 to the power 14, which equals 16384. Such extremely severe problems abound when trying to calculate the probabilities of any intersection of events in genetics, particularly as the events may be interconnected in all sorts of indecipherable ways.
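The two calculations for Carla and her 14 daughters can be reproduced with a short Python sketch. The variable names are illustrative assumptions; the probabilities are those stated in the text, and the second calculation relies on the same conditional independence assumption.

```python
from fractions import Fraction

p_mother = Fraction(1, 2)                 # prob that Carla has phenotype Ω
p_daughter_given_mother = Fraction(1, 2)  # prob that a daughter has Ω, given that Carla has Ω
n_daughters = 14

# Mistaken calculation: 0.5 multiplied by 0.25 to the power 14
naive = p_mother * (p_mother * p_daughter_given_mother) ** n_daughters

# Chain rule, with the daughters conditionally independent given Carla's phenotype
conditional = p_mother * p_daughter_given_mother ** n_daughters

print(float(naive))         # about 1.8626e-09
print(float(conditional))   # about 3.052e-05
print(conditional / naive)  # 16384
```

Exact fractions make the ratio come out as exactly 2 to the power 14, rather than a floating-point approximation of it.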
Khan Academy [1] Lesson 4: Independent versus dependent events and the multiplication rule
https://www.khanacademy.org/math/ap-statistics/probability-ap/probability-multiplication-rule/a/general-multiplication-rule Accessed 7 April 2023
Breath Math [1] The Chain Rule for Probability (YouTube)
https://www.youtube.com/watch?v=v8Uw1TFl2WQ Accessed 31 May 2023
Pierre Simon Laplace (1812) Théorie analytique des probabilités. Paris: Courcier
Peter Donnelly (2005) Appealing Statistics. Significance 2 (1) pp46-48 https://academic.oup.com/jrssig/article/2/1/46/7029499?login=false Accessed 21 April 2023
Erik Essen-Möller (1938) Die Beweiskraft der Ähnlichkeit im Vaterschaftsnachweis; theoretische Grundlagen. Mitt. Anthrop. Ges. (Wien) 68, pp 9–53
Volker Roelcke (2019) Eugenic concerns, scientific practices: international relations in the establishment of psychiatric genetics in Germany, Britain, the USA and Scandinavia, c.1910-60. History of Psychiatry 30 (1), pp 19-37
Carlos Benbassat (2016) Kallmann Syndrome: Eugenics and the Man behind the Eponym. Rambam Maimonides Medical Journal 7 (2) e0015
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4839542/