## Saturday, 24 September 2016

### DRAGON-KINGS, ZIPF'S LAW AND ALL THAT WHAT DO YOU CALL IT?

Chair of Entrepreneurial Risk
ETH Zurich

Professor Sornette gave a highly impressive-sounding talk (**** below) on Thursday evening, which was based on the strange premise that some catastrophic events are predictable from finite data sets. He said that all of the hard work had been completed by his research associate Spencer, who was sitting quietly in the corner. It took me some time to realise that there was very little apparent technical substance in the talk. Indeed, the diagnostic statistics he proposed (ratios of sums of order statistics) seemed amazingly naive. Several of us were left wondering whether the scientific claims he was trying to make extrapolated beyond his actual achievements.

I initiated the discussion by asking why the speaker hadn't referred to or exploited the seminal work during the 1970s by my friend Bruce Hill of the University of Michigan, which pioneered the statistical ideas surrounding power law distributions and Zipf's law. He replied to the effect that 'if it was so important then why doesn't anybody remember him?' I responded to the effect that 'I remember him but you don't'.

I have now had time to refresh my memory further. Bruce also extended his analysis to other long-tailed distributions and provided both Bayesian and frequency justifications. (In order to investigate deviations from power law distributions, it would appear essential to first perform a technically sound parametric analysis within this family, and I believe that this analysis would need to be effectively Bayesian with some prior distribution or other.)

Bruce Hill (Retired 1998)

## Memoir

Bruce M. Hill
Regents' Proceedings 202
Bruce M. Hill, Ph.D., professor of statistics, retired from active faculty status on December 31, 1998.
Professor Hill received his B.S. degree from the University of Chicago in 1956 and his M.S. and Ph.D. degrees from Stanford University in 1958 and 1961, respectively. He joined the University of Michigan in 1960 as a lecturer in biostatistics. He was promoted to assistant professor in 1961, associate professor in 1964, and professor in 1970.
Professor Hill has conducted research in a number of different areas, including Bayesian nonparametric statistics, the probabilistic theory of urn processes, modeling of long-tailed distributions such as the Zipf-Pareto law, inference about the tail of a distribution, variance components models in the random effects model, decision theory, and the likelihood principle. The article in which he proposed what is now called the Hill tail-index estimator is widely cited and his estimator is applied in a variety of substantive areas dealing with extreme values. In decision theory, he showed how any non-Bayesian real world decision procedure can be routinely improved upon by means of a simple computational algorithm. He also demonstrated the invalidity of the likelihood principle.
Professor Hill is the author of 45 research articles and of 5 articles that appear in encyclopedias of the physical sciences and of statistics. He has also written biographies of Professor L. J. Savage (formerly of the University of Michigan) and Professor Bruno de Finetti (formerly of the University of Rome). He has chaired the doctoral committees of 14 doctoral students, including students from the Departments of Mathematics, Economics, Political Science, Business Statistics, and Biostatistics, in addition to Statistics. He is a fellow of the American Statistical Association, the Institute of Mathematical Statistics, and the International Statistical Institute.
The Regents now salute this faculty member by naming Bruce M. Hill professor emeritus of statistics.
Here are the references which I've managed to trace so far:

(Annals of Statistics 1975, over 2000 citations)

(Journal of the American Statistical Association, 1970)

(Journal of the American Statistical Association 1974)

TAIL PROBABILITIES (2006)

BAYESIAN NONPARAMETRIC PREDICTION AND STATISTICAL INFERENCE
(Springer, 1992). This references an earlier paper from 1968, which I have not as yet traced.

### Hill's tail-index estimator

Let $(X_{n},n\geq 1)$ be a sequence of independent and identically distributed random variables with distribution function $F\in D(H(\xi ))$, the maximum domain of attraction of the generalized extreme value distribution $H$, where $\xi \in \mathbb {R}$. If $\{k(n)\}$ is an intermediate order sequence, i.e. $k(n)\in \{1,\ldots ,n-1\}$, $k(n)\to \infty$ and $k(n)/n\to 0$, then the Hill tail-index estimator is
${\displaystyle \xi _{(k(n),n)}^{Hill}=\left({\frac {1}{k(n)}}\sum _{i=n-k(n)+1}^{n}\ln(X_{(i,n)})-\ln(X_{(n-k(n)+1,n)})\right)^{-1},}$
where $X_{(i,n)}$ is the $i$-th order statistic of $X_{1},\ldots ,X_{n}$, so that $X_{(n-k(n)+1,n)}=\min \left(X_{(n-k(n)+1,n)},\ldots ,X_{(n,n)}\right)$ is the smallest of the $k(n)$ largest observations. For $\xi >0$, the mean log-excess inside the brackets converges in probability to $\xi$, so the estimator as written (with the inverse) converges to the tail exponent $1/\xi$. It is asymptotically normal provided the growth of $k(n)$ is restricted based on a higher-order regular variation property. Consistency and asymptotic normality extend to a large class of dependent and heterogeneous sequences.
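As a rough illustration of the formula above, here is a minimal Python sketch. The function name `hill_estimator`, the simulated Pareto sample and the choices of `n`, `k` and the seed are my own assumptions for the example, not anything taken from Bruce Hill's paper or from the talk. The function returns the reciprocal of the mean log-excess of the `k` largest observations, which for an exact Pareto tail with exponent $\alpha$ should land near $\alpha$.

```python
import math
import random


def hill_estimator(sample, k):
    """Reciprocal of the mean log-excess of the k largest observations.

    Mirrors the displayed formula: the excesses are measured over
    X_(n-k+1,n), the smallest of the k largest order statistics.
    """
    n = len(sample)
    if not 1 <= k <= n - 1:
        raise ValueError("k must satisfy 1 <= k <= n - 1")
    ordered = sorted(sample)        # X_(1,n) <= ... <= X_(n,n)
    top = ordered[n - k:]           # X_(n-k+1,n), ..., X_(n,n)
    threshold = top[0]
    if threshold <= 0:
        raise ValueError("the k largest observations must be positive")
    mean_log_excess = sum(math.log(x / threshold) for x in top) / k
    if mean_log_excess == 0:
        # Happens when k = 1 or when the k largest values are all tied.
        raise ValueError("mean log-excess is zero; choose a larger k")
    return 1.0 / mean_log_excess


if __name__ == "__main__":
    # Hypothetical demo: simulate a Pareto sample with tail exponent 2
    # by inverse-transform sampling, then estimate the exponent.
    rng = random.Random(2016)
    alpha = 2.0
    data = [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(20000)]
    print("Hill estimate of the tail exponent:", hill_estimator(data, 500))
```

The choice of `k` matters a good deal in practice: too small and the estimate is noisy, too large and observations from outside the tail bias it. The value 500 above is purely illustrative.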

Because it takes logs of the observations, this estimator is quite resistant to the effects of outliers. Observations which Professor Sornette thinks are outliers with respect to a power law distribution will not obviously be outliers with respect to a distribution whose tail is estimated in this way. Maybe Professor Sornette should ponder this when he asserts that some observations are Dragon-Kings. Even if Dragon-Kings can be shown to be outliers, I don't see how it can then be inferred that they can be predicted from finite data sets.
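A rough way to see the resistance to outliers, using the notation of the estimator above: write $M$ for the mean log-excess inside the brackets, and suppose (my own hypothetical scenario, not one from the talk) that the largest observation $X_{(n,n)}$ is inflated to $cX_{(n,n)}$ for some $c>1$ while everything else stays fixed. Only one term of the sum changes, and it changes by $\ln c$, so

${\displaystyle M'={\frac {1}{k(n)}}\sum _{i=n-k(n)+1}^{n}\left[\ln(X'_{(i,n)})-\ln(X_{(n-k(n)+1,n)})\right]=M+{\frac {\ln c}{k(n)}}.}$

With $c=10$ and $k(n)=100$, for example, the shift is $\ln(10)/100\approx 0.023$. By contrast, the arithmetic mean of the same $k(n)$ largest observations would shift by $(c-1)X_{(n,n)}/k(n)$, which grows linearly in $c$ rather than logarithmically.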

According to my memory, there may well have been some earlier references during the 1960s.

This all seems to be related to a modern trend. Other modern entrepreneurs and some 'Big Data Analysts' seem to be developing a serious tendency to pour s**t on the seminal pioneers who developed the proud history of Statistics, by not even bothering to research the literature (and sometimes by not even bothering to cite anybody of note). What's more, many come up with vastly inferior versions of what has gone before, and this seriously damages the credibility of our ever-expanding discipline.

FROM IAN MAIN, Professor of Geophysics:

Dear Tom - Likewise! You also have an interesting and wide-ranging blog.

Just for info we have carried on some of the model selection work, especially in trying to detect non-stationary changes in the underlying rate of earthquake events at

http://www.bssaonline.org.ezproxy.is.ed.ac.uk/content/104/2/885

for an Italian sequence (including triggered events) and

http://gji.oxfordjournals.org/content/204/2/753.full.pdf?keytype=ref&ijkey=J3Xv6qKClDsWJTS

for global megaearthquakes (where triggering is less important)

In the latter it is interesting that BIC performs just about as well as the full Bayes factor in detecting changes in the global rate of mega-earthquakes, and also that we cannot yet [support] the hypothesis that the recent cluster of mega-earthquakes is significant above chance.

I also agree with you it can be hard to detect Didier's 'Dragon Kings' in power law statistics - see our chapter in his book with Guy Ouillon

http://www.geos.ed.ac.uk/homes/imain/igmpapers/EPJ2012_Main_Naylor.pdf

We conclude "There is no compelling evidence for dragon-kings in the sense of outliers from the power-law size distribution of earthquake event size. In contrast, some volcanic sequences show clear characteristic behaviour, and nearly all laboratory tests show an extreme event at the sample size, well outside the population of acoustic emissions that largely indicate grain scale processes until very late in the cycle".

Cheers, Ian.

****Thursday 22nd Sep. 2016 Dragon-kings: extreme risk events, prediction and control
Venue: ICMS
Address: 15 South College Street, Edinburgh, EH8 9AA
Time: Tea & coffee available from 17:30; talk between 18:00-19:00.
Speaker: Prof. Didier Sornette
Affiliation: ETH Zürich, Department of Management, Technology, and Economics (D-MTES)
Title: Dragon-kings: extreme risk events, prediction and control
Abstract:
In many complex systems, large events are believed to follow power-law, scale-free probability distributions so that the extreme, catastrophic events are unpredictable. In the last decade, I have spearheaded the concept of "dragon-kings", these outliers of large sizes and unique origins [1,2]. Our research has shown that most extreme events in fact do not belong to a scale-free distribution. Called dragon-kings, these events are outliers that possess distinct formation mechanisms. Such specific underlying mechanisms open the possibility that dragon-kings can be forecasted, allowing for suppression and control [3]. For certain dynamical systems, it is possible to illustrate the statistical evidence and predictability of dragon-kings.
The approach can be generalised to obtain a conceptual framework to quantify, model and predict crises in out-of-equilibrium open heterogeneous dynamical systems (i.e. almost all systems of interest) based on a synthesis of the theory of the renormalization group in statistical physics and bifurcation theory in mathematics combined with systematic empirical data analyses. We have recently reviewed the state of the art and some recent progress on the best statistics to detect the dragon-kings in sparse data (the outlier detection problem) [4]. The obtained insights have important implications to address the challenges facing mankind, including finance induced instabilities in worldwide economies, debt instability, cyber-risks [5], industrial and nuclear risks [6], epidemics of obesity and chronic diseases, aging and financial retirement liabilities, the energy challenges, the water problem, the soil erosion runaway, the on-going sixth largest biological extinction, extreme industrial disasters, coupled with geopolitical risks, the problem of the stability of societies that need to steer responsible management of our complex industrial systems. We propose a novel quadrant formulation of risk management along the dimensions of stressor severity and level of predictability [7], in which the dragon-king regime plays a prominent role.
Reference:
[1] D. Sornette, Dragon-Kings, Black Swans and the Prediction of Crises, International Journal of Terraspace Science and Engineering 2 (1), 1-18 (2009) (http://arXiv.org/abs/0907.4290) and (http://ssrn.com/abstract=1470006)
[2] D. Sornette and G. Ouillon, Dragon-kings: mechanisms, statistical methods and empirical evidence, European Physical Journal, Special Topics 205, 1-26 (2012) (special issue on power laws and dragon-kings) (http://arxiv.org/abs/1205.1002)
[3] Hugo L.D. de S. Cavalcante, Marcos Oria, Didier Sornette, Edward Ott and Daniel J. Gauthier, "Predictability and control of extreme events in complex systems", Phys. Rev. Lett. 111, 198701 (2013)
[4] Spencer Wheatley and Didier Sornette, Multiple Outlier Detection in Samples with Exponential and Pareto Tails: Redeeming the Inward Approach and Detecting Dragon Kings, (http://arxiv.org/abs/1507.08689 and http://ssrn.com/abstract=2645709)
[5] Spencer Wheatley, Thomas Maillart and Didier Sornette, The Extreme Risk of Personal Data Breaches & The Erosion of Privacy, Eur. Phys. J. B 89 (7), 1-12 (2016)
[6] Spencer Wheatley, Benjamin Sovacool and Didier Sornette, Of Disasters and Dragon Kings: A Statistical Analysis of Nuclear Power Incidents & Accidents, Risk Analysis DOI: 10.1111/risa.12587, pp. 1-17 (2016)
[7] Tatyana Kovalenko and Didier Sornette, Risk and Resilience Management in Social-Economic Systems, IRGC Resilience In And For Risk Governance (RIARG) resource guide (in press 2016) (http://ssrn.com/abstract=2775264)