Introduction to Kurtosis
In statistics, a measure of kurtosis is a measure of the “tailedness” of the probability distribution of a real-valued random variable. The standard measure of kurtosis is based on a scaled version of the fourth moment of the data or population. Therefore, the measure of kurtosis is related to the tails of the distribution, not its peak.
Measure of Kurtosis
Kurtosis is sometimes characterized as a measure of peakedness, but this characterization is mistaken. Under that traditional description, a distribution with a relatively high peak is called leptokurtic, a flat-topped distribution is called platykurtic, and the normal distribution, which is neither very peaked nor very flat-topped, is called mesokurtic. In some cases, a histogram is an effective graphical technique for showing the skewness and kurtosis of a data set.
Data sets with high kurtosis tend to have a distinct peak near the mean, decline rather rapidly, and have heavy tails. Data sets with low kurtosis tend to have a flat top near the mean rather than a sharp peak.
The moment ratio and the percentile coefficient of kurtosis are used to measure kurtosis:
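Moment Coefficient of Kurtosis = $b_2 = \frac{m_4}{S^4} = \frac{m_4}{m_2^2}$, where $m_2$ and $m_4$ are the second and fourth moments about the mean and $S^2 = m_2$ is the variance.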
Percentile Coefficient of Kurtosis = $k=\frac{Q.D}{P_{90}-P_{10}}$
where Q.D = $\frac{1}{2}(Q_3 - Q_1)$ is the semi-interquartile range. For a normal distribution, $k$ has a value of 0.263.
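As a rough illustration, both coefficients above can be computed from a data set in a few lines of Python. This is a minimal sketch using only the standard library. The function names `moment_kurtosis` and `percentile_kurtosis` are illustrative, and the percentiles come from `statistics.quantiles` with its default interpolation rule, which is only one of several conventions and may differ slightly from a hand calculation.

```python
import statistics

def moment_kurtosis(data):
    """Moment coefficient b2 = m4 / m2**2 (moments about the mean, divisor n)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m4 / m2 ** 2

def percentile_kurtosis(data):
    """Percentile coefficient k = Q.D / (P90 - P10)."""
    pct = statistics.quantiles(data, n=100)  # 99 cut points: P1, ..., P99
    q1, q3 = pct[24], pct[74]                # P25 and P75
    p10, p90 = pct[9], pct[89]
    qd = 0.5 * (q3 - q1)                     # semi-interquartile range
    return qd / (p90 - p10)

# Hypothetical data: evenly spaced standard normal quantiles, standing in
# for a large, well-behaved normal sample.
normal = statistics.NormalDist(0, 1)
sample = [normal.inv_cdf((i + 0.5) / 20000) for i in range(20000)]
print(round(moment_kurtosis(sample), 2), round(percentile_kurtosis(sample), 3))
```

For normal-like data such as this, the first value should come out close to 3 and the second close to 0.263.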
Dr. Wheeler defines kurtosis as:
The kurtosis parameter is a measure of the combined weight of the tails relative to the rest of the distribution.
So, kurtosis is all about the tails of the distribution – not the peakedness or flatness.
A normal random variable has a kurtosis of 3 irrespective of its mean or standard deviation. If a random variable’s kurtosis is greater than 3, it is considered Leptokurtic. If its kurtosis is less than 3, it is considered Platykurtic.
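The point that kurtosis does not depend on the mean or standard deviation can be checked numerically: shifting and rescaling a data set leaves the moment coefficient unchanged. The sketch below uses made-up values, and the helper `classify` with its tolerance `tol` is a hypothetical convenience, not a standard rule.

```python
def moment_kurtosis(data):
    """Moment coefficient b2 = m4 / m2**2 (moments about the mean, divisor n)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m4 / m2 ** 2

def classify(b2, tol=0.1):
    """Label a kurtosis value relative to the normal benchmark of 3.
    The tolerance `tol` is an arbitrary illustrative choice."""
    if b2 > 3 + tol:
        return "leptokurtic"
    if b2 < 3 - tol:
        return "platykurtic"
    return "mesokurtic"

data = [2.0, 3.5, 1.0, 4.0, 2.5, 9.0, 3.0, 2.0]   # made-up values
shifted_scaled = [100 + 5 * x for x in data]       # different mean and spread

# The two kurtosis values agree up to floating-point rounding.
print(moment_kurtosis(data), moment_kurtosis(shifted_scaled))
print(classify(moment_kurtosis(data)))
```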
A large value of kurtosis indicates a more serious outlier issue and hence may lead the researcher to choose alternative statistical methods.
Some Examples of Kurtosis
- In finance, risk and insurance are examples of areas where one needs to focus on the tails of the distribution rather than assuming normality.
- Kurtosis helps in determining whether resource use within an ecological guild is truly neutral or whether it differs among species.
- The accuracy of the sample variance as an estimate of the population variance $\sigma^2$ depends heavily on kurtosis.
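The last point can be made concrete with a standard result. Assume an independent sample of size $n$ from a population with variance $\sigma^2$ and kurtosis $\beta_2$ (the population counterpart of $b_2$, a symbol introduced here for illustration), and assume the fourth moment is finite. Then the unbiased sample variance $s^2$ satisfies

$$\mathrm{Var}(s^2) = \sigma^4\left(\frac{2}{n-1} + \frac{\beta_2 - 3}{n}\right)$$

For a normal population ($\beta_2 = 3$) this reduces to $2\sigma^4/(n-1)$. For a heavy-tailed population with, say, $\beta_2 = 9$, the variance of $s^2$ is roughly four times larger when $n$ is large.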
For further reading see Moments in Statistics
FAQs about Kurtosis
- Define Kurtosis.
- What is the moment coefficient of Kurtosis?
- What is the definition of kurtosis by Dr. Wheeler?
- Give examples of kurtosis from real life.
Here is why “peakedness” is wrong as a descriptor of kurtosis.
Suppose someone tells you that they have calculated negative excess kurtosis either from data or from a probability distribution function (pdf). According to the “peakedness” dogma (started unfortunately by Pearson in 1905), you are supposed to conclude that the distribution is “flat-topped” when graphed. But this is obviously false in general. For one example, the beta(.5,1) has an infinite peak and has negative excess kurtosis. For another example, the 0.5*N(0, 1) + 0.5*N(4,1) distribution is bimodal (wavy), not flat at all, and also has negative excess kurtosis similar to that of the uniform (U(0,1)) distribution. These are just two examples out of an infinite number of other non-flat-topped distributions having negative excess kurtosis.
Yes, the U(0,1) distribution is flat-topped and has negative excess kurtosis. But obviously, a single example does not prove the general case. If that were so, we could say, based on the beta(.5,1) distribution, that negative excess kurtosis implies that the pdf is infinitely pointy. We could also say, based on the 0.5*N(0, 1) + 0.5*N(4,1) distribution, that negative excess kurtosis implies that the pdf is “wavy.” It’s like saying, “well, I know all bears are mammals, so it must be the case that all mammals are bears.”
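The three negative-excess-kurtosis examples above can be checked from closed-form moments. The sketch below is one way to do it, assuming the standard raw-moment formulas for the beta and normal distributions. All function names are illustrative, and U(0,1) is treated as beta(1,1).

```python
def excess_kurtosis(raw):
    """Excess kurtosis from the first four raw moments E[X], ..., E[X^4]."""
    m1, m2, m3, m4 = raw
    var = m2 - m1 ** 2
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4  # 4th central moment
    return mu4 / var ** 2 - 3

def beta_raw_moments(a, b):
    """E[X^k] for beta(a, b), k = 1..4, via the product formula."""
    moments, prod = [], 1.0
    for r in range(4):
        prod *= (a + r) / (a + b + r)
        moments.append(prod)
    return moments

def normal_raw_moments(mu, sigma):
    """E[X^k] for a normal with mean mu and standard deviation sigma, k = 1..4."""
    s2 = sigma ** 2
    return [mu, mu ** 2 + s2, mu ** 3 + 3 * mu * s2,
            mu ** 4 + 6 * mu ** 2 * s2 + 3 * s2 ** 2]

def mixture_raw_moments(weights, components):
    """Raw moments of a mixture are the weighted averages of component moments."""
    return [sum(w * c[k] for w, c in zip(weights, components)) for k in range(4)]

beta_half_one = beta_raw_moments(0.5, 1.0)
mix = mixture_raw_moments([0.5, 0.5],
                          [normal_raw_moments(0, 1), normal_raw_moments(4, 1)])
uniform = beta_raw_moments(1.0, 1.0)

for name, raw in [("beta(.5,1)", beta_half_one),
                  ("0.5*N(0,1) + 0.5*N(4,1)", mix),
                  ("U(0,1)", uniform)]:
    print(name, round(excess_kurtosis(raw), 3))
```

All three values come out negative, even though the three densities are infinitely pointy, bimodal, and flat, respectively.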
Now suppose someone tells you that they have calculated positive excess kurtosis from either data or a pdf. According to the “peakedness” dogma (again, started by Pearson in 1905), you are supposed to conclude that the distribution is “peaked” or “pointy” when graphed. But this is also obviously false in general. For example, take a U(0,1) distribution and mix it with a N(0,1000000) distribution, with .00001 mixing probability on the normal. The distribution, when graphed, appears perfectly flat at its peak, but has very high kurtosis.
You can play the same game with any distribution other than U(0,1). If you take a distribution with any shape peak whatsoever, then mix it with a much wider distribution like N(0,1000000), with small mixing probability, you will get a pdf with the same shape of peak (flat, bimodal, trimodal, sinusoidal, whatever) as the original, but with high kurtosis.
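The contaminated-uniform example can be worked through numerically. The sketch below assumes that N(0,1000000) denotes a variance of 1,000,000 (a standard deviation of 1000). If the second parameter were meant as a standard deviation, the kurtosis would be larger still. Variable and function names are illustrative.

```python
from statistics import NormalDist

P = 0.00001          # mixing probability on the wide normal
WIDE_SD = 1000.0     # assumes N(0,1000000) means variance 1,000,000

def kurtosis_from_raw(m1, m2, m3, m4):
    """Kurtosis (not excess) from the first four raw moments."""
    var = m2 - m1 ** 2
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return mu4 / var ** 2

uniform_raw = [1 / 2, 1 / 3, 1 / 4, 1 / 5]                 # E[X^k] for U(0,1)
normal_raw = [0.0, WIDE_SD ** 2, 0.0, 3 * WIDE_SD ** 4]    # E[X^k] for N(0, sd^2)
mixture_raw = [(1 - P) * u + P * g for u, g in zip(uniform_raw, normal_raw)]

def mixture_pdf(x):
    """Density of the mixture: mostly U(0,1), plus a tiny, very wide normal."""
    uniform_part = 1.0 if 0.0 <= x <= 1.0 else 0.0
    return (1 - P) * uniform_part + P * NormalDist(0.0, WIDE_SD).pdf(x)

mixture_kurtosis = kurtosis_from_raw(*mixture_raw)
print("kurtosis:", mixture_kurtosis)
print("pdf at 0.1, 0.5, 0.9:", [mixture_pdf(x) for x in (0.1, 0.5, 0.9)])
```

Under these assumptions the kurtosis is in the hundreds of thousands, while the density is essentially constant across (0,1), so the top of the graph looks flat.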
And yes, the Laplace distribution has positive excess kurtosis and is pointy. But you can have any shape of the peak whatsoever and have positive excess kurtosis. So the bear/mammal analogy applies again.
One thing that can be said about cases where the data exhibit high kurtosis is that when you draw the histogram, the peak will occupy a narrow vertical strip of the graph. The reason this happens is that there will be a very small proportion of outliers (call them “rare extreme observations” if you do not like the term “outliers”) that occupy most of the horizontal scale, leading to an appearance of the histogram that some have characterized as “peaked” or “concentrated toward the mean.”
But the outliers do not determine the shape of the peak. When you zoom in on the bulk of the data, which is, after all, what is most commonly observed, you can have any shape whatsoever – pointy, inverted U, flat, sinusoidal, bimodal, trimodal, etc.
So, given that someone tells you that there is high kurtosis, all you can legitimately infer, in the absence of any other information, is that there are rare, extreme data points (or potentially observable data points). Other than the rare, extreme data points, you have no idea whatsoever as to what is the shape of the peak without actually drawing the histogram (or pdf), and zooming in on the location of the majority of the (potential) data points.
And given that someone tells you that there is negative excess kurtosis, all you can legitimately infer, in the absence of any other information, is that the outlier characteristic of the data (or pdf) is less extreme than that of a normal distribution. But you will have no idea whatsoever as to what is the shape of the peak, without actually drawing the histogram (or pdf).
The logic for why the kurtosis statistic measures outliers (rare, extreme observations in the case of data; potential rare, extreme observations in the case of a pdf) rather than the peak is actually quite simple. Kurtosis is the average (or expected value in the case of the pdf) of the Z-values, each taken to the 4th power. In the case where there are (potential) outliers, there will be some extremely large Z^4 values, giving a high kurtosis. If there are fewer outliers than, say, predicted by a normal pdf, then the most extreme Z^4 values will not be particularly large, giving smaller kurtosis.
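The "average of Z^4" description can be illustrated with a small made-up data set: 99 evenly spread values, then the same values plus one extreme observation. This is only a sketch. The data and names are hypothetical, and Z-values are computed with the divisor-n standard deviation.

```python
def z_fourth_powers(data):
    """Return Z^4 for each observation, with Z = (x - mean) / sd (divisor n)."""
    n = len(data)
    mean = sum(data) / n
    sd = (sum((x - mean) ** 2 for x in data) / n) ** 0.5
    return [((x - mean) / sd) ** 4 for x in data]

bulk = [i / 10 for i in range(-49, 50)]   # 99 made-up values from -4.9 to 4.9
with_outlier = bulk + [100.0]             # same data plus one extreme value

z4_bulk = z_fourth_powers(bulk)
z4_out = z_fourth_powers(with_outlier)

kurtosis_bulk = sum(z4_bulk) / len(z4_bulk)   # kurtosis = average of Z^4
kurtosis_out = sum(z4_out) / len(z4_out)
outlier_share = max(z4_out) / sum(z4_out)     # share of the total from one point

print(kurtosis_bulk, kurtosis_out, outlier_share)
```

In this example the single extreme value accounts for nearly all of the Z^4 total, while the other 99 observations contribute almost nothing to it.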
What of the peak? Well, near the peak, the Z^4 values are extremely small and contribute very little to their overall average (which again, is the kurtosis). That is why kurtosis tells you virtually nothing about the shape of the peak. I give mathematical bounds on the contribution of the data near the peak to the kurtosis measure in the following article:
Westfall, P. H. (2014), “Kurtosis as Peakedness, 1905 – 2014. R.I.P.,” The American Statistician, 68, 191–195.
I hope this helps.
Peter Westfall
P.S. The height of the peak is also unrelated to kurtosis; see Kaplansky, I. (1945), “A Common Error Concerning Kurtosis,” Journal of the American Statistical Association, 40, 259. But the “height” misinterpretation also seems to persist.