We use a notation that applies equally to discrete and continuous distributions.
- A distribution function, or cumulative distribution function, is denoted
by a capital letter, e.g. F(x). It must satisfy:
  - F(x) must exist for all but a countable number of values of x
  - F(-inf) = 0, F(+inf) = 1
  - F(x) must increase monotonically with x
- If F(x) is differentiable, its derivative is denoted f(x) and is called a frequency
function or probability density function (pdf). We have dF =
(dF(x)/dx) dx = f(x) dx.
- A local maximum of f(x) is a mode.
- j = sqrt(-1)
Properties of Distributions
The characteristic function of a distribution is the conjugate of the Fourier transform of its pdf:
g(t) = Integral ( exp(jtx) dF, x=-inf...+inf). For discrete
distributions it is
g(t) = Sum ( p(x_k) exp(jtx_k) ) over all k.
The usefulness of characteristic functions arises because the characteristic function
of the sum of two independent random variables equals the product of the two
characteristic functions concerned.
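As a numerical sketch of this property, the discrete form g(t) = Sum( p(x_k) exp(jtx_k) ) can be checked for the sum of two fair dice (the dice are an illustrative choice, not from the text):

```python
# Check: the characteristic function of the sum of two independent discrete
# random variables equals the product of their characteristic functions.
import cmath

def char_fn(pmf, t):
    """g(t) = Sum_k p(x_k) exp(j t x_k)."""
    return sum(p * cmath.exp(1j * t * x) for x, p in pmf.items())

die = {x: 1/6 for x in range(1, 7)}            # pmf of one fair die

# pmf of the sum of two independent dice, by direct convolution
total = {}
for x, px in die.items():
    for y, py in die.items():
        total[x + y] = total.get(x + y, 0) + px * py

t = 0.7                                        # arbitrary test point
g_sum = char_fn(total, t)                      # g of the summed variable
g_prod = char_fn(die, t) * char_fn(die, t)     # product of the two g's
assert abs(g_sum - g_prod) < 1e-12
```

The same check works at any value of t, since the two functions agree identically.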
The moments of a distribution (about the origin) are given by m'_i =
Integral ( x^i dF, x=-inf...+inf) = (-j)^i d^i g/dt^i evaluated at t=0.
- m'_0 always equals 1.
- m'_1 equals the mean of the distribution and, if the integral
converges, is denoted by m.
- m'_i equals the coefficient of (jt)^i/i! in the power series expansion of
the characteristic function g(t).
- m'_i >= 0 for all even i.
The moments about the mean are given by m_i = Integral ( (x-m)^i dF, x=-inf...+inf).
- m_0 always equals 1.
- m_1 always equals 0 providing the integral converges.
- m_2 is the variance, v. The standard deviation, s, equals sqrt(v).
- The skewness is defined as m_3/s^3.
- The kurtosis is defined as (m_4/s^4 - 3). The sign determines
whether a distribution is platykurtic (<0), mesokurtic (=0) or leptokurtic
(>0). Relative to a gaussian, platykurtic distributions are generally less peaky and
leptokurtic distributions more peaky.
- m_i >= 0 for all even i.
- If m_k exists, then so do all m_i for i<k.
The moments about an arbitrary point, x, may be obtained by formally expanding
(m* + (m-x))^i and then replacing each power m*^k by m_k.
- In particular, if x=0:
  - m'_2 = m_2 + m^2
  - m'_3 = m_3 + 3m m_2 + m^3
  - m'_4 = m_4 + 4m m_3 + 6m^2 m_2 + m^4
- The inverse relationships are:
  - m_2 = m'_2 - m^2
  - m_3 = m'_3 - 3m m'_2 + 2m^3
  - m_4 = m'_4 - 4m m'_3 + 6m^2 m'_2 - 3m^4
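A small numerical sketch of the relations above, using the unit exponential distribution as a test case (an illustrative assumption; its raw moments are m'_i = i!):

```python
# Convert raw moments to central moments and back, for Exp(1).
from math import factorial

mp = {i: factorial(i) for i in range(5)}   # m'_0..m'_4 for Exp(1)
m = mp[1]                                  # the mean

# inverse relationships: central moments from raw moments
m2 = mp[2] - m**2
m3 = mp[3] - 3*m*mp[2] + 2*m**3
m4 = mp[4] - 4*m*mp[3] + 6*m**2*mp[2] - 3*m**4
assert (m2, m3, m4) == (1, 2, 9)           # known central moments of Exp(1)

# and the forward relations recover the raw moments
assert mp[2] == m2 + m**2
assert mp[3] == m3 + 3*m*m2 + m**3
assert mp[4] == m4 + 4*m*m3 + 6*m**2*m2 + m**4
```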
The r'th cumulant of a distribution, k_r, is the coefficient of (jt)^r/r!
in the power series expansion of the natural logarithm of the characteristic function, i.e. of
ln(Integral ( exp(jtx) dF, x=-inf...+inf)). It may be
obtained from the characteristic function as k_r = (-j)^r d^r ln(g(t))/dt^r evaluated at t=0.
The cumulants are related to the moments as follows:
- m = k_1
- m_2 = k_2 = v = s^2
- m_3 = k_3
- m_4 = k_4 + 3k_2^2
- m_5 = k_5 + 10k_3k_2
- m_6 = k_6 + 15k_4k_2 + 10k_3^2 + 15k_2^3
The formula for m_r contains all terms of the form k_a^A
* k_b^B * k_c^C * ... where aA+bB+cC+... = r and
A,B,C,... are all >= 1 and 2 <= a,b,c,... <= r. The coefficient of a general
term is r!/(A! * (a!)^A * B! * (b!)^B * C! * (c!)^C * ...).
The inverse relationships are:
- k_1 = m
- k_2 = m_2 = v = s^2
- k_3 = m_3
- k_4 = m_4 - 3m_2^2
- k_5 = m_5 - 10m_3m_2
- k_6 = m_6 - 15m_4m_2 - 10m_3^2 + 30m_2^3
The formula for k_r contains all terms of the form m_a^A
* m_b^B * m_c^C * ... where aA+bB+cC+... = r and
A,B,C,... are all >= 1 and 2 <= a,b,c,... <= r. The coefficient of a general term
is (-1)^(q-1) * (q-1)! * r!/(A! * (a!)^A * B! * (b!)^B * C! * (c!)^C
* ...) where q = A+B+C+.... The two sets of coefficients are the same when q=1 and the same
but for their sign when q=2.
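A numerical sketch of the moment/cumulant relations, using a Poisson distribution with rate a as the test case (an illustrative choice, convenient because all of its cumulants equal a). Central moments are computed directly from the pmf:

```python
from math import exp, factorial

a = 1.7
pmf = {r: exp(-a) * a**r / factorial(r) for r in range(120)}
mean = sum(r * p for r, p in pmf.items())          # should equal k_1 = a

def central_moment(i):
    return sum((r - mean)**i * p for r, p in pmf.items())

k = a                                              # k_r = a for every r >= 1
assert abs(mean - k) < 1e-9                                    # m = k_1
assert abs(central_moment(2) - k) < 1e-9                       # m_2 = k_2
assert abs(central_moment(3) - k) < 1e-9                       # m_3 = k_3
assert abs(central_moment(4) - (k + 3*k*k)) < 1e-9             # m_4 = k_4 + 3k_2^2
assert abs(central_moment(5) - (k + 10*k*k)) < 1e-8            # m_5 = k_5 + 10k_3k_2
assert abs(central_moment(6) - (k + 25*k*k + 15*k**3)) < 1e-8  # m_6 = k_6 + 15k_4k_2 + 10k_3^2 + 15k_2^3
```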
We can also define the normalised cumulants g_r = k_r/s^r:
- Skewness: g_3 = k_3/s^3 = m_3/s^3
- Kurtosis: g_4 = k_4/s^4 = m_4/s^4 - 3
Chebyshev Inequality: Pr(|X-m| >= d) = 1 - F(m+d)
+ F(m-d) <= (s/d)^2 gives a rather
weak bound on the sum of the two tail probabilities.
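A quick numerical illustration of the bound, using a unit exponential distribution (an illustrative assumption) whose cdf F(x) = 1 - exp(-x) and mean and standard deviation both equal 1:

```python
from math import exp

def F(x):                            # cdf of Exp(1)
    return 1 - exp(-x) if x > 0 else 0.0

m = s = 1.0
for d in (1.5, 2.0, 3.0):
    tail = 1 - F(m + d) + F(m - d)   # Pr(|X - m| >= d)
    assert tail <= (s / d)**2        # Chebyshev bound holds
```

For this distribution the bound is quite loose: at d = 2 the true tail probability is about 0.05 against a bound of 0.25.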
Linear Transformation: Suppose Y = aX+b (with a>0) where X has a pdf f(x) = dF(x)/dx with mean m,
standard deviation s and a characteristic function g(t). Then:
- Y has mean am+b and standard deviation as
- The pdf of Y is f((y-b)/a)/a
- The cdf of Y is F((y-b)/a)
- The characteristic function of Y is e^(jbt) g(at)
- The cumulants of the two distributions are related by
k_r(Y) = a^r k_r(X) for r>1, so the normalised cumulants satisfy g_r(Y) = g_r(X)
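A quick numerical check of the mean and standard-deviation rules for Y = aX + b, using a small discrete distribution for X (the pmf and the values of a and b are illustrative assumptions):

```python
pmf = {0: 0.25, 1: 0.5, 2: 0.25}                   # pmf of X
a, b = 3.0, 2.0

mX = sum(x * p for x, p in pmf.items())            # mean of X
vX = sum((x - mX)**2 * p for x, p in pmf.items())  # variance of X

pmfY = {a*x + b: p for x, p in pmf.items()}        # induced pmf of Y = aX + b
mY = sum(y * p for y, p in pmfY.items())
vY = sum((y - mY)**2 * p for y, p in pmfY.items())

assert abs(mY - (a*mX + b)) < 1e-12                # mean of Y is am + b
assert abs(vY - a*a*vX) < 1e-12                    # variance of Y is (as)^2
```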
The identities below are expressed in terms of discrete distributions. They
also work for continuous distributions with sums replaced by integrals. Sx()
denotes the sum over all values of x.
- p(x,y) = p(y|x) p(x)
- p(y|x) = p(x|y) p(y) / p(x).
This is Bayes' rule.
- p(x) = Sy( p(x,y) )
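A sketch verifying these identities on a small hypothetical joint distribution p(x,y); the numbers are made up for illustration:

```python
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

def p_x(x):                                   # marginal: p(x) = Sy( p(x,y) )
    return sum(p for (xi, y), p in joint.items() if xi == x)

def p_y(y):
    return sum(p for (x, yi), p in joint.items() if yi == y)

for (x, y), pxy in joint.items():
    p_y_given_x = pxy / p_x(x)                # from p(x,y) = p(y|x) p(x)
    p_x_given_y = pxy / p_y(y)
    # Bayes' rule: p(y|x) = p(x|y) p(y) / p(x)
    assert abs(p_y_given_x - p_x_given_y * p_y(y) / p_x(x)) < 1e-12
```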
In these distributions, the variable r takes integer values.
Binomial Distribution
- p(r) = a^r (1-a)^(n-r) n! / (r! (n-r)!)
for 0 <= r <= n where a is a constant with 0 <= a <= 1.
- Characteristic function g(t) = (1 - a + a e^(jt))^n
- m = na, v = na(1-a), skewness = (1-2a)/sqrt(na(1-a)), kurtosis = (1-6a(1-a))/(na(1-a))
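A sketch checking the binomial mean, variance and skewness formulas against values computed directly from the pmf (the choices n = 12, a = 0.3 are illustrative):

```python
from math import comb, sqrt

n, a = 12, 0.3
pmf = {r: comb(n, r) * a**r * (1 - a)**(n - r) for r in range(n + 1)}

mean = sum(r * p for r, p in pmf.items())
var = sum((r - mean)**2 * p for r, p in pmf.items())
m3 = sum((r - mean)**3 * p for r, p in pmf.items())

assert abs(mean - n*a) < 1e-9                                    # m = na
assert abs(var - n*a*(1 - a)) < 1e-9                             # v = na(1-a)
assert abs(m3 / var**1.5 - (1 - 2*a) / sqrt(n*a*(1 - a))) < 1e-9 # skewness
```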
Poisson Distribution
- p(r) = e^(-a) a^r / r!
- Characteristic function g(t) = exp(a(e^(jt) - 1))
- k_r = a for all r >= 1
- m = a, v = a, skewness = a^(-1/2),
kurtosis = a^(-1)
Cauchy Distribution
- f(x) = 1/(Pi (1 + x^2))
- Characteristic function g(t) = exp(-|t|)
- Mean = Undefined
- Variance = Undefined
- Skewness = Undefined
- Kurtosis = Undefined
Chi-squared Distribution
This is the distribution of the sum of the squares of n independent standard gaussian random
variables. If Y = ½X, then Y has a gamma distribution with parameter p = ½n.
- f(x) = 2^(-½n) x^(½n-1) e^(-½x) / (½n-1)! for x >= 0. Parameter n >= 0.
- [n even] F(x) = 1 - e^(-½x) Sum( (½x)^k / k! ; k=0..½n-1 )
- Characteristic function g(t) = (1-2jt)^(-½n)
- Cumulants: k_r = n 2^(r-1) (r-1)!
- Mean = n
- Variance = 2n
- Skewness = sqrt(8/n)
- Kurtosis = 12/n
- [n=2] X has an exponential distribution, f(x) = ½e^(-½x),
and Y = sqrt(X) has a Rayleigh distribution.
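A sketch checking the even-n closed form F(x) = 1 - e^(-½x) Sum( (½x)^k / k! ; k=0..½n-1 ) against a crude numerical integration of the pdf (n = 4 and x = 5 are illustrative choices):

```python
from math import exp, gamma

n = 4
def pdf(x):                     # f(x) = 2^(-n/2) x^(n/2-1) e^(-x/2) / (n/2-1)!
    return 2**(-n/2) * x**(n/2 - 1) * exp(-x/2) / gamma(n/2)

def cdf_closed(x):              # F(x) = 1 - e^(-x/2) Sum( (x/2)^k / k! )
    return 1 - exp(-x/2) * sum((x/2)**k / gamma(k + 1) for k in range(n//2))

x = 5.0
steps = 200_000
h = x / steps
integral = sum(pdf((i + 0.5) * h) for i in range(steps)) * h   # midpoint rule
assert abs(integral - cdf_closed(x)) < 1e-6
```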
Non-central Chi-squared Distribution
This is the distribution of the sum of the squares of n independent gaussian random
variables with unit variances and non-zero means. The non-centrality parameter d
is the sum of the squares of the means [some people call this d^2].
- f(x) = 2^(-½n) e^(-½(x+d)) Sum( (¼d)^k x^(½n+k-1) / (k! (½n+k-1)!) ; k=0..infinity ) for x >= 0. Parameters n, d >= 0.
- Mean = n+d
- Variance = 2n+4d
- Skewness = sqrt( 8(n+3d)^2 (n+2d)^(-3) )
- Kurtosis = 12(n+4d)(n+2d)^(-2)
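A sketch checking that the series pdf with coefficients (¼d)^k / (k! (½n+k-1)!) integrates to 1 and has mean n + d, using a truncated series and a crude numerical integration (n = 3, d = 2 are illustrative choices):

```python
from math import exp, gamma

n, d = 3, 2.0
# series coefficients (d/4)^k / (k! * Gamma(n/2 + k)), truncated at 60 terms
coef = [(d/4)**k / (gamma(k + 1) * gamma(n/2 + k)) for k in range(60)]

def pdf(x):
    series = sum(c * x**(n/2 + k - 1) for k, c in enumerate(coef))
    return 2**(-n/2) * exp(-0.5*(x + d)) * series

steps, hi = 20_000, 50.0
h = hi / steps
mass = mean = 0.0
for i in range(steps):           # midpoint rule on [0, 50]
    x = (i + 0.5) * h
    f = pdf(x)
    mass += f * h
    mean += x * f * h

assert abs(mass - 1.0) < 1e-3    # pdf integrates to 1
assert abs(mean - (n + d)) < 1e-2
```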
Exponential Distribution
- f(x) = exp(-x) for x >= 0.
- Mean = 1
- Variance = 1
- Skewness = 2
- Kurtosis = 6
Fisher's F Distribution
Fisher's z Distribution
Gamma Distribution (Pearson Type III Distribution)
- f(x) = x^(p-1) exp(-x) / (p-1)! for x >= 0. Parameter p > 0.
- Mean = p
- Variance = p
- Skewness = 2/sqrt(p)
- Kurtosis = 6/p
Gaussian or Normal Distribution
- f(x) = (2 Pi)^(-½) exp(-½x^2)
- Characteristic function g(t) = exp(-½t^2)
- Mode = 0, Mean = 0, Variance = 1, Skewness = 0, Kurtosis = 0
- Moments: m_i = i!/(2^(i/2) (i/2)!)
for even i and m_i = 0 for odd i
- Cumulants: k_1 = 0, k_2 = 1 and k_i = 0 for i > 2
Laplace Distribution
- f(x) = ½exp(-|x|)
- Mean = 0
- Variance = 2
- Skewness = 0
- Kurtosis = 3
Lognormal Distribution
This is a distribution such that ln(x) has a gaussian distribution with mean a
and standard deviation b.
- f(x) = (2 Pi)^(-½) exp(-½((ln(x)-a)/b)^2) / (bx) for x > 0
- Mode: exp(a - b^2)
- Median: exp(a)
- Mean: exp(a + ½b^2)
- Variance: exp(2a + b^2) (exp(b^2) - 1)
- Moments about 0: m'_r = exp(ra + ½r^2b^2)
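A sketch checking the lognormal mean and variance formulas by numerically integrating the pdf (the values a = 0.2, b = 0.5 are illustrative):

```python
from math import exp, log, pi, sqrt

a, b = 0.2, 0.5
def pdf(x):
    return exp(-0.5 * ((log(x) - a) / b)**2) / (b * x * sqrt(2*pi))

steps, hi = 200_000, 40.0              # midpoint rule on (0, 40]
h = hi / steps
xs = [(i + 0.5) * h for i in range(steps)]
mean = sum(x * pdf(x) for x in xs) * h
m2 = sum(x * x * pdf(x) for x in xs) * h                 # m'_2

assert abs(mean - exp(a + 0.5*b*b)) < 1e-3               # mean formula
assert abs((m2 - mean**2) - exp(2*a + b*b) * (exp(b*b) - 1)) < 1e-3
```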
Nakagami-m Distribution
- f(x) = 2(k/w)^k x^(2k-1) exp(-kx^2/w) / Gamma(k) for x >= 0
- m'_2 = w
- m'_r = (w/k)^(r/2) Gamma(k + r/2) / Gamma(k)
Rayleigh Distribution
This distribution arises in communications theory as the magnitude of a component of
the Fourier transform of white noise.
- f(x) = x exp(-½x^2) for x >= 0.
- F(x) = 1 - exp(-½x^2)
- Mean = sqrt(½Pi)
- Variance = 2 - ½Pi
- Skewness = (Pi-3) sqrt(4Pi/(4-Pi)^3) = 0.631..
- Kurtosis = (80-24Pi)/(4-Pi)^2 - 6 = 0.245..
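A sketch confirming the decimal values quoted above for the Rayleigh skewness and kurtosis from their closed forms:

```python
from math import pi, sqrt

skew = (pi - 3) * sqrt(4*pi / (4 - pi)**3)   # = 0.631..
kurt = (80 - 24*pi) / (4 - pi)**2 - 6        # = 0.245..

assert abs(skew - 0.631) < 1e-3
assert abs(kurt - 0.245) < 1e-3
```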
Rectangular (Uniform) Distribution
- f(x) = 1 for -½ < x < ½
- F(x) = ½ + x for -½ < x < ½
- Mean = 0
- Variance = 1/12 = 0.08333
- Skewness = 0
- Kurtosis = -1.2
Rician Distribution
This distribution arises in communications theory as the magnitude of the Fourier
transform of a cosine wave (of amplitude A) corrupted by additive white noise.
- f(x) = x exp(-½(x^2 + A^2)) I0(xA) for x >= 0, where I0() is the modified Bessel function of order 0.
Student's t Distribution
Multivariate Gaussian Distribution
If x is an n-dimensional multivariate gaussian with mean m and covariance
matrix S then
- its pdf is given by (2 Pi)^(-n/2) |S|^(-1/2) exp( -½(x-m)^T S^(-1) (x-m) )
- its characteristic function is g(t) = exp( jt^T m - ½t^T S t )
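A sketch evaluating the multivariate gaussian pdf in the 2-dimensional case, checking that with an identity covariance S it reduces to a product of two univariate standard gaussian pdfs (the dimensions and test values are illustrative):

```python
from math import exp, pi, sqrt

def mvn_pdf_2d(x, m, S):
    """(2 Pi)^(-n/2) |S|^(-1/2) exp(-1/2 (x-m)^T S^-1 (x-m)) with n = 2."""
    det = S[0][0]*S[1][1] - S[0][1]*S[1][0]
    inv = [[ S[1][1]/det, -S[0][1]/det],         # 2x2 matrix inverse
           [-S[1][0]/det,  S[0][0]/det]]
    d = [x[0] - m[0], x[1] - m[1]]
    quad = (d[0]*(inv[0][0]*d[0] + inv[0][1]*d[1])
            + d[1]*(inv[1][0]*d[0] + inv[1][1]*d[1]))
    return exp(-0.5*quad) / (2*pi*sqrt(det))

def std_normal_pdf(x):
    return exp(-0.5*x*x) / sqrt(2*pi)

x, m = [0.3, -1.1], [0.0, 0.0]
eye = [[1.0, 0.0], [0.0, 1.0]]
assert abs(mvn_pdf_2d(x, m, eye)
           - std_normal_pdf(x[0]) * std_normal_pdf(x[1])) < 1e-12
```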