# Probability Distributions

## Notation

We use a notation that applies equally to discrete and continuous distributions.

• A distribution function, or cumulative distribution function, is denoted by a capital letter, e.g. F(x). It must satisfy:
  • F(x) must be defined for all x and continuous except at a countable number of values of x
  • F(-inf) = 0, F(+inf) = 1
  • F(x) must be monotonically non-decreasing in x
• If F(x) is differentiable, its derivative is denoted f(x) and is called a frequency function or probability density function (pdf). We have dF = dF(x)/dx * dx = f(x)dx.
• A local maximum of f(x) is a mode.
• j = sqrt(-1)

## Properties of Distributions

### Characteristic Function

The characteristic function of a distribution is the conjugate of the Fourier transform of its pdf: g(t) = Integral(exp(jtx) dF, x=-inf..+inf). For a discrete distribution it is g(t) = Sum(p(x_k) exp(jtx_k)) over all values x_k.

The usefulness of characteristic functions arises because the characteristic function of the sum of two independent random variables equals the product of the two characteristic functions concerned.
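As a quick numerical check of this product property, the sketch below (plain standard-library Python; the distributions and names are illustrative choices, not from the source) builds the pmf of the sum of two independent fair dice by convolution and compares characteristic functions:

```python
import cmath

def char_fn(pmf, t):
    """g(t) = Sum over k of p(x_k) exp(j t x_k)."""
    return sum(p * cmath.exp(1j * t * x) for x, p in pmf.items())

def convolve(pmf1, pmf2):
    """pmf of the sum of two independent discrete random variables."""
    out = {}
    for x1, p1 in pmf1.items():
        for x2, p2 in pmf2.items():
            out[x1 + x2] = out.get(x1 + x2, 0.0) + p1 * p2
    return out

die = {k: 1 / 6 for k in range(1, 7)}   # a fair die, an arbitrary example
two_dice = convolve(die, die)

t = 0.7                                 # any test point works
g_sum = char_fn(two_dice, t)
g_prod = char_fn(die, t) * char_fn(die, t)
assert abs(g_sum - g_prod) < 1e-12
```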

### Moments

The moments of a distribution (about the origin) are given by m'_i = Integral(x^i dF, x=-inf..+inf) = (-j)^i (d^i g/dt^i)|t=0 = (d^i g/d(jt)^i)|t=0.

• m'_0 always equals 1.
• m'_1 equals the mean of the distribution and, if the integral converges, is denoted by m.
• m'_i equals the coefficient of (jt)^i/i! in the power series expansion of the characteristic function g(t).
• m'_i >= 0 for all even i.

The moments about the mean are given by m_i = Integral((x-m)^i dF, x=-inf..+inf).

• m_0 always equals 1.
• m_1 always equals 0 provided the integral converges.
• m_2 is the variance, v. The standard deviation, s, equals sqrt(v).
• The skewness is defined as m_3/s^3.
• The kurtosis is defined as m_4/s^4 - 3. The sign determines whether a distribution is platykurtic (<0), mesokurtic (=0) or leptokurtic (>0). Relative to a gaussian, platykurtic distributions are generally less peaky and leptokurtic distributions more peaky.
• m_i >= 0 for all even i.
• If m_k exists, then so do all m_i for i<k.

The moments about an arbitrary point x may be obtained by formally expanding m_i(x) = (m* + (m-x))^i and then replacing each power m*^i by the central moment m_i.

• In particular, if x=0:
  • m'_2 = m_2 + m^2
  • m'_3 = m_3 + 3m m_2 + m^3
  • m'_4 = m_4 + 4m m_3 + 6m^2 m_2 + m^4
• The inverse relationships are:
  • m_2 = m'_2 - m^2
  • m_3 = m'_3 - 3m m'_2 + 2m^3
  • m_4 = m'_4 - 4m m'_3 + 6m^2 m'_2 - 3m^4
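These conversions are easy to verify numerically; the sketch below (standard-library Python, with an arbitrary illustrative distribution of my own choosing) checks the x=0 relations for a small discrete pmf:

```python
def raw_moment(pmf, i):
    """m'_i: moment about the origin of a discrete distribution."""
    return sum(p * x**i for x, p in pmf.items())

def central_moment(pmf, i):
    """m_i: moment about the mean."""
    m = raw_moment(pmf, 1)
    return sum(p * (x - m)**i for x, p in pmf.items())

pmf = {0: 0.2, 1: 0.5, 3: 0.3}   # an arbitrary illustrative distribution
m = raw_moment(pmf, 1)

# m'_2 = m_2 + m^2
assert abs(raw_moment(pmf, 2) - (central_moment(pmf, 2) + m**2)) < 1e-12
# m_3 = m'_3 - 3 m m'_2 + 2 m^3
assert abs(central_moment(pmf, 3)
           - (raw_moment(pmf, 3) - 3*m*raw_moment(pmf, 2) + 2*m**3)) < 1e-12
# m_4 = m'_4 - 4 m m'_3 + 6 m^2 m'_2 - 3 m^4
assert abs(central_moment(pmf, 4)
           - (raw_moment(pmf, 4) - 4*m*raw_moment(pmf, 3)
              + 6*m**2*raw_moment(pmf, 2) - 3*m**4)) < 1e-12
```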

### Cumulants

The r'th cumulant of a distribution, k_r, is the coefficient of (jt)^r/r! in the power series expansion of the natural logarithm of the characteristic function, i.e. of ln(Integral(exp(jtx) dF, x=-inf..+inf)). It may be obtained from the characteristic function as k_r = (-j)^r (d^r ln(g)/dt^r)|t=0 = (d^r ln(g)/d(jt)^r)|t=0.

The cumulants are related to the moments as follows:

• m = k_1
• m_2 = k_2 = v = s^2
• m_3 = k_3
• m_4 = k_4 + 3k_2^2
• m_5 = k_5 + 10k_3 k_2
• m_6 = k_6 + 15k_4 k_2 + 10k_3^2 + 15k_2^3

The formula for m_r contains all terms of the form k_a^A k_b^B k_c^C ... where aA + bB + cC + ... = r, the A,B,C,... are all >= 1 and 2 <= a,b,c,... <= r. The coefficient of a general term is r!/(A! (a!)^A B! (b!)^B C! (c!)^C ...).

The inverse relationships are

• k_1 = m
• k_2 = m_2 = v = s^2
• k_3 = m_3
• k_4 = m_4 - 3m_2^2
• k_5 = m_5 - 10m_3 m_2
• k_6 = m_6 - 15m_4 m_2 - 10m_3^2 + 30m_2^3

The formula for k_r contains all terms of the form m_a^A m_b^B m_c^C ... where aA + bB + cC + ... = r, the A,B,C,... are all >= 1 and 2 <= a,b,c,... <= r. The coefficient of a general term is (-1)^(k-1) (k-1)! r!/(A! (a!)^A B! (b!)^B C! (c!)^C ...) where k = A+B+C+.... The two sets of coefficients are the same when k=1 and the same but for their sign when k=2.

We can also define the normalised cumulants g_r = k_r/s^r:

• Skewness: g_3 = k_3/s^3 = m_3/s^3
• Kurtosis: g_4 = k_4/s^4 = m_4/s^4 - 3
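The Poisson distribution, whose cumulants all equal its parameter a, makes a convenient numerical test case for the moment-cumulant relations. The sketch below (standard-library Python; the truncation point of the tail sum is an arbitrary choice) evaluates Poisson central moments directly from the pmf:

```python
import math

def poisson_pmf(a, r):
    return math.exp(-a) * a**r / math.factorial(r)

def poisson_central_moment(a, i, rmax=100):
    """Central moment m_i of a Poisson(a) variable, truncating the tail sum."""
    return sum(poisson_pmf(a, r) * (r - a)**i for r in range(rmax))

a = 2.0            # for a Poisson distribution every cumulant k_r equals a
k = a
assert abs(poisson_central_moment(a, 2) - k) < 1e-9             # m_2 = k_2
assert abs(poisson_central_moment(a, 3) - k) < 1e-9             # m_3 = k_3
assert abs(poisson_central_moment(a, 4) - (k + 3*k**2)) < 1e-9  # m_4 = k_4 + 3 k_2^2
assert abs(poisson_central_moment(a, 5) - (k + 10*k*k)) < 1e-9  # m_5 = k_5 + 10 k_3 k_2
assert abs(poisson_central_moment(a, 6)
           - (k + 15*k*k + 10*k*k + 15*k**3)) < 1e-8            # m_6 relation
```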

## Bounds

Chebyshev Inequality: Pr(|X-m| >= d) = 1 - F(m+d) + F(m-d) <= (s/d)^2 gives a rather weak bound on the sum of the two tail probabilities.
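A quick empirical illustration of how loose the bound is (a Python sketch using exponential samples, for which m = s = 1; any distribution with finite variance would do):

```python
import random

random.seed(0)
n = 100_000
xs = [random.expovariate(1.0) for _ in range(n)]  # exponential: m = s = 1
m, s = 1.0, 1.0

d = 2.5
tail = sum(abs(x - m) >= d for x in xs) / n       # empirical two-tail probability
bound = (s / d) ** 2                              # Chebyshev bound = 0.16
assert tail <= bound                              # the true tail (~0.03) is far below it
```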

## Transforming Distributions

Linear Transformation: Suppose Y = aX + b, where a > 0 and X has a pdf f(x) = dF(x)/dx with mean m, standard deviation s and characteristic function g(t). Then:

• Y has mean am+b and standard deviation as
• The pdf of Y is f((y-b)/a)/a
• The cdf of Y is F((y-b)/a)
• The characteristic function of Y is exp(jbt) g(at)
• The cumulants of the two distributions are related by
  • k_1(Y) = a k_1(X) + b
  • k_r(Y) = a^r k_r(X) for r > 1
• The normalised cumulants satisfy
  • g_r(Y) = g_r(X) for r > 1
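The first two properties can be checked directly on samples, since the sample mean and sample spread obey the same linear transformation exactly (a standard-library Python sketch; a and b are arbitrary illustrative values):

```python
import random

random.seed(1)
a, b = 3.0, -2.0                       # illustrative transform parameters
xs = [random.gauss(0.0, 1.0) for _ in range(10_000)]
ys = [a * x + b for x in xs]           # Y = aX + b

def mean(v):
    return sum(v) / len(v)

def std(v):
    m = mean(v)
    return (sum((x - m)**2 for x in v) / len(v)) ** 0.5

assert abs(mean(ys) - (a * mean(xs) + b)) < 1e-9   # mean maps to am + b
assert abs(std(ys) - a * std(xs)) < 1e-9           # spread scales by a
```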

## Probability Identities

The identities below are expressed in terms of discrete distributions. They also work for continuous distributions with the sums replaced by integrals. S_x() denotes the sum over all values of x.

• p(x,y) = p(y|x)p(x) = p(x|y)p(y)
• p(y|x) = p(x|y)p(y)/p(x). This is Bayes' rule.
• p(x) = S_y(p(x,y))
• p(x|y) = S_z(p(x,z|y)) = S_z(p(x|y,z)p(z|y))
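These identities can be checked mechanically on any small joint distribution; the sketch below (standard-library Python, with a made-up joint table) verifies Bayes' rule for every cell:

```python
# an arbitrary illustrative joint distribution p(x, y)
joint = {('a', 0): 0.1, ('a', 1): 0.3, ('b', 0): 0.2, ('b', 1): 0.4}

def p_x(x):
    return sum(p for (xi, _), p in joint.items() if xi == x)   # marginal over y

def p_y(y):
    return sum(p for (_, yi), p in joint.items() if yi == y)   # marginal over x

def p_y_given_x(y, x):
    return joint[(x, y)] / p_x(x)

def p_x_given_y(x, y):
    return joint[(x, y)] / p_y(y)

# Bayes' rule: p(y|x) = p(x|y) p(y) / p(x), checked at every point
for (x, y) in joint:
    lhs = p_y_given_x(y, x)
    rhs = p_x_given_y(x, y) * p_y(y) / p_x(x)
    assert abs(lhs - rhs) < 1e-12
```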

## Discrete Distributions

In these distributions, the variable r takes integer values.

### Binomial Distribution

• p(r) = a^r (1-a)^(n-r) n!/(r!(n-r)!) for 0 <= r <= n, where a is a constant with 0 <= a <= 1.
• Characteristic function g(t) = (1 - a + a exp(jt))^n
• m = na, v = na(1-a), skewness = (1-2a)/sqrt(na(1-a)), kurtosis = (1-6a(1-a))/(na(1-a))
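The moment formulae can be confirmed by summing directly over the pmf (standard-library Python; n and a are arbitrary illustrative values):

```python
import math

def binom_pmf(n, a, r):
    """Binomial point probability a^r (1-a)^(n-r) n!/(r!(n-r)!)."""
    return math.comb(n, r) * a**r * (1 - a)**(n - r)

n, a = 10, 0.3                        # illustrative parameter values
pmf = [binom_pmf(n, a, r) for r in range(n + 1)]
mean = sum(r * p for r, p in enumerate(pmf))
var = sum((r - mean)**2 * p for r, p in enumerate(pmf))
skew = sum((r - mean)**3 * p for r, p in enumerate(pmf)) / var**1.5

assert abs(mean - n * a) < 1e-12
assert abs(var - n * a * (1 - a)) < 1e-12
assert abs(skew - (1 - 2*a) / math.sqrt(n * a * (1 - a))) < 1e-12
```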

### Poisson Distribution

• p(r) = exp(-a) a^r/r! for r >= 0
• Characteristic function g(t) = exp(a(exp(jt)-1))
• k_r = a for all r >= 1
• m = a, v = a, skewness = a^(-½), kurtosis = a^(-1)

## Continuous Distributions

### Cauchy Distribution

• f(x) = 1/(Pi (1+x^2))
• F(x) = ½ + tan^(-1)(x)/Pi
• Characteristic function g(t) = exp(-|t|)
• Mode=0
• Mean = Undefined
• Variance = Undefined
• Skewness = Undefined
• Kurtosis = Undefined

### Chi-Squared Distribution

This is the distribution of the sum of the squares of n independent standard gaussian random variables. If Y=½X, then Y has a gamma distribution with parameter p=½n.

• f(x) = 2^(-½n) x^(½n-1) exp(-½x)/(½n-1)! for x >= 0. Parameter n > 0.
• [n even] F(x) = 1 - C exp(-½x) where C = sum((½x)^k/k!, k=0..(½n-1))
• Characteristic function g(t) = (1-2jt)^(-½n)
• Cumulants: k_r = n 2^(r-1) (r-1)!
• Mode=n-2
• Mean = n
• Variance = 2n
• Skewness = sqrt(8/n)
• Kurtosis = 12/n
• [n=2] X has an exponential distribution, f(x) = ½exp(-½x), and Y = sqrt(X) has a Rayleigh distribution.
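The defining construction can be checked by simulation: summing the squares of n standard gaussians should reproduce the stated mean n and variance 2n (a standard-library Python sketch; the sample size and tolerances are arbitrary choices):

```python
import random

random.seed(2)
n, N = 4, 100_000
# chi-squared sample: sum of squares of n independent standard gaussians
xs = [sum(random.gauss(0.0, 1.0)**2 for _ in range(n)) for _ in range(N)]
mean = sum(xs) / N
var = sum((x - mean)**2 for x in xs) / N
assert abs(mean - n) < 0.1       # theoretical mean = n
assert abs(var - 2 * n) < 0.5    # theoretical variance = 2n
```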

### Non-central Chi-squared Distribution

This is the distribution of the sum of the squares of n independent gaussian random variables with unit variances and non-zero means. The non-centrality parameter d is the sum of the squares of the means [some people call this d^2].

• f(x) = 2^(-½n) exp(-½(x+d)) Sum((¼d)^k x^(½n+k-1)/(k!(½n+k-1)!); k=0..infinity) for x >= 0. Parameters n, d >= 0.
• Mean = n+d
• Variance = 2n+4d
• Skewness = sqrt(8(n+3d)^2 (n+2d)^(-3))
• Kurtosis = 12(n+4d)(n+2d)^(-2)

### Exponential Distribution

• f(x)=exp(-x) for x>=0.
• F(x)=1-exp(-x)
• Mode=0
• Mean = 1
• Variance = 1
• Skewness = 2
• Kurtosis = 6

### Gamma Distribution (Pearson Type III Distribution)

• f(x) = x^(p-1) exp(-x)/(p-1)! for x >= 0. Parameter p > 0.
• Mode=p-1
• Mean = p
• Variance = p
• Skewness = 2/sqrt(p)
• Kurtosis = 6/p

### Gaussian or Normal Distribution

• f(x) = (2 Pi)^(-½) exp(-½x^2)
• Characteristic function g(t) = exp(-½t^2)
• Mode=0, Mean=0, Variance=1, Skewness=0, Kurtosis=0
• Moments: m_i = i!/(2^(i/2) (i/2)!) for even i and m_i = 0 for odd i
• Cumulants: k_2 = 1 and k_i = 0 for i != 2
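The moment formula can be confirmed by numerical integration of x^i against the pdf (a standard-library Python sketch; the step size and integration limits are arbitrary choices that comfortably cover the gaussian tails):

```python
import math

def gauss_moment(i, h=1e-3, lim=10.0):
    """Numerically integrate x^i (2 Pi)^(-1/2) exp(-x^2/2) over [-lim, lim]."""
    n = int(2 * lim / h)
    total = sum((-lim + k * h)**i * math.exp(-0.5 * (-lim + k * h)**2)
                for k in range(n + 1))
    return total * h / math.sqrt(2 * math.pi)

for i in (2, 4, 6):
    exact = math.factorial(i) / (2**(i // 2) * math.factorial(i // 2))
    assert abs(gauss_moment(i) - exact) < 1e-6   # m_2=1, m_4=3, m_6=15
for i in (1, 3, 5):
    assert abs(gauss_moment(i)) < 1e-6           # odd moments vanish
```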

### Laplace Distribution

• f(x)=½exp(-|x|)
• F(x)=½(1+Sgn(x)(1-exp(-|x|)))
• Mode=0
• Mean = 0
• Variance = 2
• Skewness = 0
• Kurtosis = 3

### Lognormal Distribution

This is a distribution such that ln(x) has a gaussian distribution with mean a and standard deviation b.

• f(x) = (2 Pi)^(-½) exp(-½((ln(x)-a)/b)^2)/(bx)
• Mode: exp(a-b^2)
• Median: exp(a)
• Mean: exp(a+½b^2)
• Variance: exp(2a+b^2) (exp(b^2)-1)
• Moments about 0: m'_r = exp(ra+½r^2b^2)
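Since X^r = exp(r(a+bU)) with U standard gaussian, the raw moments reduce to 1-D gaussian integrals, which makes the formula easy to check numerically (standard-library Python; the parameter values, step size and limits are arbitrary choices):

```python
import math

a, b = 0.5, 0.3      # illustrative gaussian mean and standard deviation

def lognormal_raw_moment(r, h=1e-3, lim=10.0):
    """E[X^r] = E[exp(r(a+bU))] with U standard gaussian, by direct integration."""
    n = int(2 * lim / h)
    total = 0.0
    for k in range(n + 1):
        u = -lim + k * h
        total += math.exp(r * (a + b * u)) * math.exp(-0.5 * u * u)
    return total * h / math.sqrt(2 * math.pi)

for r in (1, 2, 3):
    exact = math.exp(r * a + 0.5 * r**2 * b**2)  # m'_r = exp(ra + r^2 b^2/2)
    assert abs(lognormal_raw_moment(r) - exact) < 1e-6
```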

### Nakagami Distribution

• f(x) = 2(k/w)^k x^(2k-1) exp(-kx^2/w)/Gamma(k)
• m'_2 = w
• m'_r = (w/k)^(r/2) Gamma(k+r/2)/Gamma(k)

### Rayleigh Distribution

This distribution arises in communications theory as the magnitude of a component of the Fourier transform of white noise.

• f(x) = x exp(-½x^2) for x >= 0.
• F(x) = 1 - exp(-½x^2)
• Mode = 1
• Mean = sqrt(½Pi)
• Variance = 2 - ½Pi
• Skewness = (Pi-3) sqrt(4Pi/(4-Pi)^3) = 0.631..
• Kurtosis = (80-24Pi)/(4-Pi)^2 - 6 = 0.245..
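The mean and variance follow from the first two moments of the pdf, which can be evaluated numerically (standard-library Python; the step size and upper limit are arbitrary choices well past the tail):

```python
import math

def rayleigh_moment(i, h=1e-3, lim=12.0):
    """Numerically integrate x^i times the pdf x exp(-x^2/2) over [0, lim]."""
    n = int(lim / h)
    return sum((k * h)**(i + 1) * math.exp(-0.5 * (k * h)**2)
               for k in range(n + 1)) * h

mean = rayleigh_moment(1)
var = rayleigh_moment(2) - mean**2
assert abs(mean - math.sqrt(0.5 * math.pi)) < 1e-6   # mean = sqrt(Pi/2)
assert abs(var - (2 - 0.5 * math.pi)) < 1e-6         # variance = 2 - Pi/2
```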

### Rectangular Distribution

• f(x)=1 for -½ < x < ½
• F(x)=½+x for -½ < x < ½
• Mean = 0
• Variance = 1/12 = 0.08333
• Skewness = 0
• Kurtosis = -1.2

### Rician Distribution

This distribution arises in communications theory as the magnitude of the Fourier transform of a cosine wave (of amplitude A) corrupted by additive white noise.

• f(x) = x exp(-½(x^2+A^2)) I_0(xA)

## Multivariate Gaussian

If x is an n-dimensional multivariate gaussian with mean m and covariance matrix S then

• its pdf is given by (2 Pi)^(-n/2) |S|^(-1/2) exp(-½(x-m)^T S^(-1) (x-m))
• its characteristic function is g(t) = exp(j t^T m - ½ t^T S t)
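A minimal 2-D sketch of the pdf (standard-library Python only; function names and parameter values are my own). When the covariance matrix is diagonal, the joint pdf must factorise into a product of 1-D gaussians, which gives a simple correctness check:

```python
import math

def mvn_pdf2(x, m, S):
    """Bivariate gaussian pdf via the explicit 2x2 inverse and determinant."""
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    inv = [[S[1][1] / det, -S[0][1] / det],
           [-S[1][0] / det, S[0][0] / det]]
    d = [x[0] - m[0], x[1] - m[1]]
    q = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(det))

def gauss1d(x, mu, var):
    return math.exp(-0.5 * (x - mu)**2 / var) / math.sqrt(2 * math.pi * var)

# with a diagonal covariance the joint pdf factorises into 1-D gaussians
m, S = [1.0, -2.0], [[4.0, 0.0], [0.0, 9.0]]
x = [0.5, 1.5]
assert abs(mvn_pdf2(x, m, S)
           - gauss1d(x[0], 1.0, 4.0) * gauss1d(x[1], -2.0, 9.0)) < 1e-12
```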