Probability Distributions
Notation
We use a notation that applies equally to discrete and continuous distributions.
- A distribution function, or cumulative distribution function (cdf), is denoted by a capital letter, e.g. F(x). It must satisfy:
  - F(x) must be continuous for all but a countable number of values of x
  - F(-inf) = 0, F(+inf) = 1
  - F(x) must increase monotonically with x
- If F(x) is differentiable, its derivative is denoted f(x) and is called a frequency function or probability density function (pdf). We have dF = (dF(x)/dx) dx = f(x) dx.
- A local maximum of f(x) is a mode.
- j = sqrt(-1)
Properties of Distributions
Characteristic Function
The characteristic function of a distribution is the conjugate of the Fourier transform of its pdf:
g(t) = Integral(exp(jtx) dF, x=-inf...+inf).
For discrete distributions it is
g(t) = Sum(p(x_{k}) exp(jtx_{k})) over all values x_{k}.
The usefulness of characteristic functions arises because the characteristic function
of the sum of two independent random variables equals the product of the two
characteristic functions concerned.
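As an illustrative check of this property, the characteristic function of the sum of two independent discrete variables can be compared against the product of their individual characteristic functions. A minimal Python sketch; the pmfs pX and pY are arbitrary made-up examples:

```python
import cmath

def char_fun(pmf, t):
    """Characteristic function g(t) = Sum(p(x_k) exp(j t x_k)) of a discrete pmf."""
    return sum(p * cmath.exp(1j * t * x) for x, p in pmf.items())

def pmf_of_sum(pmf1, pmf2):
    """pmf of X+Y for independent X, Y (discrete convolution)."""
    out = {}
    for x1, p1 in pmf1.items():
        for x2, p2 in pmf2.items():
            out[x1 + x2] = out.get(x1 + x2, 0.0) + p1 * p2
    return out

# arbitrary example distributions (hypothetical values)
pX = {0: 0.2, 1: 0.5, 2: 0.3}
pY = {0: 0.6, 3: 0.4}

t = 0.7
g_sum = char_fun(pmf_of_sum(pX, pY), t)   # CF of X+Y
g_prod = char_fun(pX, t) * char_fun(pY, t)  # product of the two CFs
assert abs(g_sum - g_prod) < 1e-12
```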
Moments
The moments of a distribution (about the origin) are given by m0_{i} =
Integral(x^{i} dF, x=-inf...+inf) = (-j)^{i} (d^{i}g/dt^{i})_{t=0} =
(d^{i}g/d(jt)^{i})_{t=0}.

- m0_{0} always equals 1.
- m0_{1} equals the mean of the distribution and, if the integral converges, is denoted by m.
- m0_{i} equals the coefficient of (jt)^{i}/i! in the power series expansion of the characteristic function g(t).
- m0_{i} >= 0 for all even i.
The moments about the mean are given by m_{i} = Integral((x-m)^{i} dF, x=-inf...+inf).
- m_{0} always equals 1.
- m_{1} always equals 0 provided the integral converges.
- m_{2} is the variance, v. The standard deviation, s, equals sqrt(v).
- The skewness is defined as m_{3}/s^{3}.
- The kurtosis is defined as (m_{4}/s^{4} - 3). The sign determines whether a distribution is platykurtic (<0), mesokurtic (=0) or leptokurtic (>0). Relative to a gaussian, platykurtic distributions are generally less peaky and leptokurtic distributions more peaky.
- m_{i} >= 0 for all even i.
- If m_{k} exists, then so do all m_{i} for i<k.
The moments about an arbitrary point, x, may be obtained by formally expanding mx_{i} = (m_{*} + (m-x))^{i} and then replacing m_{*}^{i} by m_{i}.
- In particular if x=0:
  - m0_{2} = m_{2} + m^{2}
  - m0_{3} = m_{3} + 3m m_{2} + m^{3}
  - m0_{4} = m_{4} + 4m m_{3} + 6m^{2} m_{2} + m^{4}
- Inverse relationships are:
  - m_{2} = m0_{2} - m^{2}
  - m_{3} = m0_{3} - 3m m0_{2} + 2m^{3}
  - m_{4} = m0_{4} - 4m m0_{3} + 6m^{2} m0_{2} - 3m^{4}
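These conversions between moments about the origin and moments about the mean hold for any distribution, so they can be verified numerically. A minimal sketch using an arbitrary discrete pmf:

```python
# arbitrary discrete pmf used only to exercise the identities
pmf = {0: 0.1, 1: 0.4, 3: 0.3, 7: 0.2}

def raw(i):      # m0_i, moment about the origin
    return sum(p * x**i for x, p in pmf.items())

m = raw(1)       # the mean

def central(i):  # m_i, moment about the mean
    return sum(p * (x - m)**i for x, p in pmf.items())

# inverse relationships: central moments computed from raw moments
m2 = raw(2) - m**2
m3 = raw(3) - 3*m*raw(2) + 2*m**3
m4 = raw(4) - 4*m*raw(3) + 6*m**2*raw(2) - 3*m**4
assert abs(m2 - central(2)) < 1e-9
assert abs(m3 - central(3)) < 1e-9
assert abs(m4 - central(4)) < 1e-9
```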
Cumulants
The r'th cumulant of a distribution, k_{r}, is the coefficient of (jt)^{r}/r! in the power series expansion of log_{e} of the characteristic function, i.e. of ln(Integral(exp(jtx) dF, x=-inf...+inf)). It may be obtained from the characteristic function as k_{r} = (-j)^{r} (d^{r}ln(g)/dt^{r})_{t=0} = (d^{r}ln(g)/d(jt)^{r})_{t=0}.
The cumulants are related to the moments as follows:
- m = k_{1}
- m_{2} = k_{2} = v = s^{2}
- m_{3} = k_{3}
- m_{4} = k_{4} + 3k_{2}^{2}
- m_{5} = k_{5} + 10k_{3}k_{2}
- m_{6} = k_{6} + 15k_{4}k_{2} + 10k_{3}^{2} + 15k_{2}^{3}
The formula for m_{r} contains all terms of the form k_{a}^{A}
* k_{b}^{B} * k_{c}^{C} * ... where aA+bB+cC + ... = r and
A,B,C,... are all >= 1 and 2 <= a,b,c,... <= r. The coefficient for a general
term is r!/(A! * a!^{A} * B! * b!^{B} * C! * c!^{C} * ...).
The inverse relationships are
- k_{1} = m
- k_{2} = m_{2} = v = s^{2}
- k_{3} = m_{3}
- k_{4} = m_{4} - 3m_{2}^{2}
- k_{5} = m_{5} - 10m_{3}m_{2}
- k_{6} = m_{6} - 15m_{4}m_{2} - 10m_{3}^{2} + 30m_{2}^{3}
The formula for k_{r} contains all terms of the form m_{a}^{A}
* m_{b}^{B} * m_{c}^{C} * ... where aA+bB+cC + ... = r and
A,B,C,... are all >= 1 and 1 < a,b,c,... <= r. The coefficient for a general term
is (-1)^{k-1}(k-1)! r!/(A! * a!^{A} * B! * b!^{B} * C! * c!^{C}
* ...) where k=A+B+C+.... The two sets of coefficients are the same when k=1 and the same
but for their sign when k=2.
We can also define the normalised cumulants g_{r} = k_{r}/s^{r}:
- Skewness: g_{3} = k_{3}/s^{3} = m_{3}/s^{3}
- Kurtosis: g_{4} = k_{4}/s^{4} = m_{4}/s^{4} - 3
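These moment-cumulant relations can be sanity-checked against the Poisson distribution (covered below), all of whose cumulants equal its parameter a. A minimal sketch, truncating the pmf at N terms:

```python
import math

a, N = 1.7, 60   # Poisson parameter (arbitrary choice) and truncation point

pmf = {r: math.exp(-a) * a**r / math.factorial(r) for r in range(N)}
m = sum(p * r for r, p in pmf.items())

def mu(i):  # central moment m_i
    return sum(p * (r - m)**i for r, p in pmf.items())

# cumulants from central moments, using the relations above
k1, k2, k3 = m, mu(2), mu(3)
k4 = mu(4) - 3 * mu(2)**2
for k in (k1, k2, k3, k4):
    assert abs(k - a) < 1e-9   # every Poisson cumulant equals a
```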
Bounds
Chebyshev Inequality: Pr(|X-m| >= d) = 1 - F(m+d) + F(m-d) <= (s/d)^{2} gives a rather
weak bound on the sum of the two tail probabilities.
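A concrete illustration of how loose the bound can be, using the exponential distribution (mean 1, standard deviation 1, F(x) = 1 - exp(-x)), whose exact tail probability is available in closed form:

```python
import math

# X ~ exponential: mean m = 1, standard deviation s = 1
m = s = 1.0
for d in (1.5, 2.0, 3.0):
    # exact Pr(|X-m| >= d); the lower tail Pr(X <= 1-d) is zero for d >= 1
    tail = math.exp(-(m + d))
    assert tail <= (s / d)**2   # Chebyshev bound holds (and is far from tight)
```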
Transforming Distributions
Linear Transformation: Suppose Y=aX+b where X has a pdf f(x)=dF(x)/dx with mean m and
standard deviation s and a characteristic function g(t), then:
- Y has mean am+b and standard deviation |a|s
- The pdf of Y is f((y-b)/a)/|a|
- The cdf of Y is F((y-b)/a) (for a>0)
- The characteristic function of Y is e^{jbt}g(at)
- The cumulants of the two distributions are related by
  - k_{1}^{(Y)} = a k_{1}^{(X)} + b
  - k_{r}^{(Y)} = a^{r} k_{r}^{(X)} for r>1
- The normalised cumulants satisfy
  - g_{r}^{(Y)} = g_{r}^{(X)} for r>1 (for a>0)
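A numerical sketch of the mean and characteristic-function relations for Y = aX + b, using an arbitrary discrete distribution (the values of pX, a and b are made up):

```python
import cmath

# arbitrary discrete distribution and linear transform Y = aX + b
pX = {0: 0.25, 1: 0.5, 2: 0.25}
a, b = 3.0, -1.0
pY = {a * x + b: p for x, p in pX.items()}

def g(pmf, t):
    """Characteristic function of a discrete pmf."""
    return sum(p * cmath.exp(1j * t * x) for x, p in pmf.items())

t = 0.4
# characteristic function of Y equals e^{jbt} g(at)
assert abs(g(pY, t) - cmath.exp(1j * b * t) * g(pX, a * t)) < 1e-12

# the mean transforms as am + b
mX = sum(p * x for x, p in pX.items())
mY = sum(p * y for y, p in pY.items())
assert abs(mY - (a * mX + b)) < 1e-12
```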
Probability Identities
The identities below are expressed in terms of discrete distributions. They
also work for continuous distributions with sums replaced by integrals. S_{x}()
denotes the sum over all values of x.
- p(x,y) = p(y|x)p(x) = p(x|y)p(y)
- p(y|x) = p(x|y)p(y)p(x)^{-1}. This is Bayes' rule.
- p(x) = S_{y}(p(x,y))
- p(x|y) = S_{z}(p(x,z|y)) = S_{z}(p(x|y,z)p(z|y))
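These identities are easy to exercise on a small joint pmf. A minimal sketch with made-up probabilities:

```python
# a small made-up joint pmf p(x, y) over x, y in {0, 1}
p = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

def p_x(x):  # marginal: p(x) = S_y(p(x, y))
    return sum(v for (xi, yi), v in p.items() if xi == x)

def p_y(y):  # marginal: p(y) = S_x(p(x, y))
    return sum(v for (xi, yi), v in p.items() if yi == y)

# Bayes' rule: p(y|x) = p(x|y) p(y) / p(x)
for x in (0, 1):
    for y in (0, 1):
        p_y_given_x = p[(x, y)] / p_x(x)
        p_x_given_y = p[(x, y)] / p_y(y)
        assert abs(p_y_given_x - p_x_given_y * p_y(y) / p_x(x)) < 1e-12
```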
Discrete Distributions
In these distributions, the variable r takes integer values.
Binomial Distribution
- p(r) = a^{r} (1-a)^{n-r} n! / (r! (n-r)!) for 0<=r<=n where a is a constant 0<=a<=1.
- Characteristic function g(t) = (1-a+ae^{jt})^{n}
- m=na, v=na(1-a), skewness = (1-2a)/sqrt(na(1-a)), kurtosis = (1-6a(1-a))/(na(1-a))
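The stated mean, variance and skewness can be recomputed directly from the pmf. A sketch with arbitrary choices of n and a:

```python
import math

n, a = 10, 0.3   # arbitrary parameter choices
pmf = {r: math.comb(n, r) * a**r * (1 - a)**(n - r) for r in range(n + 1)}

m = sum(p * r for r, p in pmf.items())
v = sum(p * (r - m)**2 for r, p in pmf.items())
skew = sum(p * (r - m)**3 for r, p in pmf.items()) / v**1.5

assert abs(m - n * a) < 1e-12
assert abs(v - n * a * (1 - a)) < 1e-12
assert abs(skew - (1 - 2*a) / math.sqrt(n * a * (1 - a))) < 1e-9
```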
Poisson Distribution
- p(r) = e^{-a} a^{r} / r! for r>=0
- Characteristic function g(t) = exp(a(e^{jt}-1))
- k_{r} = a for all r>=1
- m=a, v=a, skewness = a^{-½}, kurtosis = a^{-1}
Continuous Distributions
Beta Distribution
Cauchy Distribution
- f(x) = Pi^{-1}/(1+x^{2})
- F(x) = ½ + Pi^{-1}tan^{-1}(x)
- Characteristic function g(t) = exp(-|t|)
- Mode=0
- Mean = Undefined
- Variance = Undefined
- Skewness = Undefined
- Kurtosis = Undefined
Chi-Squared Distribution
This is the distribution of the sum of the squares of n independent standard gaussian random
variables. If Y=½X, then Y has a gamma distribution with parameter p=½n.
- f(x) = 2^{-½n}x^{½n-1}e^{-½x}/(½n-1)! for x>=0. Parameter n>=0.
- [n even] F(x) = 1-Ce^{-½x} where C = sum(2^{-k}x^{k}/k!, k=0..(½n-1))
- Characteristic function g(t) = (1-2jt)^{-½n}
- Cumulants: k_{r} = n 2^{r-1}(r-1)!
- Mode = n-2
- Mean = n
- Variance = 2n
- Skewness = sqrt(8/n)
- Kurtosis = 12/n
- [n=2] X has an exponential distribution, f(x)=½e^{-½x}, and Y=sqrt(X) has a Rayleigh distribution.
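For even n the closed-form cdf can be checked against numerical integration of the pdf. A sketch using n=4 and simple trapezoidal integration (the choice of n, x_max and step count is arbitrary):

```python
import math

n = 4  # even degrees of freedom (arbitrary choice)

def pdf(x):
    # f(x) = 2^{-n/2} x^{n/2-1} e^{-x/2} / (n/2-1)!
    return 2**(-n/2) * x**(n/2 - 1) * math.exp(-x/2) / math.factorial(n//2 - 1)

def cdf(x):
    # F(x) = 1 - e^{-x/2} * sum_{k=0}^{n/2-1} (x/2)^k / k!   (n even)
    return 1 - math.exp(-x/2) * sum((x/2)**k / math.factorial(k) for k in range(n//2))

# trapezoidal integration of the pdf from 0 to x_max
x_max, steps = 6.0, 100000
h = x_max / steps
area = sum(0.5 * (pdf(i*h) + pdf((i+1)*h)) * h for i in range(steps))
assert abs(area - cdf(x_max)) < 1e-6
```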
Noncentral Chi-Squared Distribution
This is the distribution of the sum of the squares of n independent gaussian random
variables with unit variances and nonzero means. The noncentrality parameter d
is the sum of the squares of the means [some people call this d^{2}].
- f(x) = 2^{-½n}e^{-½(x+d)}Sum((¼d)^{k}x^{½n+k-1}/(k!(½n+k-1)!), k=0..infinity) for x>=0. Parameters n,d>=0.
- Mean = n+d
- Variance = 2n+4d
- Skewness = sqrt(8(n+3d)^{2}(n+2d)^{-3})
- Kurtosis = 12(n+4d)(n+2d)^{-2}
Exponential Distribution
- f(x) = exp(-x) for x>=0.
- F(x) = 1-exp(-x)
- Mode=0
- Mean = 1
- Variance = 1
- Skewness = 2
- Kurtosis = 6
Fisher's F Distribution
Fisher's z Distribution
Gamma Distribution (Pearson Type III Distribution)
- f(x) = x^{p-1}exp(-x)/(p-1)! for x>=0. Parameter p>=0.
- Mode = p-1
- Mean = p
- Variance = p
- Skewness = 2/sqrt(p)
- Kurtosis = 6/p
Gaussian or Normal Distribution
- f(x) = (2 Pi)^{-½} exp(-½x^{2})
- Characteristic function g(t) = exp(-½t^{2})
- Mode=0, Mean=0, Variance=1, Skewness=0, Kurtosis=0
- Moments: m_{i} = i!/(2^{i/2}(i/2)!) for even i and m_{i} = 0 for odd i
- Cumulants: k_{2} = 1 and k_{i} = 0 for i>2
Laplace Distribution
- f(x) = ½exp(-|x|)
- F(x) = ½(1+Sgn(x)(1-exp(-|x|)))
- Mode=0
- Mean = 0
- Variance = 2
- Skewness = 0
- Kurtosis = 3
Lognormal Distribution
This is a distribution such that ln(x) has a gaussian distribution with mean a
and standard deviation b.
- f(x) = (2 Pi)^{-½} exp(-½((ln(x)-a)/b)^{2})/(bx)
- Mode: exp(a-b^{2})
- Median: exp(a)
- Mean: exp(a+½b^{2})
- Variance: exp(2a+b^{2}) (exp(b^{2})-1)
- Moments about 0: m0_{r} = exp(ra+½r^{2}b^{2})
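The moment formula can be checked by integrating over the underlying gaussian variable y = ln(x), since m0_{r} = E[x^r] = E[exp(ry)]. A sketch with arbitrary a and b and trapezoidal integration:

```python
import math

a, b = 0.3, 0.5   # arbitrary parameters: ln(x) ~ gaussian(mean a, std dev b)

def raw_moment(r, steps=100000):
    # m0_r = E[exp(r*y)] integrated against the gaussian density of y = ln(x)
    lo, hi = a - 12*b, a + 12*b   # covers essentially all the probability mass
    h = (hi - lo) / steps
    def f(y):
        return math.exp(r*y - 0.5*((y - a)/b)**2) / (b * math.sqrt(2*math.pi))
    return sum(0.5 * (f(lo + i*h) + f(lo + (i+1)*h)) * h for i in range(steps))

for r in (1, 2, 3):
    assert abs(raw_moment(r) - math.exp(r*a + 0.5*r*r*b*b)) < 1e-5
```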
Nakagami Distribution
- f(x) = 2(k/w)^{k} x^{2k-1}exp(-kx^{2}/w)/Gamma(k)
- m0_{2} = w
- m0_{r} = (w/k)^{r/2} Gamma(k+r/2)/Gamma(k)
Rayleigh Distribution
This distribution arises in communications theory as the magnitude of a component of
the Fourier transform of white noise.
- f(x) = x exp(-½x^{2}) for x>=0.
- F(x) = 1-exp(-½x^{2})
- Mode=1
- Mean = Sqrt(½Pi)
- Variance = 2 - ½Pi
- Skewness = (Pi-3) Sqrt(4Pi/(4-Pi)^{3}) = 0.631..
- Kurtosis = (80-24Pi)/(4-Pi)^{2} - 6 = 0.245..
Rectangular Distribution
- f(x) = 1 for -½ < x < ½
- F(x) = ½+x for -½ < x < ½
- Mean = 0
- Variance = 1/12 = 0.08333
- Skewness = 0
- Kurtosis = -1.2
Rician Distribution
This distribution arises in communications theory as the magnitude of the Fourier
transform of a cosine wave (of amplitude A) corrupted by additive white noise.
- f(x) = x exp(-½(x^{2}+A^{2})) I_{0}(xA)
Student's t Distribution
Multivariate Gaussian
If x is an n-dimensional multivariate gaussian with mean m and covariance
matrix S then
- its pdf is given by (2 Pi)^{-n/2} |S|^{-1/2} exp(-½ (x-m)^{T} S^{-1} (x-m))
- its characteristic function is g(t) = exp(jt^{T}m - ½t^{T}St)
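A minimal sketch of evaluating this pdf with numpy; the helper mvn_pdf and the example values of m and S are illustrative, not from the source. At x = m the quadratic form vanishes, so the pdf reduces to the normalising constant:

```python
import numpy as np

def mvn_pdf(x, m, S):
    """pdf of an n-dimensional gaussian with mean m and covariance matrix S."""
    n = len(m)
    d = x - m
    norm = (2 * np.pi)**(-n / 2) / np.sqrt(np.linalg.det(S))
    return norm * np.exp(-0.5 * d @ np.linalg.solve(S, d))

# arbitrary example: a 2-dimensional gaussian
m = np.array([1.0, -2.0])
S = np.array([[2.0, 0.5], [0.5, 1.0]])

# at x = m the exponent is zero, leaving only (2 pi)^{-n/2} |S|^{-1/2}
val = mvn_pdf(m, m, S)
expected = (2 * np.pi)**(-1) / np.sqrt(np.linalg.det(S))
assert abs(val - expected) < 1e-12
```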