# v_gaussmixk

## PURPOSE

V_GAUSSMIXK approximate Kullback-Leibler divergence between two GMMs + derivatives

## SYNOPSIS

function [d,klfg]=v_gaussmixk(mf,vf,wf,mg,vg,wg)

## DESCRIPTION

```V_GAUSSMIXK approximate Kullback-Leibler divergence between two GMMs + derivatives

Inputs: with kf & kg mixtures, p data dimensions

mf(kf,p)                mixture means for GMM f
vf(kf,p) or vf(p,p,kf)  variances (diagonal or full) for GMM f
wf(kf,1)                weights for GMM f - must sum to 1
mg(kg,p)                mixture means for GMM g [g=f if mg,vg,wg omitted]
vg(kg,p) or vg(p,p,kg)  variances (diagonal or full) for GMM g
wg(kg,1)                weights for GMM g - must sum to 1

Outputs:
d             the approximate KL divergence D(f||g)
klfg(kf,kg)   the exact KL divergence between the components of f and g

The Kullback-Leibler (KL) divergence, D(f||g), between two distributions,
f(x) and g(x) is also known as the "relative v_entropy" of f relative to g.
It is defined as <log(f(x)) - log(g(x))> where <y(x)> denotes the
expectation with respect to f(x), i.e. <y(x)> = Integral(f(x) y(x) dx).
The KL divergence is always >=0 and equals 0 if and only if f(x)=g(x)
almost everywhere. % It is not a distance because it is not symmetric
between f and g and also does not satisfy the triangle inequality.
It is normally appropriate for f(x) to be the "true" distribution and
g(x) to be an approximation to it. See [1].

This routine calculates the "variational approximation" to the KL divergence
from [2] that is exact when f and g are single component gaussians and that is zero
if f=g. However, the approximation may be negative if f and g are different.

Refs:
[1]    S. Kullback and R. Leibler. On information and sufficiency.
Annals of Mathematical Statistics, 22 (1): 79�86, 1951. doi: 10.1214/aoms/1177729694.
[2]    J. R. Hershey and P. A. Olsen.
Approximating the kullback leibler divergence between gaussian mixture models.
In Proc. IEEE Intl Conf. Acoustics, Speech and Signal Processing, volume 4,
pages IV�317�IV�320, Apr. 2007. doi: 10.1109/ICASSP.2007.366913.```

## CROSS-REFERENCE INFORMATION

This function calls:
• v_logsum V_LOGSUM v_logsum(x,d,k)=log(sum(k.*exp(x),d))
This function is called by:

## SOURCE CODE

```0001 function [d,klfg]=v_gaussmixk(mf,vf,wf,mg,vg,wg)
0002 %V_GAUSSMIXK approximate Kullback-Leibler divergence between two GMMs + derivatives
0003 %
0004 % Inputs: with kf & kg mixtures, p data dimensions
0005 %
0006 %   mf(kf,p)                mixture means for GMM f
0007 %   vf(kf,p) or vf(p,p,kf)  variances (diagonal or full) for GMM f
0008 %   wf(kf,1)                weights for GMM f - must sum to 1
0009 %   mg(kg,p)                mixture means for GMM g [g=f if mg,vg,wg omitted]
0010 %   vg(kg,p) or vg(p,p,kg)  variances (diagonal or full) for GMM g
0011 %   wg(kg,1)                weights for GMM g - must sum to 1
0012 %
0013 % Outputs:
0014 %   d             the approximate KL divergence D(f||g)
0015 %   klfg(kf,kg)   the exact KL divergence between the components of f and g
0016 %
0017 % The Kullback-Leibler (KL) divergence, D(f||g), between two distributions,
0018 % f(x) and g(x) is also known as the "relative v_entropy" of f relative to g.
0019 % It is defined as <log(f(x)) - log(g(x))> where <y(x)> denotes the
0020 % expectation with respect to f(x), i.e. <y(x)> = Integral(f(x) y(x) dx).
0021 % The KL divergence is always >=0 and equals 0 if and only if f(x)=g(x)
0022 % almost everywhere. % It is not a distance because it is not symmetric
0023 % between f and g and also does not satisfy the triangle inequality.
0024 % It is normally appropriate for f(x) to be the "true" distribution and
0025 % g(x) to be an approximation to it. See [1].
0026 %
0027 % This routine calculates the "variational approximation" to the KL divergence
0028 % from [2] that is exact when f and g are single component gaussians and that is zero
0029 % if f=g. However, the approximation may be negative if f and g are different.
0030 %
0031 % Refs:
0032 % [1]    S. Kullback and R. Leibler. On information and sufficiency.
0033 %       Annals of Mathematical Statistics, 22 (1): 79�86, 1951. doi: 10.1214/aoms/1177729694.
0034 % [2]    J. R. Hershey and P. A. Olsen.
0035 %       Approximating the kullback leibler divergence between gaussian mixture models.
0036 %       In Proc. IEEE Intl Conf. Acoustics, Speech and Signal Processing, volume 4,
0037 %       pages IV�317�IV�320, Apr. 2007. doi: 10.1109/ICASSP.2007.366913.
0038
0039 %       Copyright (C) Mike Brookes 2012
0040 %      Version: \$Id: v_gaussmixk.m 10865 2018-09-21 17:22:45Z dmb \$
0041 %
0042 %   VOICEBOX is a MATLAB toolbox for speech processing.
0044 %
0045 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0046 %   This program is free software; you can redistribute it and/or modify
0048 %   the Free Software Foundation; either version 2 of the License, or
0049 %   (at your option) any later version.
0050 %
0051 %   This program is distributed in the hope that it will be useful,
0052 %   but WITHOUT ANY WARRANTY; without even the implied warranty of
0053 %   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
0054 %   GNU General Public License for more details.
0055 %
0056 %   You can obtain a copy of the GNU General Public License from
0057 %   http://www.gnu.org/copyleft/gpl.html or by writing to
0058 %   Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
0059 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0060 [kf,p]=size(mf);
0061 if isempty(wf)
0062     wf=repmat(1/kf,kf,1);
0063 end
0064 if isempty(vf)
0065     vf=ones(kf,p);
0066 end
0067 fvf=ndims(vf)>2 || size(vf,1)>kf;       % full covariance matrix vf is supplied
0068
0069 % First calculate vf covariance matrix determinants and precision matrices
0070 % and then the individual KL divergences between the components of f
0071
0072 klff=zeros(kf,kf);      % space for intra-a KL negative divergence
0073 ixdf=1:kf+1:kf*kf;      % indexes of diagonal elements in kf*kf matrix
0074 ixdp=(1:p+1:p*p)';         % indexes of diagonal elements in p*p matrix
0075 wkf=ones(kf,1);
0076 if fvf                  % vf is a full matrix
0077     dvf=zeros(kf,1);    % space for log(det(vf))
0078     for i=1:kf
0079         dvf(i)=log(det(vf(:,:,i)));
0080     end
0081     for j=1:kf          % calculate KL divergence between all mixtures and mixture j
0082         pfj=inv(vf(:,:,j));
0083         mffj=mf-mf(j(wkf),:);
0084         pfjvf=reshape(pfj*reshape(vf,p,p*kf),p^2,kf);       % pf(:,:,j)* all the vf matices
0085         klff(:,j)=0.5*(dvf(j)-p-dvf+sum(pfjvf(ixdp,:),1)'+sum((mffj*pfj).*mffj,2));
0086     end
0087 else                % vf is diagonal
0088     dvf=log(prod(vf,2));
0089     pf=1./vf;
0090     mf2p=mf.^2*pf';
0091     mf2pd=mf2p(ixdf);       % get diagonal elements
0092     klff=0.5*(dvf(:,wkf)'-dvf(:,wkf)+vf*pf'-p+mf2p+mf2pd(wkf,:)-2*mf*(mf.*pf)');
0093 end
0094 klff(ixdf)=0; % force the diagonal elements to exact zero
0095 if nargin<4
0096     d=0;
0097     klfg=klff;
0098 else
0099     [kg,pg]=size(mg);
0100     if pg~=p
0101         error('GMMs must have the same data dimension (%d versus %d)',p,pg);
0102     end
0103     if nargin<6 || isempty(wg)
0104         wg=repmat(1/kg,kg,1);
0105     end
0106     if nargin<5 || isempty(vg)
0107         vg=ones(kg,p);
0108     end
0109     fvb=ndims(vg)>2 || size(vg,1)>kg;       % full covariance matrix vg is supplied
0110
0111     % Calculate vg covariance matrix determinants and precision matrices
0112     % and then the individual inter-component KL divergences between components of f and g
0113
0114     klfg=zeros(kf,kg);      % space for inter-a-b KL negative divergence
0115     wkg=ones(kg,1);
0116     if fvb                  % vg is a full matrix
0117         dvg=zeros(kg,1);    % space for log(det(vg))
0118         pg=zeros(p,p,kg);   % space for inv(vg)
0119         for j=1:kg
0120             dvgj=log(det(vg(:,:,j)));
0121             dvg(j)=dvgj;
0122             pgj=inv(vg(:,:,j));
0123             pg(:,:,j)=pgj;
0124             mfgj=mf-mg(j(wkf),:);
0125             if fvf              % vf and vg are both full matrices
0126                 pgjvf=reshape(pgj*reshape(vf,p,p*kf),p^2,kf); % pg(:,:,j)* all the vf matices
0127                 klfg(:,j)=0.5*(dvgj-p-dvf+sum(pgjvf(ixdp,:),1)'+sum((mfgj*pgj).*mfgj,2));
0128             else                % vf diagonal but vg is full
0129                 klfg(:,j)=0.5*(dvgj-p-dvf+vf*pgj(ixdp)+sum((mfgj*pgj).*mfgj,2));
0130             end
0131         end
0132     else                        % vg is diagonal
0133         dvg=log(prod(vg,2));    % log(det(vg))
0134         pg=1./vg;               % precision matrix pg = inv(vg)
0135         mg2p=sum(mg.^2.*pg,2)';
0136         if fvf                  % vf a full matrix, vg diagonal
0137             vav=reshape(vf,p^2,kf);
0138             klfg=0.5*(dvg(:,wkf)'-dvf(:,wkg)+vav(ixdp,:)'*pg'-p+mf.^2*pg'+mg2p(wkf,:)-2*mf*(mg.*pg)');
0139         else                    % vf and vg both diagonal
0140             klfg=0.5*(dvg(:,wkf)'-dvf(:,wkg)+vf*pg'-p+mf.^2*pg'+mg2p(wkf,:)-2*mf*(mg.*pg)');
0141         end
0142     end
0143     d=wf'*(v_logsum(-klff,2,wf)-v_logsum(-klfg,2,wg));
0144 end
0145
0146```