function [d,klfg]=v_gaussmixk(mf,vf,wf,mg,vg,wg)
%V_GAUSSMIXK approximate Kullback-Leibler divergence between two GMMs
%
% Inputs: with kf & kg mixtures, p data dimensions
%
%   mf(kf,p)                mixture means for GMM f
%   vf(kf,p) or vf(p,p,kf)  variances (diagonal or full) for GMM f [default: identity]
%   wf(kf,1)                weights for GMM f - must sum to 1 [default: uniform]
%   mg(kg,p)                mixture means for GMM g [g=f if mg,vg,wg omitted]
%   vg(kg,p) or vg(p,p,kg)  variances (diagonal or full) for GMM g [default: identity]
%   wg(kg,1)                weights for GMM g - must sum to 1 [default: uniform]
%
% Outputs:
%   d             the approximate KL divergence D(f||g)=E_f(log(f(x)/g(x)))
%   klfg(kf,kg)   the exact KL divergence between the unweighted components of f and g
%
% The Kullback-Leibler (KL) divergence, D(f||g), between two distributions,
% f(x) and g(x), is also known as the "relative entropy" of f relative to g.
% It is defined as <log(f(x)/g(x))> where <...> denotes the
% expectation with respect to f(x), i.e. <y(x)> = Integral(f(x) y(x) dx).
% The KL divergence is always >=0 and equals 0 if and only if f(x)=g(x)
% almost everywhere. It is not a distance because it is not symmetric
% between f and g and also does not satisfy the triangle inequality.
% It is normally appropriate for f(x) to be the "true" distribution and
% g(x) to be an approximation to it. See [1].
%
% This routine calculates the "variational approximation" to the KL divergence
% from [2]; the approximation is exact when f and g are single-component Gaussians
% and is zero if f=g. However, the approximation may incorrectly be negative
% if f and g are different.
%
% Refs:
% [1] S. Kullback and R. Leibler. On information and sufficiency.
%     Annals of Mathematical Statistics, 22 (1): 79-86, 1951. doi: 10.1214/aoms/1177729694.
% [2] J. R. Hershey and P. A. Olsen.
%     Approximating the Kullback Leibler divergence between Gaussian mixture models.
%     In Proc. IEEE Intl Conf. Acoustics, Speech and Signal Processing, volume 4,
%     pages IV:317-320, Apr. 2007. doi: 10.1109/ICASSP.2007.366913.

% Copyright (C) Mike Brookes 2012
% Version: $Id: v_gaussmixk.m 10865 2018-09-21 17:22:45Z dmb $
%
% VOICEBOX is a MATLAB toolbox for speech processing.
% Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This program is free software; you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation; either version 2 of the License, or
% (at your option) any later version.
%
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
% GNU General Public License for more details.
%
% You can obtain a copy of the GNU General Public License from
% http://www.gnu.org/copyleft/gpl.html or by writing to
% Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[kf,p]=size(mf);
if nargin<3 || isempty(wf)
    wf=repmat(1/kf,kf,1);               % default f weights: uniform
end
if nargin<2 || isempty(vf)
    vf=ones(kf,p);                      % default f variances: identity
end
fvf=ndims(vf)>2 || size(vf,1)>kf;       % true if full covariance matrices vf are supplied

% First calculate vf covariance matrix determinants and precision matrices
% and then the individual KL divergences between the components of f

klff=zeros(kf,kf);                      % space for intra-f KL divergences
ixdf=1:kf+1:kf*kf;                      % indices of diagonal elements in kf*kf matrix
ixdp=(1:p+1:p*p)';                      % indices of diagonal elements in p*p matrix
wkf=ones(kf,1);
if fvf                                  % vf is a full matrix
    dvf=zeros(kf,1);                    % space for log(det(vf))
    for i=1:kf
        dvf(i)=log(det(vf(:,:,i)));
    end
    for j=1:kf                          % calculate KL divergence between all f mixtures and f mixture j
        pfj=inv(vf(:,:,j));             % precision matrix for f mixture j
        mffj=mf-mf(j(wkf),:);           % difference in means
        pfjvf=reshape(pfj*reshape(vf,p,p*kf),p^2,kf); % pf(:,:,j) times all the vf matrices
        klff(:,j)=0.5*(dvf(j)-p-dvf+sum(pfjvf(ixdp,:),1)'+sum((mffj*pfj).*mffj,2));
    end
else                                    % vf is diagonal
    dvf=log(prod(vf,2));                % log determinant for all f mixtures
    pf=1./vf;                           % precision matrices for all f mixtures
    mf2p=mf.^2*pf';
    mf2pd=mf2p(ixdf);                   % get diagonal elements
    klff=0.5*(dvf(:,wkf)'-dvf(:,wkf)+vf*pf'-p+mf2p+mf2pd(wkf,:)-2*mf*(mf.*pf)');
end
klff(ixdf)=0;                           % force the diagonal elements to exact zero
if nargin<4                             % g mixtures omitted so make them the same as f
    d=0;                                % KL divergence from f to itself is zero
    klfg=klff;
else
    [kg,pg]=size(mg);
    if pg~=p
        error('GMMs must have the same data dimension (%d versus %d)',p,pg);
    end
    if nargin<6 || isempty(wg)
        wg=repmat(1/kg,kg,1);           % default g weights: uniform
    end
    if nargin<5 || isempty(vg)
        vg=ones(kg,p);                  % default g variances: identity
    end

    % Calculate vg covariance matrix determinants and precision matrices
    % and then the individual inter-component KL divergences between components of f and g

    klfg=zeros(kf,kg);                  % space for inter-f-g KL divergences
    wkg=ones(kg,1);
    if ndims(vg)>2 || size(vg,1)>kg     % vg is a full matrix
        dvg=zeros(kg,1);                % space for log(det(vg))
        pg=zeros(p,p,kg);               % space for g-mixture precision matrices
        if fvf                          % vf and vg are both full matrices
            for j=1:kg
                dvgj=log(det(vg(:,:,j)));
                dvg(j)=dvgj;
                pgj=inv(vg(:,:,j));
                pg(:,:,j)=pgj;
                mfgj=mf-mg(j(wkf),:);
                pgjvf=reshape(pgj*reshape(vf,p,p*kf),p^2,kf); % pg(:,:,j) times all the vf matrices
                klfg(:,j)=0.5*(dvgj-p-dvf+sum(pgjvf(ixdp,:),1)'+sum((mfgj*pgj).*mfgj,2));
            end
        else                            % vf diagonal but vg is full
            for j=1:kg
                dvgj=log(det(vg(:,:,j)));
                dvg(j)=dvgj;
                pgj=inv(vg(:,:,j));
                pg(:,:,j)=pgj;
                mfgj=mf-mg(j(wkf),:);
                klfg(:,j)=0.5*(dvgj-p-dvf+vf*pgj(ixdp)+sum((mfgj*pgj).*mfgj,2));
            end
        end
    else                                % vg is diagonal
        dvg=log(prod(vg,2));            % log(det(vg))
        pg=1./vg;                       % precision matrices pg = inv(vg)
        mg2p=sum(mg.^2.*pg,2)';
        if fvf                          % vf a full matrix, vg diagonal
            vav=reshape(vf,p^2,kf);
            klfg=0.5*(dvg(:,wkf)'-dvf(:,wkg)+vav(ixdp,:)'*pg'-p+mf.^2*pg'+mg2p(wkf,:)-2*mf*(mg.*pg)');
        else                            % vf and vg both diagonal
            klfg=0.5*(dvg(:,wkf)'-dvf(:,wkg)+vf*pg'-p+mf.^2*pg'+mg2p(wkf,:)-2*mf*(mg.*pg)');
        end
    end
    d=wf'*(v_logsum(-klff,2,wf)-v_logsum(-klfg,2,wg));
end
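% As a cross-check of the formulas used above, the diagonal-covariance case of the
% variational approximation from [2] can be sketched in NumPy. This is an illustrative,
% independent re-implementation (not part of VOICEBOX; the function names below are
% invented for the sketch), combining the exact pairwise Gaussian KL divergences with
% the weighted log-sum-exp step corresponding to the final v_logsum line:

```python
import numpy as np

def gauss_kl_diag(mf, vf, mg, vg):
    """Exact KL divergence between every pair of diagonal-covariance
    Gaussians: row i of (mf, vf) against row j of (mg, vg)."""
    p = mf.shape[1]
    ldf = np.log(vf).sum(axis=1)                # log det of each f covariance
    ldg = np.log(vg).sum(axis=1)                # log det of each g covariance
    trace = vf @ (1.0 / vg).T                   # tr(inv(Vg_j) Vf_i) for all pairs
    diff2 = ((mf[:, None, :] - mg[None, :, :]) ** 2
             / vg[None, :, :]).sum(axis=-1)     # Mahalanobis term (mf_i-mg_j)' inv(Vg_j) (mf_i-mg_j)
    return 0.5 * (ldg[None, :] - ldf[:, None] - p + trace + diff2)

def gmm_kl_variational(mf, vf, wf, mg, vg, wg):
    """Variational approximation to D(f||g) from Hershey & Olsen (2007):
    D ~= sum_i wf_i*(log sum_a wf_a exp(-KL(f_i||f_a))
                   - log sum_b wg_b exp(-KL(f_i||g_b)))."""
    klff = gauss_kl_diag(mf, vf, mf, vf)        # intra-f component KL divergences
    klfg = gauss_kl_diag(mf, vf, mg, vg)        # inter f-g component KL divergences

    def wlogsumexp(a, w):                       # row-wise log(sum_j w_j*exp(a_ij)), stabilised
        m = a.max(axis=1)
        return m + np.log(np.exp(a - m[:, None]) @ w)

    return wf @ (wlogsumexp(-klff, wf) - wlogsumexp(-klfg, wg))
```

% For single-component GMMs this reduces to the exact Gaussian KL divergence,
% and for f=g it returns exactly zero, matching the two properties stated in
% the help text above.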