Home > voicebox > gaussmix.m

gaussmix

PURPOSE ^

GAUSSMIX fits a gaussian mixture pdf to a set of data observations [m,v,w,g,f]=(x,c,l,m0,v0,w0)

SYNOPSIS ^

function [m,v,w,g,f,pp,gg]=gaussmix(x,c,l,m0,v0,w0)

DESCRIPTION ^

GAUSSMIX fits a gaussian mixture pdf to a set of data observations [m,v,w,g,f]=(x,c,l,m0,v0,w0)

 Inputs: n data values, k mixtures, p parameters, l loops

     X(n,p)   Input data vectors, one per row.
     c(1)     Minimum variance of normalized data (Use [] to take default value of 1/n^2)
     L        The integer portion of l gives a maximum loop count. The fractional portion gives
              an optional stopping threshold. Iteration will cease if the increase in
              log likelihood density per data point is less than this value. Thus l=10.001 will
              stop after 10 iterations or when the increase in log likelihood falls below
              0.001.
              As a special case, if L=0, then the first three outputs are omitted.
              Use [] to take default value of 100.0001
     M0(k,p)  Initial mixture means, one row per mixture.
     V0(k,p)  Initial mixture variances, one row per mixture.
      or V0(p,p,k)  one full-covariance matrix per mixture
     W0(k,1)  Initial mixture weights, one per mixture. The weights should sum to unity.

     Alternatively, if initial values for M0, V0 and W0 are not given explicitly:

     M0       Number of mixtures required
     V0       Initialization mode:
                'f'    Initialize with K randomly selected data points [default]
                'p'    Initialize with centroids and variances of random partitions
                'k'    k-means algorithm ('kf' and 'kp' determine initialization of kmeans)
                'h'    k-harmonic means algorithm ('hf' and 'hp' determine initialization of kmeans)
                's'    do not scale data during initialization to have equal variances
                'm'    M0 contains the initial centres
                'v'    full covariance matrices
              Mode 'hf' [the default] generally gives the best results but 'f' is faster and often OK

 Outputs: (Note that M, V and W are omitted if L==0)

     M(k,p)   Mixture means, one row per mixture. (omitted if L==0)
     V(k,p)   Mixture variances, one row per mixture. (omitted if L==0)
       or V(p,p,k) if full covariance matrices in use (i.e. either 'v' option or V0(p,p,k) specified)
     W(k,1)   Mixture weights, one per mixture. The weights will sum to unity. (omitted if L==0)
     G       Average log probability of the input data points.
     F        Fisher's Discriminant measures how well the data divides into classes.
              It is the ratio of the between-mixture variance to the average mixture variance: a
              high value means the classes (mixtures) are well separated.
     PP(n,1)  Log probability of each data point
     GG(l+1,1) Average log probabilities at the beginning of each iteration and at the end

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SOURCE CODE ^

0001 function [m,v,w,g,f,pp,gg]=gaussmix(x,c,l,m0,v0,w0)
0002 %GAUSSMIX fits a gaussian mixture pdf to a set of data observations [m,v,w,g,f]=(x,c,l,m0,v0,w0)
0003 %
0004 % Inputs: n data values, k mixtures, p parameters, l loops
0005 %
0006 %     X(n,p)   Input data vectors, one per row.
0007 %     c(1)     Minimum variance of normalized data (Use [] to take default value of 1/n^2)
0008 %     L        The integer portion of l gives a maximum loop count. The fractional portion gives
0009 %              an optional stopping threshold. Iteration will cease if the increase in
0010 %              log likelihood density per data point is less than this value. Thus l=10.001 will
0011 %              stop after 10 iterations or when the increase in log likelihood falls below
0012 %              0.001.
0013 %              As a special case, if L=0, then the first three outputs are omitted.
0014 %              Use [] to take default value of 100.0001
0015 %     M0(k,p)  Initial mixture means, one row per mixture.
0016 %     V0(k,p)  Initial mixture variances, one row per mixture.
0017 %      or V0(p,p,k)  one full-covariance matrix per mixture
0018 %     W0(k,1)  Initial mixture weights, one per mixture. The weights should sum to unity.
0019 %
0020 %     Alternatively, if initial values for M0, V0 and W0 are not given explicitly:
0021 %
0022 %     M0       Number of mixtures required
0023 %     V0       Initialization mode:
0024 %                'f'    Initialize with K randomly selected data points [default]
0025 %                'p'    Initialize with centroids and variances of random partitions
0026 %                'k'    k-means algorithm ('kf' and 'kp' determine initialization of kmeans)
0027 %                'h'    k-harmonic means algorithm ('hf' and 'hp' determine initialization of kmeans)
0028 %                's'    do not scale data during initialization to have equal variances
0029 %                'm'    M0 contains the initial centres
0030 %                'v'    full covariance matrices
0031 %              Mode 'hf' [the default] generally gives the best results but 'f' is faster and often OK
0032 %
0033 % Outputs: (Note that M, V and W are omitted if L==0)
0034 %
0035 %     M(k,p)   Mixture means, one row per mixture. (omitted if L==0)
0036 %     V(k,p)   Mixture variances, one row per mixture. (omitted if L==0)
0037 %       or V(p,p,k) if full covariance matrices in use (i.e. either 'v' option or V0(p,p,k) specified)
0038 %     W(k,1)   Mixture weights, one per mixture. The weights will sum to unity. (omitted if L==0)
0039 %     G       Average log probability of the input data points.
0040 %     F        Fisher's Discriminant measures how well the data divides into classes.
0041 %              It is the ratio of the between-mixture variance to the average mixture variance: a
0042 %              high value means the classes (mixtures) are well separated.
0043 %     PP(n,1)  Log probability of each data point
0044 %     GG(l+1,1) Average log probabilities at the beginning of each iteration and at the end
0045 
0046 %  Bugs/Suggestions
0047 %     (1) Allow processing in chunks by outputting/reinputting an array of sufficient statistics
0048 %     (2) Other initialization options:
0049 %              'l'    LBG algorithm
0050 %              'm'    Move-means (dog-rabbit) algorithm
0051 %     (3) Allow updating of weights-only, not means/variances
0052 
0053 %      Copyright (C) Mike Brookes 2000-2009
0054 %      Version: $Id: gaussmix.m 1243 2011-12-28 11:55:53Z dmb $
0055 %
0056 %   VOICEBOX is a MATLAB toolbox for speech processing.
0057 %   Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
0058 %
0059 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0060 %   This program is free software; you can redistribute it and/or modify
0061 %   it under the terms of the GNU General Public License as published by
0062 %   the Free Software Foundation; either version 2 of the License, or
0063 %   (at your option) any later version.
0064 %
0065 %   This program is distributed in the hope that it will be useful,
0066 %   but WITHOUT ANY WARRANTY; without even the implied warranty of
0067 %   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
0068 %   GNU General Public License for more details.
0069 %
0070 %   You can obtain a copy of the GNU General Public License from
0071 %   http://www.gnu.org/copyleft/gpl.html or by writing to
0072 %   Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
0073 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0074 
0075 [n,p]=size(x);
0076 mx0=sum(x,1)/n;         % calculate mean and variance of input data
0077 vx0=sum(x.^2,1)/n-mx0.^2;
0078 sx0=sqrt(vx0);
0079 sx0(sx0==0)=1;      % do not divide by zero when scaling
0080 scaled=0;           % data is not yet scaled
0081 memsize=voicebox('memsize');    % set memory size to use
0082 if isempty(c)
0083     c=1/n^2;
0084 else
0085     c=c(1);         % just to prevent legacy code failing
0086 end
0087 fulliv=0;           % initial variance is not full
0088 if isempty(l)
0089     l=100+1e-4;         % max loop count + stopping threshold
0090 end
0091 if nargin<6             % no initial values specified for m0, v0, w0
0092     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0093     %  No initialvalues given, so we must use k-means or equivalent
0094     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0095     if nargin<5
0096         v0='hf';                 % default initialization mode: hf
0097     end
0098     if any(v0=='m')
0099         k=size(m0,1);
0100     else
0101         k=m0;
0102     end
0103     fv=any(v0=='v');                % full covariance matrices requested
0104     if n<=k                         % each data point can have its own mixture
0105         xs=(x-mx0(ones(n,1),:))./sx0(ones(n,1),:);          % scale the data
0106         m=xs(mod((1:k)-1,n)+1,:);    % just include all points several times
0107         v=zeros(k,p);               % will be set to floor later
0108         w=zeros(k,1);
0109         w(1:n)=1/n;
0110         if l>0
0111             l=0.1;                  % no point in iterating
0112         end
0113     else                            % more points than mixtures
0114         if any(v0=='s')
0115             xs=x;                   % do not scale data during initialization
0116         else
0117             xs=(x-mx0(ones(n,1),:))./sx0(ones(n,1),:);  % else scale now
0118             if any(v0=='m')
0119                 m=(m0-mx0(ones(k,1),:))./sx0(ones(k,1),:);  % scale specified means as well
0120             end
0121         end
0122         w=repmat(1/k,k,1);                      % all mixtures equally likely
0123         if any(v0=='k')                         % k-means initialization
0124             if any(v0=='m')
0125                 [m,e,j]=kmeans(xs,k,m);
0126             elseif any(v0=='p')
0127                 [m,e,j]=kmeans(xs,k,'p');
0128             else
0129                 [m,e,j]=kmeans(xs,k,'f');
0130             end
0131         elseif any(v0=='h')                     % k-harmonic means initialization
0132             if any(v0=='m')
0133                 [m,e,j]=kmeanhar(xs,k,[],4,m);
0134             else
0135                 if any(v0=='p')
0136                     [m,e,j]=kmeanhar(xs,k,[],4,'p');
0137                 else
0138                     [m,e,j]=kmeanhar(xs,k,[],4,'f');
0139                 end
0140             end
0141         elseif any(v0=='p')                     % Initialize using a random partition
0142             j=ceil(rand(n,1)*k);                % allocate to random clusters
0143             j(rnsubset(k,n))=1:k;               % but force at least one point per cluster
0144             for i=1:k
0145                 m(i,:)=mean(xs(j==i,:),1);
0146             end
0147         else
0148             if any(v0=='m')
0149                 m=m0;                           % use specified centres
0150             else
0151                 m=xs(rnsubset(k,n),:);          % Forgy initialization: sample k centres without replacement [default]
0152             end
0153             [e,j]=kmeans(xs,k,m,0);             % find out the cluster allocation
0154         end
0155         if any(v0=='s')
0156             xs=(x-mx0(ones(n,1),:))./sx0(ones(n,1),:);      % scale data now if not done previously
0157         end
0158         v=zeros(k,p);                   % diagonal covariances
0159         w=zeros(k,1);
0160         for i=1:k
0161             ni=sum(j==i);               % number assigned to this centre
0162             w(i)=(ni+1)/(n+k);          % weight of this mixture
0163             if ni
0164                 v(i,:)=sum((xs(j==i,:)-repmat(m(i,:),ni,1)).^2,1)/ni;
0165             else
0166                 v(i,:)=zeros(1,p);
0167             end
0168         end
0169     end
0170 else
0171     %%%%%%%%%%%%%%%%%%%%%%%%
0172     % use initial values given as input parameters
0173     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0174     [k,p]=size(m0);
0175     xs=(x-mx0(ones(n,1),:))./sx0(ones(n,1),:);          % scale the data
0176     m=(m0-mx0(ones(k,1),:))./sx0(ones(k,1),:);          % and the means
0177     v=v0;
0178     w=w0;
0179     fv=ndims(v)>2 || size(v,1)>k;                       % full covariance matrix is supplied
0180     if fv
0181         mk=eye(p)==0;                                    % off-diagonal elements
0182         fulliv=any(v(repmat(mk,[1 1 k]))~=0);            % check if any are non-zero
0183         if ~fulliv
0184             v=reshape(v(repmat(~mk,[1 1 k])),p,k)'./repmat(sx0.^2,k,1);   % just pick out and scale the diagonal elements for now
0185         else
0186             v=v./repmat(sx0'*sx0,[1 1 k]);              % scale the full covariance matrix
0187         end
0188     end
0189 end
0190 lsx=sum(log(sx0));
0191 if ~fulliv          % initializing with diagonal covariance
0192     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0193     % Diagonal Covariance matrices  %
0194     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0195     v=max(v,c);         % apply the lower bound
0196     xs2=xs.^2;          % square the data for variance calculations
0197 
0198     % If data size is large then do calculations in chunks
0199 
0200     nb=min(n,max(1,floor(memsize/(8*p*k))));    % chunk size for testing data points
0201     nl=ceil(n/nb);                  % number of chunks
0202     jx0=n-(nl-1)*nb;                % size of first chunk
0203 
0204     im=repmat(1:k,1,nb); im=im(:);
0205     th=(l-floor(l))*n;
0206     sd=(nargout > 3*(l~=0)); % = 1 if we are outputting log likelihood values
0207     lp=floor(l)+sd;   % extra loop needed to calculate final G value
0208 
0209     lpx=zeros(1,n);             % log probability of each data point
0210     wk=ones(k,1);
0211     wp=ones(1,p);
0212     wnb=ones(1,nb);
0213     wnj=ones(1,jx0);
0214 
0215     % EM loop
0216 
0217     g=0;                           % dummy initial value for comparison
0218     gg=zeros(lp+1,1);
0219     ss=sd;                       % initialize stopping count (0 or 1)
0220     for j=1:lp
0221         g1=g;                    % save previous log likelihood (2*pi factor omitted)
0222         m1=m;                       % save previous means, variances and weights
0223         v1=v;
0224         w1=w;
0225         vi=-0.5*v.^(-1);                % data-independent scale factor in exponent
0226         lvm=log(w)-0.5*sum(log(v),2);   % log of external scale factor (excluding -0.5*p*log(2pi) term)
0227 
0228         % first do partial chunk
0229 
0230         jx=jx0;
0231         ii=1:jx;
0232         kk=repmat(ii,k,1);
0233         km=repmat(1:k,1,jx);
0234         py=reshape(sum((xs(kk(:),:)-m(km(:),:)).^2.*vi(km(:),:),2),k,jx)+lvm(:,wnj);
0235         mx=max(py,[],1);                % find normalizing factor for each data point to prevent underflow when using exp()
0236         px=exp(py-mx(wk,:));            % find normalized probability of each mixture for each datapoint
0237         ps=sum(px,1);                   % total normalized likelihood of each data point
0238         px=px./ps(wk,:);                % relative mixture probabilities for each data point (columns sum to 1)
0239         lpx(ii)=log(ps)+mx;
0240         pk=sum(px,2);                   % effective number of data points for each mixture (could be zero due to underflow)
0241         sx=px*xs(ii,:);
0242         sx2=px*xs2(ii,:);
0243 
0244         for il=2:nl
0245             ix=jx+1;
0246             jx=jx+nb;                    % increment upper limit
0247             ii=ix:jx;
0248             kk=repmat(ii,k,1);
0249             py=reshape(sum((xs(kk(:),:)-m(im,:)).^2.*vi(im,:),2),k,nb)+lvm(:,wnb);
0250             mx=max(py,[],1);                % find normalizing factor for each data point to prevent underflow when using exp()
0251             px=exp(py-mx(wk,:));            % find normalized probability of each mixture for each datapoint
0252             ps=sum(px,1);                   % total normalized likelihood of each data point
0253             px=px./ps(wk,:);                % relative mixture probabilities for each data point (columns sum to 1)
0254             lpx(ii)=log(ps)+mx;
0255             pk=pk+sum(px,2);                   % effective number of data points for each mixture (could be zero due to underflow)
0256             sx=sx+px*xs(ii,:);
0257             sx2=sx2+px*xs2(ii,:);
0258         end
0259         g=sum(lpx);                    % total log probability summed over all data points
0260         gg(j)=g;
0261         w=pk/n;                         % normalize to get the weights
0262         if pk                       % if all elements of pk are non-zero
0263             m=sx./pk(:,wp);
0264             v=sx2./pk(:,wp);
0265         else
0266             wm=pk==0;                       % mask indicating mixtures with zero weights
0267             nz=sum(wm);                  % number of zero-weight mixtures
0268             [vv,mk]=sort(lpx);             % find the lowest probability data points
0269             m=zeros(k,p);                   % initialize means and variances to zero (variances are floored later)
0270             v=m;
0271             m(wm,:)=xs(mk(1:nz),:);                % set zero-weight mixture means to worst-fitted data points
0272             w(wm)=1/n;                      % set these weights non-zero
0273             w=w*n/(n+nz);                   % normalize so the weights sum to unity
0274             wm=~wm;                         % mask for non-zero weights
0275             m(wm,:)=sx(wm,:)./pk(wm,wp);  % recalculate means and variances for mixtures with a non-zero weight
0276             v(wm,:)=sx2(wm,:)./pk(wm,wp);
0277         end
0278         v=max(v-m.^2,c);                % apply floor to variances
0279 
0280         if g-g1<=th && j>1
0281             if ~ss, break; end  %  stop
0282             ss=ss-1;       % stop next time
0283         end
0284 
0285     end
0286     if sd && ~fv  % we need to calculate the final probabilities
0287         pp=lpx'-0.5*p*log(2*pi)-lsx;   % log of total probability of each data point
0288         gg=gg(1:j)/n-0.5*p*log(2*pi)-lsx;    % average log prob at each iteration
0289         g=gg(end);
0290         %     gg' % *** DEBUG ***
0291         m=m1;       % back up to previous iteration
0292         v=v1;
0293         w=w1;
0294         mm=sum(m,1)/k;
0295         f=(m(:)'*m(:)-k*mm(:)'*mm(:))/sum(v(:));
0296     end
0297     if ~fv
0298         m=m.*sx0(ones(k,1),:)+mx0(ones(k,1),:);  % unscale means
0299         v=v.*repmat(sx0.^2,k,1);                % and variances
0300     else
0301         v1=v;
0302         v=zeros(p,p,k);
0303         mk=eye(p)==1;                                    % mask for diagonal elements
0304         v(repmat(mk,[1 1 k]))=v1';            % set from v1
0305     end
0306 end
0307 if fv              % check if full covariance matrices were requested
0308     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0309     % Full Covariance matrices  %
0310     %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0311     pl=p*(p+1)/2;
0312     lix=1:p^2;
0313     cix=repmat(1:p,p,1);
0314     rix=cix';
0315     lix(cix>rix)=[];                                        % index of lower triangular elements
0316     cix=cix(lix);                                           % index of lower triangular columns
0317     rix=rix(lix);                                           % index of lower triangular rows
0318     dix=find(rix==cix);
0319     lixi=zeros(p,p);
0320     lixi(lix)=1:pl;
0321     lixi=lixi';
0322     lixi(lix)=1:pl;                                        % reverse index to build full matrices
0323     v=reshape(v,p^2,k);
0324     v=v(lix,:)';                                            % lower triangular in rows
0325 
0326     % If data size is large then do calculations in chunks
0327 
0328     nb=min(n,max(1,floor(memsize/(24*p*k))));    % chunk size for testing data points
0329     nl=ceil(n/nb);                  % number of chunks
0330     jx0=n-(nl-1)*nb;                % size of first chunk
0331     %
0332     th=(l-floor(l))*n;
0333     sd=(nargout > 3*(l~=0)); % = 1 if we are outputting log likelihood values
0334     lp=floor(l)+sd;   % extra loop needed to calculate final G value
0335     %
0336     lpx=zeros(1,n);             % log probability of each data point
0337     wk=ones(k,1);
0338     wp=ones(1,p);
0339 
0340     wpl=ones(1,pl);       % 1 index for lower triangular matrix
0341     wnb=ones(1,nb);
0342     wnj=ones(1,jx0);
0343 
0344     % EM loop
0345 
0346     g=0;                           % dummy initial value for comparison
0347     gg=zeros(lp+1,1);
0348     ss=sd;                       % initialize stopping count (0 or 1)
0349     vi=zeros(p*k,p);                    % stack of k inverse cov matrices each size p*p
0350     vim=zeros(p*k,1);                   % stack of k vectors of the form inv(v)*m
0351     mtk=vim;                             % stack of k vectors of the form m
0352     lvm=zeros(k,1);
0353     wpk=repmat((1:p)',k,1);
0354     for j=1:lp
0355         g1=g;                    % save previous log likelihood (2*pi factor omitted)
0356         m1=m;                       % save previous means, variances and weights
0357         v1=v;
0358         w1=w;
0359 
0360         for ik=1:k
0361 
0362             % these lines added for debugging only
0363             %             vk=reshape(v(k,lixi),p,p);
0364             %             condk(ik)=cond(vk);
0365             %%%%%%%%%%%%%%%%%%%%
0366             [uvk,dvk]=eig(reshape(v(ik,lixi),p,p));      % convert lower triangular to full and find eigenvalues
0367             dvk=max(diag(dvk),c);                           % apply variance floor to eigenvalues
0368             vik=-0.5*uvk*diag(dvk.^(-1))*uvk';   % calculate inverse
0369             vi((ik-1)*p+(1:p),:)=vik;           % vi contains all mixture inverses stacked on top of each other
0370             vim((ik-1)*p+(1:p))=vik*m(ik,:)';   % vim contains vi*m for all mixtures stacked on top of each other
0371             mtk((ik-1)*p+(1:p))=m(ik,:)';       % mtk contains all mixture means stacked on top of each other
0372             lvm(ik)=log(w(ik))-0.5*sum(log(dvk));       % vm contains the weighted sqrt of det(vi) for each mixture
0373         end
0374         %
0375         %         % first do partial chunk
0376         %
0377         jx=jx0;
0378         ii=1:jx;
0379         xii=xs(ii,:).';
0380         py=reshape(sum(reshape((vi*xii-vim(:,wnj)).*(xii(wpk,:)-mtk(:,wnj)),p,jx*k),1),k,jx)+lvm(:,wnj);
0381         mx=max(py,[],1);                % find normalizing factor for each data point to prevent underflow when using exp()
0382         px=exp(py-mx(wk,:));  % find normalized probability of each mixture for each datapoint
0383         ps=sum(px,1);                   % total normalized likelihood of each data point
0384         px=px./ps(wk,:);                % relative mixture probabilities for each data point (columns sum to 1)
0385         lpx(ii)=log(ps)+mx;
0386         pk=sum(px,2);                   % effective number of data points for each mixture (could be zero due to underflow)
0387         sx=px*xs(ii,:);
0388         sx2=px*(xs(ii,rix).*xs(ii,cix));            % accumulator for variance calculation (lower tri cov matrix as a row)
0389 
0390         for il=2:nl
0391             ix=jx+1;
0392             jx=jx+nb;        % increment upper limit
0393             ii=ix:jx;
0394             xii=xs(ii,:).';
0395             py=reshape(sum(reshape((vi*xii-vim(:,wnb)).*(xii(wpk,:)-mtk(:,wnb)),p,nb*k),1),k,nb)+lvm(:,wnb);
0396             mx=max(py,[],1);                % find normalizing factor for each data point to prevent underflow when using exp()
0397             px=exp(py-mx(wk,:));  % find normalized probability of each mixture for each datapoint
0398             ps=sum(px,1);                   % total normalized likelihood of each data point
0399             px=px./ps(wk,:);                % relative mixture probabilities for each data point (columns sum to 1)
0400             lpx(ii)=log(ps)+mx;
0401             pk=pk+sum(px,2);                % effective number of data points for each mixture (could be zero due to underflow)
0402             sx=sx+px*xs(ii,:);               % accumulator for mean calculation
0403             sx2=sx2+px*(xs(ii,rix).*xs(ii,cix));            % accumulator for variance calculation
0404         end
0405         g=sum(lpx);                    % total log probability summed over all data points
0406         gg(j)=g;                        % save convergence history
0407         w=pk/n;                         % normalize to get the column of weights
0408         if pk                       % if all elements of pk are non-zero
0409             m=sx./pk(:,wp);         % find mean and mean square
0410             v=sx2./pk(:,wpl);
0411         else
0412             wm=pk==0;                       % mask indicating mixtures with zero weights
0413             nz=sum(wm);                  % number of zero-weight mixtures
0414             [vv,mk]=sort(lpx);             % find the lowest probability data points
0415             m=zeros(k,p);                   % initialize means and variances to zero (variances are floored later)
0416             v=zeros(k,pl);
0417             m(wm,:)=xs(mk(1:nz),:);                % set zero-weight mixture means to worst-fitted data points
0418             w(wm)=1/n;                      % set these weights non-zero
0419             w=w*n/(n+nz);                   % normalize so the weights sum to unity
0420             wm=~wm;                         % mask for non-zero weights
0421             m(wm,:)=sx(wm,:)./pk(wm,wp);  % recalculate means and variances for mixtures with a non-zero weight
0422             v(wm,:)=sx2(wm,:)./pk(wm,wpl);
0423         end
0424         v=v-m(:,cix).*m(:,rix);                 % subtract off mean squared
0425         if g-g1<=th && j>1
0426             if ~ss, break; end  %  stop
0427             ss=ss-1;       % stop next time
0428         end
0429     end
0430     if sd  % we need to calculate the final probabilities
0431         pp=lpx'-0.5*p*log(2*pi)-lsx;   % log of total probability of each data point
0432         gg=gg(1:j)/n-0.5*p*log(2*pi)-lsx;    % average log prob at each iteration
0433         g=gg(end);
0434         %             gg' % *** DEBUG ONLY ***
0435         m=m1;       % back up to previous iteration
0436         v=zeros(p,p,k);
0437         trv=0;      % sum of variance matrix traces
0438         for ik=1:k
0439             [uvk,dvk]=eig(reshape(v1(ik,lixi),p,p));      % convert lower triangular to full and find eigenvectors
0440             dvk=max(diag(dvk),c);                       % apply variance floor
0441             v(:,:,ik)=uvk*diag(dvk)*uvk';          % reconstitute full matrix
0442             trv=trv+sum(dvk);
0443         end
0444         w=w1;
0445         mm=sum(m,1)/k;
0446         f=(m(:)'*m(:)-k*mm(:)'*mm(:))/trv;
0447     else
0448         v1=v;               % lower triangular form
0449         v=zeros(p,p,k);
0450         for ik=1:k
0451             [uvk,dvk,]=eig(reshape(v1(ik,lixi),p,p));      % convert lower triangular to full and find eigenvectors
0452             dvk=max(diag(dvk),c);                       % apply variance floor
0453             v(:,:,ik)=uvk*diag(dvk)*uvk';          % reconstitute full matrix
0454         end
0455     end
0456     m=m.*sx0(ones(k,1),:)+mx0(ones(k,1),:);  % unscale means
0457     v=v.*repmat(sx0'*sx0,[1 1 k]);
0458 end
0459 if l==0         % suppress the first three output arguments if l==0
0460     m=g;
0461     v=f;
0462     w=pp;
0463 end
0464

Generated on Thu 02-Feb-2012 09:15:04 by m2html © 2003