distisar

PURPOSE

DISTISAR calculates the Itakura-Saito distance between AR coefficients D=DISTISAR(AR1,AR2,MODE)

SYNOPSIS

function d=distisar(ar1,ar2,mode)

DESCRIPTION

DISTISAR calculates the Itakura-Saito distance between AR coefficients D=DISTISAR(AR1,AR2,MODE)

 Inputs: AR1,AR2     AR coefficient sets to be compared. Each row contains a set of coefficients.
                     AR1 and AR2 must have the same number of columns.

         MODE        Character string selecting the following options:
                         'x'  Calculate the full distance matrix from every row of AR1 to every row of AR2
                         'd'  Calculate only the distance between corresponding rows of AR1 and AR2
                              The default is 'd' if AR1 and AR2 have the same number of rows, otherwise 'x'.

 Output: D           If MODE='d' then D is a column vector with min(rows of AR1, rows of AR2) elements.
                     If MODE='x' then D is a matrix with one row per row of AR1 and one column per row of AR2.
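
 The shapes produced by the two modes can be illustrated with a spectral-domain stand-in.
 This is only a sketch in base MATLAB: it does not call DISTISAR, the coefficient values are
 arbitrary stable examples, and the helper isd and the grid size n are assumptions made for
 the illustration.

```matlab
% Arbitrary stable AR coefficient sets, one set per row (illustrative values only).
ar1 = [1 -0.9 0.2; 1 -0.5 0.1; 1 0.3 0.4];    % 3 sets
ar2 = [1 -0.8 0.15; 1 0.2 0.3];               % 2 sets
n   = 1024;                                   % number of frequency samples (assumed large enough)

% Power spectrum of each all-pole filter 1/A(z), sampled around the whole unit circle.
pf1 = 1./abs(fft(ar1,n,2)).^2;                % 3 x n
pf2 = 1./abs(fft(ar2,n,2)).^2;                % 2 x n

% Hypothetical helper: Itakura-Saito distance averaged over frequency, row by row.
isd = @(p,q) mean(p./q - log(p./q) - 1, 2);

% 'x'-style result: every row of ar1 against every row of ar2.
nf1 = size(ar1,1);
nf2 = size(ar2,1);
dx  = zeros(nf1,nf2);
for i = 1:nf1
    for j = 1:nf2
        dx(i,j) = isd(pf1(i,:), pf2(j,:));
    end
end

% 'd'-style result: corresponding rows only, limited to the smaller row count.
nx = min(nf1,nf2);
dd = isd(pf1(1:nx,:), pf2(1:nx,:));           % nx x 1 column vector
```

 Here dx is 3x2 and dd is 2x1, with dd equal to the leading diagonal of dx. If VOICEBOX is on
 the path, distisar(ar1,ar2,'x') and distisar(ar1,ar2,'d') are expected to give closely
 matching values.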

 The Itakura-Saito spectral distance is the average over positive and negative frequencies of

                      pf1/pf2 - log(pf1/pf2) - 1     =     exp(v) - v - 1         where v=log(pf1/pf2)

 The Itakura-Saito distance is asymmetric: pf1>pf2 contributes more to the distance than pf2>pf1. 
 A symmetrical version is the COSH distance: distchpf(x,y)=(distispf(x,y)+distispf(y,x))/2
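
 A small numerical illustration of the asymmetry and of the COSH symmetrisation, using two
 made-up power spectra on a common frequency grid. The helper isd is a hypothetical stand-in
 for the per-frequency formula above, not a VOICEBOX function.

```matlab
% Two arbitrary positive power spectra; they differ in a single frequency bin.
pf1 = [4 1 1 1];
pf2 = [1 1 1 1];

% Hypothetical helper: average of p/q - log(p/q) - 1 over the grid.
isd = @(p,q) mean(p./q - log(p./q) - 1);

d12 = isd(pf1, pf2);        % the bin where pf1 > pf2 is penalised heavily
d21 = isd(pf2, pf1);        % the same bin, roles swapped, is penalised less
dch = (d12 + d21)/2;        % symmetrised (COSH) distance

% The symmetrised form equals the average of cosh(v) - 1 with v = log(pf1./pf2).
dch_direct = mean(cosh(log(pf1./pf2)) - 1);
```

 With these values d12 exceeds d21, and dch agrees with dch_direct to rounding error.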

 The I-S distance can be expressed as ar2*toeplitz(lpcar2rr(ar1))*ar2' + log((ar1(1)/ar2(1)).^2) - 1
 but this is not how we actually calculate it.
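
 The quadratic-form expression can be checked against the frequency-domain definition with
 base MATLAB only. In this sketch the autocorrelation of the AR1 filter is obtained from an
 inverse FFT of its sampled power spectrum, as an assumed substitute for lpcar2rr; the
 coefficient values and the grid size n are arbitrary.

```matlab
% Arbitrary stable AR coefficient sets; ar2 has a non-unit leading coefficient
% so that the log term is exercised.
ar1 = [1 -0.9 0.2];
ar2 = [2 -1.6 0.3];
n   = 1024;                              % frequency samples (assumed large enough)

% Sampled power spectra of the all-pole filters 1/A1(z) and 1/A2(z).
pf1 = 1./abs(fft(ar1,n)).^2;
pf2 = 1./abs(fft(ar2,n)).^2;

% Autocorrelation of the impulse response of 1/A1(z): inverse FFT of pf1.
% Time-domain aliasing is negligible here because the poles lie well inside the unit circle.
rr1 = real(ifft(pf1));
R   = toeplitz(rr1(1:numel(ar2)));       % (p+1) x (p+1) autocorrelation matrix

% Quadratic-form expression from the text.
dq = ar2*R*ar2' + log((ar1(1)/ar2(1))^2) - 1;

% Direct average of pf1/pf2 - log(pf1/pf2) - 1 over frequency.
ds = mean(pf1./pf2 - log(pf1./pf2) - 1);
```

 The two values dq and ds should agree to within rounding error for filters of this kind.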

SOURCE CODE

function d=distisar(ar1,ar2,mode)
%DISTISAR calculates the Itakura-Saito distance between AR coefficients D=DISTISAR(AR1,AR2,MODE)
%
% Inputs: AR1,AR2     AR coefficient sets to be compared. Each row contains a set of coefficients.
%                     AR1 and AR2 must have the same number of columns.
%
%         MODE        Character string selecting the following options:
%                         'x'  Calculate the full distance matrix from every row of AR1 to every row of AR2
%                         'd'  Calculate only the distance between corresponding rows of AR1 and AR2
%                              The default is 'd' if AR1 and AR2 have the same number of rows, otherwise 'x'.
%
% Output: D           If MODE='d' then D is a column vector with min(rows of AR1, rows of AR2) elements.
%                     If MODE='x' then D is a matrix with one row per row of AR1 and one column per row of AR2.
%
% The Itakura-Saito spectral distance is the average over positive and negative frequencies of
%
%                      pf1/pf2 - log(pf1/pf2) - 1     =     exp(v) - v - 1         where v=log(pf1/pf2)
%
% The Itakura-Saito distance is asymmetric: pf1>pf2 contributes more to the distance than pf2>pf1.
% A symmetrical version is the COSH distance: distchpf(x,y)=(distispf(x,y)+distispf(y,x))/2
%
% The I-S distance can be expressed as ar2*toeplitz(lpcar2rr(ar1))*ar2' + log((ar1(1)/ar2(1)).^2) - 1
% but this is not how we actually calculate it.


% Since the power spectrum is the Fourier transform of the autocorrelation, we can calculate
% the average value of p1/p2 by taking the 0'th order term of the convolution of the autocorrelation
% functions associated with p1 and 1/p2. Since 1/p2 corresponds to an FIR filter, this convolution is
% a finite sum even though the autocorrelation function of p1 is infinite in extent.
% The average value of log(pf1) is equal to log(ar1(1)^-2) where ar1(1) is the 0'th order AR coefficient.

% The Itakura-Saito distance can also be calculated directly from the power spectra; providing np is large
% enough, the values of d0 and d1 in the following will be very similar:
%
%         np=255; d0=distisar(ar1,ar2); d1=distispf(lpcar2pf(ar1,np),lpcar2pf(ar2,np))
%
% Autocorrelation LPC analysis is equivalent to minimizing the Itakura-Saito distance between the
% signal spectrum and that of the all-pole LPC filter, i.e. distispf(pf,lpcar2pf(ar0,np)).
% Moreover, if ar0 is the LPC filter and ar is any other all-pole filter, the I-S distance has the
% following additive property:
%
%               distispf(pf,lpcar2pf(ar,np)) = distispf(pf,lpcar2pf(ar0,np)) + distisar(ar0,ar)

% Ref: A.H.Gray Jr and J.D.Markel, "Distance measures for speech processing", IEEE ASSP-24(5): 380-391, Oct 1976
%      L. Rabiner and B-H Juang, "Fundamentals of Speech Recognition", Section 4.5, Prentice-Hall 1993, ISBN 0-13-015157-2
%      F.Itakura & S.Saito, "A statistical method for estimation of speech spectral density and formant frequencies",
%                            Electronics & Communications in Japan, 53A: 36-43, 1970.

%      Copyright (C) Mike Brookes 1997
%      Version: $Id: distisar.m 713 2011-10-16 14:45:43Z dmb $
%
%   VOICEBOX is a MATLAB toolbox for speech processing.
%   Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%   This program is free software; you can redistribute it and/or modify
%   it under the terms of the GNU General Public License as published by
%   the Free Software Foundation; either version 2 of the License, or
%   (at your option) any later version.
%
%   This program is distributed in the hope that it will be useful,
%   but WITHOUT ANY WARRANTY; without even the implied warranty of
%   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
%   GNU General Public License for more details.
%
%   You can obtain a copy of the GNU General Public License from
%   http://www.gnu.org/copyleft/gpl.html or by writing to
%   Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

nf1=size(ar1,1);                % number of coefficient sets in ar1
nf2=size(ar2,1);                % number of coefficient sets in ar2
m2=lpcar2ra(ar2);               % autocorrelation of the ar2 (inverse filter) coefficients
m2(:,1)=m2(:,1)*0.5;            % halve the lag-0 term so that doubling the sum counts it once
if nargin<3 || isempty(mode)
    mode='0';                   % no option given: 'd' or 'x' is chosen from the row counts below
end
if any(mode=='d') || (~any(mode=='x') && nf1==nf2)
    % distance between corresponding rows only
    nx=min(nf1,nf2);
    d=2*sum(lpcar2rr(ar1(1:nx,:)).*m2(1:nx,:),2)-log((ar2(1:nx,1)./ar1(1:nx,1)).^2)-1;
else
    % full distance matrix: rows of ar1 against rows of ar2
    d=2*lpcar2rr(ar1)*m2'-log((ar1(:,1).^(-1)*ar2(:,1)').^2)-1;
end
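
The autocorrelation-domain calculation used in the listing (a finite dot product between the
autocorrelation of the AR1 filter and the autocorrelation of the AR2 coefficients, with the lag-0
term halved) can be reproduced for a single pair of filters in base MATLAB. This is a sketch
under assumptions: conv and ifft stand in for lpcar2ra and lpcar2rr, and the coefficient values
and grid size n are arbitrary.

```matlab
% Arbitrary stable AR coefficient sets (single row each).
ar1 = [1 -0.9 0.2];
ar2 = [2 -1.6 0.3];
p   = numel(ar2) - 1;                    % AR order
n   = 1024;                              % frequency samples (assumed large enough)

% Autocorrelation of the impulse response of 1/A1(z), lags 0..p (assumed stand-in for lpcar2rr).
pf1 = 1./abs(fft(ar1,n)).^2;
rr1 = real(ifft(pf1));
rr1 = rr1(1:p+1);

% Autocorrelation of the ar2 coefficients, lags 0..p (assumed stand-in for lpcar2ra).
c   = conv(ar2, fliplr(ar2));            % lags -p..p, lag 0 at index p+1
m2  = c(p+1:end);
m2(1) = 0.5*m2(1);                       % halve lag 0 so that doubling the sum counts it once

% Same arithmetic as the 'd' branch of the listing, for one pair of filters.
d = 2*sum(rr1.*m2) - log((ar2(1)/ar1(1))^2) - 1;

% Cross-check against the direct frequency-domain average.
pf2 = 1./abs(fft(ar2,n)).^2;
ds  = mean(pf1./pf2 - log(pf1./pf2) - 1);
```

Only p+1 lags of the infinite autocorrelation of the AR1 filter are needed because the
autocorrelation of the AR2 coefficients is zero beyond lag p; d and ds should agree to within
rounding error.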

Generated on Tue 10-Oct-2017 08:30:10 by m2html © 2003