V_DISTITAR calculates the Itakura distance between AR coefficients D=(AR1,AR2,MODE)

Inputs:
  AR1,AR2   AR coefficient sets to be compared. Each row contains a set of coefficients.
            AR1 and AR2 must have the same number of columns.
  MODE      Character string selecting the following options:
            'x'  Calculate the full distance matrix from every row of AR1 to every row of AR2
            'd'  Calculate only the distance between corresponding rows of AR1 and AR2
                 The default is 'd' if AR1 and AR2 have the same number of rows, otherwise 'x'.
            'e'  Calculates exp(d) instead of d (quicker because no log is necessary)

Output:
  D         If MODE='d' then D is a column vector with the same number of rows as the shorter of AR1 and AR2.
            If MODE='x' then D is a matrix with one row per row of AR1 and one column per row of AR2.

If ave() denotes the average over +ve and -ve frequency, the Itakura spectral distance is

    log(ave(pf1/pf2)) - ave(log(pf1/pf2))

The Itakura distance is gain-independent, i.e. v_distitpf(f*pf1,g*pf2) is independent of f and g.

The Itakura distance may be expressed as log(ar2*toeplitz(v_lpcar2rr(ar1))*ar2') where the ar1 and ar2 polynomials have first been normalised by dividing through by their 0'th order coefficients.
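The two expressions for the distance given above can be written out as follows. This is a sketch in notation introduced here for illustration, not taken from the VOICEBOX source: P_1 and P_2 are the power spectra, A_i(z) is the i-th AR polynomial normalised so that its 0'th order coefficient is 1, a_2 is the row vector of its coefficients for model 2, and R_1 is the Toeplitz autocorrelation matrix of model 1. It assumes both AR filters are stable, so that the average of log P_i is zero after normalisation.

```latex
\begin{align*}
d &= \log\!\left(\frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{P_1(\omega)}{P_2(\omega)}\,d\omega\right)
   - \frac{1}{2\pi}\int_{-\pi}^{\pi}\log\frac{P_1(\omega)}{P_2(\omega)}\,d\omega ,
   \qquad P_i(\omega)=\frac{1}{\lvert A_i(e^{j\omega})\rvert^{2}} \\[4pt]
\frac{1}{2\pi}\int_{-\pi}^{\pi}\log P_i(\omega)\,d\omega &= 0
   \quad\text{(monic } A_i \text{ with all roots inside the unit circle)} \\[4pt]
d &= \log\!\left(\frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\lvert A_2(e^{j\omega})\rvert^{2}}{\lvert A_1(e^{j\omega})\rvert^{2}}\,d\omega\right)
   = \log\!\left(a_2\,R_1\,a_2^{\mathsf T}\right),
   \qquad [R_1]_{mn}=\frac{1}{2\pi}\int_{-\pi}^{\pi}P_1(\omega)\,e^{j\omega(m-n)}\,d\omega
\end{align*}
```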

- v_lpcar2ra V_LPCAR2RA Convert ar filter to inverse filter autocorrelation coefs. RA=(AR)
- v_lpcar2rr V_LPCAR2RR Convert autoregressive coefficients to autocorrelation coefficients RR=(AR,P)

- v_fxrapt V_FXRAPT RAPT pitch tracker [FX,VUV]=(S,FS,M,Q)

function d=v_distitar(ar1,ar2,mode)
%V_DISTITAR calculates the Itakura distance between AR coefficients D=(AR1,AR2,MODE)
%
% Inputs: AR1,AR2     AR coefficient sets to be compared. Each row contains a set of coefficients.
%                     AR1 and AR2 must have the same number of columns.
%
%         MODE        Character string selecting the following options:
%                         'x'  Calculate the full distance matrix from every row of AR1 to every row of AR2
%                         'd'  Calculate only the distance between corresponding rows of AR1 and AR2
%                              The default is 'd' if AR1 and AR2 have the same number of rows otherwise 'x'.
%                         'e'  Calculates exp(d) instead of d (quicker because no log is necessary)
%
% Output: D           If MODE='d' then D is a column vector with the same number of rows as the shorter of AR1 and AR2.
%                     If MODE='x' then D is a matrix with the same number of rows as AR1 and the same number of columns as AR2'.
%
% If ave() denotes the average over +ve and -ve frequency, the Itakura spectral distance is
%
%                               log(ave(pf1/pf2)) - ave(log(pf1/pf2))
%
% The Itakura distance is gain-independent, i.e. v_distitpf(f*pf1,g*pf2) is independent of f and g.
%
% The Itakura distance may be expressed as log(ar2*toeplitz(v_lpcar2rr(ar1))*ar2') where the ar1 and ar2 polynomials
% have first been normalised by dividing through by their 0'th order coefficients.

% Since the power spectrum is the Fourier transform of the autocorrelation, we can calculate
% the average value of p1/p2 by taking the 0'th order term of the convolution of the autocorrelation
% functions associated with p1 and 1/p2. Since 1/p2 corresponds to an FIR filter, this convolution is
% a finite sum even though the autocorrelation function of p1 is infinite in extent.
% The average value of log(pf1) is equal to log(ar1(1)^-2) where ar1(1) is the 0'th order AR coefficient.

% The Itakura distance can also be calculated directly from the power spectra; providing np is large
% enough, the values of d0 and d1 in the following will be very similar:
%
%         np=255; d0=v_distitar(ar1,ar2); d1=v_distitpf(v_lpcar2pf(ar1,np),v_lpcar2pf(ar2,np))
%

% Ref: A.H.Gray Jr and J.D.Markel, "Distance measures for speech processing", IEEE ASSP-24(5): 380-391, Oct 1976
%      L. Rabiner and B-H Juang, "Fundamentals of Speech Recognition", Section 4.5, Prentice-Hall 1993, ISBN 0-13-015157-2
%      F. Itakura, "Minimum prediction residual principle applied to speech recognition", IEEE ASSP-23: 62-72, 1975

%      Copyright (C) Mike Brookes 1997
%      Version: $Id: v_distitar.m 10865 2018-09-21 17:22:45Z dmb $
%
%   VOICEBOX is a MATLAB toolbox for speech processing.
%   Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%   This program is free software; you can redistribute it and/or modify
%   it under the terms of the GNU General Public License as published by
%   the Free Software Foundation; either version 2 of the License, or
%   (at your option) any later version.
%
%   This program is distributed in the hope that it will be useful,
%   but WITHOUT ANY WARRANTY; without even the implied warranty of
%   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
%   GNU General Public License for more details.
%
%   You can obtain a copy of the GNU General Public License from
%   http://www.gnu.org/copyleft/gpl.html or by writing to
%   Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

[nf1,p1]=size(ar1);
nf2=size(ar2,1);
m2=v_lpcar2ra(ar2);                 % autocorrelation of the ar2 (inverse filter) coefficients
m2(:,1)=0.5*m2(:,1);                % halve the lag-0 term so that the factor of 2 below counts it once
if nargin<3 || isempty(mode) mode='0'; end
if any(mode=='d') || (~any(mode=='x') && nf1==nf2)
    % distance between corresponding rows only
    nx=min(nf1,nf2);
    d=2*sum(v_lpcar2rr(ar1(1:nx,:)).*m2(1:nx,:),2).*((ar1(1:nx,1)./ar2(1:nx,1)).^2);
else
    % full distance matrix: every row of ar1 against every row of ar2
    d=2*v_lpcar2rr(ar1)*m2'.*((ar1(:,1)*ar2(:,1)'.^(-1)).^2);
end
if all(mode~='e')
    d=log(d);                       % omitted when 'e' is given, so exp(d) is returned
end
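The equivalence between the matrix form and the spectral form described in the help text can be checked numerically. The following is a minimal sketch using only base MATLAB functions (filter, toeplitz, fft), so it does not call VOICEBOX itself. The coefficients a1 and a2, the truncation length n and the FFT size nfft are illustrative choices, not values from the toolbox, and the autocorrelation is approximated from a truncated impulse response rather than computed the way v_lpcar2rr does it.

```matlab
% Hypothetical example coefficients (not from VOICEBOX); both filters are stable
% and already normalised so that the 0'th order coefficient is 1.
a1 = [1 -0.9];
a2 = [1 -0.5];
p  = numel(a1) - 1;                 % model order

% Autocorrelation of the impulse response of 1/A1, truncated to n samples.
n  = 2000;
h  = filter(1, a1, [1 zeros(1, n-1)]);
r1 = zeros(1, p+1);
for k = 0:p
    r1(k+1) = sum(h(1:n-k) .* h(1+k:n));
end

% Matrix form: log(ar2*toeplitz(rr1)*ar2')
d_mat = log(a2 * toeplitz(r1) * a2');

% Spectral form: sample both power spectra around the whole unit circle,
% which covers +ve and -ve frequencies.
nfft = 4096;
pf1  = 1 ./ abs(fft(a1, nfft)).^2;
pf2  = 1 ./ abs(fft(a2, nfft)).^2;
d_fft = log(mean(pf1 ./ pf2)) - mean(log(pf1 ./ pf2));

fprintf('matrix form: %.6f, spectral form: %.6f\n', d_mat, d_fft);
```

The two printed values should agree closely. If VOICEBOX is on the path, v_distitar(a1,a2) would be expected to give a similar figure.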
