v_filtbankm

PURPOSE ^

V_FILTBANKM determine matrix for a linear/mel/erb/bark-spaced v_filterbank [X,IL,IH]=(P,N,FS,FL,FH,W)

SYNOPSIS ^

function [x,cf,xi,il,ih]=v_filtbankm(p,n,fs,fl,fh,w)

DESCRIPTION ^

V_FILTBANKM determine matrix for a linear/mel/erb/bark-spaced v_filterbank [X,IL,IH]=(P,N,FS,FL,FH,W)

 Usage:
 (1) Calcuate the Mel-frequency Cepstral Coefficients

        f=v_rfft(s);                                    % v_rfft() returns only 1+floor(n/2) coefficients
         x=v_filtbankm(p,n,fs,0,fs/2,'m');              % n is the fft length, p is the number of filters wanted
         z=log(x*abs(f).^2);                            % multiply the power spectrum by x to get log mel-spectrum
         c=dct(z);                                      % take the DCT to get the mel-cepstrum

 (2) Calcuate the Mel-frequency Cepstral Coefficients efficiently

        f=fft(s);                                      % n is the fft length, p is the number of filters wanted
        [x,cf,na,nb]=v_filtbankm(p,n,fs,0,fs/2,'m');   % na:nb gives the fft bins that are needed
        z=log(x*(f(na:nb)).*conj(f(na:nb)));           % multiply x by the power spectrum
         c=dct(z);                                      % take the DCT

 (3) Plot the calculated filterbanks as a graph or spectrogram

        v_filtbankm(p,n,fs,0,fs/2,'mg');               % use option 'mg' for a graph or 'mG' for a spectrogram

 (4) Convert to mel-spectrum and back again

        [x,cf,xi]=v_filtbankm(p,n,fs,0,fs/2,'mxXzq');  % n is the fft length, p is the number of filters wanted
        f=v_rfft(s);                                    % v_rfft() returns only 1+floor(n/2) coefficients        
         z=x*abs(f).^2;                                 % multiply the power spectrum by x to get mel-spectrum
        gp=xi*z;                                       % multiply by xi to recover the approximate power spectrum
         g=v_irfft(sqrt(gp).*exp(1i*angle(f)));         % take the inverse DFT using the original phase to recover the time domain signal 

 Inputs:
       p   number of filters in v_filterbank or the filter spacing in k-mel/bark/erb (see 'p' and 'P' options) [ceil(4.6*log10(fs))]
        n   length of dft
        fs  sample rate in Hz
        fl  low end of the lowest filter in Hz (or in mel/erb/bark/log10 with 'h' option) [default = 0Hz or, if 'l' option given, 30Hz]
        fh  high end of highest filter in Hz (or in mel/erb/bark/log10 with 'h' option) [default = fs/2]
        w   any sensible combination of the following:

             'b' = bark scale
             'e' = erb-rate scale
             'l' = log10 Hz frequency scale
             'f' = linear frequency scale [default]
             'm' = mel frequency scale

             'n' = round to the nearest FFT bin so each row of x contains only one non-zero entry

             'c' = fl specifies centre of low filters instead of low edge
             'C' = fh specifies centre of high filter instead of high edge
             'h' = fl & fh are in mel/erb/bark/log10 instead of Hz
             'H' = give cf outputs in mel/erb/bark/log10 instead of Hz

              'x' = lowest filter remains at 1 down to 0 frequency
             'X' = highest filter remains at 1 up to nyquist freqency

             'p' = input p specifies the number of filters [default if p>=1]
             'P' = input p specifies the approximate filter spacing in kHz/kmel/... [default if p<1]

             'z' = Treat input power spectrum at 0Hz as an impulse rather than being diffuse
             'Z' = Treat input power spectrum at 0Hz as the sum of an impulse and a continuous component with the same amlitude as the adjacent bin
             'q' = The first output filter gives the power of the impulse at 0Hz (regardless of the 'D' option). 'zq' ensures exact retention of DC component by zi*z

             'd' = input is power spectral density (power per Hz) instead of power
             'D' = output is power spectral density (power per Hz) instead of power

             's' = single-sided input: do not add power from symmetric negative frequencies (i.e. non-DC/Nyquist inputs have already been doubled)
             'S' = single-sided output: include power from both positive and negative frequencies (this doubles the non-DC/Nyquist outputs)
             'w' = size(x,2)=size(xi,1)=n rather than 1+floor(n/2) although the rightmost half of x is all zeros

             'g' = plot filter coefficients as graph
             'G' = plot filter coefficients as spectrogram image [default if no output arguments present]

           Legacy options: 'y'='xX', 'Y'='x', 'yY'='X', 'u'='dD', 'U'='D'

 Outputs:    x(p,k)  a sparse matrix containing the v_filterbank amplitudes
                    If the il and ih output arguments are included then k=ih-il+1 otherwise k=1+floor(n/2)
                   Note that, with the 'S' option, the peak filter values equal 2 to account for the energy in the negative FFT frequencies.
           cf(p)   the v_filterbank centre frequencies in Hz (or in mel/erb/bark/log10 with 'H' option)
           xi(k,p) [optional] sparse matrix that is an approximate inverse of x
            il      the lowest fft bin with a non-zero coefficient
            ih      the highest fft bin with a non-zero coefficient

 The input power will be preserved if the options 'xXS' are given

 The output of the routine is a sparse filterbank matrix. The vector output of the filterbank can then be obtained
 by pre-multiplying an input power spectrum vector by the filterbank matrix. The input and output vectors can optionally
 be in either the power domain or the power spectral density domain.
 The routine implements the filterbank in two conceptual stages (which are merged in the practical implementation):

 Stage 1:
 The discrete input spectrum is converted to a continuous power spectral density using linear interpolation in frequency.
 Each element of the input spectrum influences a frequency interval of width 2d where d is the input frequency bin width.
 The DC component of the input is treated specially in one of three ways: (a) it can be treated as a normal element that
 influences an interval (-d,+d) like the other elements [default]; (b) it can be treated as an impulse at DC ['z' option];
 (c) it can be treated as a mixture of an impulse and a normal component whose value equals that of the adjacent frequency
 bin ['Z' option].

 Stage 2:
 The filterbank outputs are calculated by integrating the product of the continuous spectrum and a filter weight that is
 triangular in the frequency domain. Optionally, the first filterbank preserves the DC impulse component of the continuous
 spectrum ['q' option].

 References:

 [1] S. S. Stevens, J. Volkman, and E. B. Newman. A scale for the measurement
     of the psychological magnitude of pitch. J. Acoust Soc Amer, 8: 185-190, 1937.
 [2] S. Davis and P. Mermelstein. Comparison of parametric representations for
     monosyllabic word recognition in continuously spoken sentences.
     IEEE Trans Acoustics Speech and Signal Processing, 28 (4): 357-366, Aug. 1980.

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SOURCE CODE ^

0001 function [x,cf,xi,il,ih]=v_filtbankm(p,n,fs,fl,fh,w)
0002 %V_FILTBANKM determine matrix for a linear/mel/erb/bark-spaced v_filterbank [X,IL,IH]=(P,N,FS,FL,FH,W)
0003 %
0004 % Usage:
0005 % (1) Calcuate the Mel-frequency Cepstral Coefficients
0006 %
0007 %        f=v_rfft(s);                                    % v_rfft() returns only 1+floor(n/2) coefficients
0008 %         x=v_filtbankm(p,n,fs,0,fs/2,'m');              % n is the fft length, p is the number of filters wanted
0009 %         z=log(x*abs(f).^2);                            % multiply the power spectrum by x to get log mel-spectrum
0010 %         c=dct(z);                                      % take the DCT to get the mel-cepstrum
0011 %
0012 % (2) Calcuate the Mel-frequency Cepstral Coefficients efficiently
0013 %
0014 %        f=fft(s);                                      % n is the fft length, p is the number of filters wanted
0015 %        [x,cf,na,nb]=v_filtbankm(p,n,fs,0,fs/2,'m');   % na:nb gives the fft bins that are needed
0016 %        z=log(x*(f(na:nb)).*conj(f(na:nb)));           % multiply x by the power spectrum
0017 %         c=dct(z);                                      % take the DCT
0018 %
0019 % (3) Plot the calculated filterbanks as a graph or spectrogram
0020 %
0021 %        v_filtbankm(p,n,fs,0,fs/2,'mg');               % use option 'mg' for a graph or 'mG' for a spectrogram
0022 %
0023 % (4) Convert to mel-spectrum and back again
0024 %
0025 %        [x,cf,xi]=v_filtbankm(p,n,fs,0,fs/2,'mxXzq');  % n is the fft length, p is the number of filters wanted
0026 %        f=v_rfft(s);                                    % v_rfft() returns only 1+floor(n/2) coefficients
0027 %         z=x*abs(f).^2;                                 % multiply the power spectrum by x to get mel-spectrum
0028 %        gp=xi*z;                                       % multiply by xi to recover the approximate power spectrum
0029 %         g=v_irfft(sqrt(gp).*exp(1i*angle(f)));         % take the inverse DFT using the original phase to recover the time domain signal
0030 %
0031 % Inputs:
0032 %       p   number of filters in v_filterbank or the filter spacing in k-mel/bark/erb (see 'p' and 'P' options) [ceil(4.6*log10(fs))]
0033 %        n   length of dft
0034 %        fs  sample rate in Hz
0035 %        fl  low end of the lowest filter in Hz (or in mel/erb/bark/log10 with 'h' option) [default = 0Hz or, if 'l' option given, 30Hz]
0036 %        fh  high end of highest filter in Hz (or in mel/erb/bark/log10 with 'h' option) [default = fs/2]
0037 %        w   any sensible combination of the following:
0038 %
0039 %             'b' = bark scale
0040 %             'e' = erb-rate scale
0041 %             'l' = log10 Hz frequency scale
0042 %             'f' = linear frequency scale [default]
0043 %             'm' = mel frequency scale
0044 %
0045 %             'n' = round to the nearest FFT bin so each row of x contains only one non-zero entry
0046 %
0047 %             'c' = fl specifies centre of low filters instead of low edge
0048 %             'C' = fh specifies centre of high filter instead of high edge
0049 %             'h' = fl & fh are in mel/erb/bark/log10 instead of Hz
0050 %             'H' = give cf outputs in mel/erb/bark/log10 instead of Hz
0051 %
0052 %              'x' = lowest filter remains at 1 down to 0 frequency
0053 %             'X' = highest filter remains at 1 up to nyquist freqency
0054 %
0055 %             'p' = input p specifies the number of filters [default if p>=1]
0056 %             'P' = input p specifies the approximate filter spacing in kHz/kmel/... [default if p<1]
0057 %
0058 %             'z' = Treat input power spectrum at 0Hz as an impulse rather than being diffuse
0059 %             'Z' = Treat input power spectrum at 0Hz as the sum of an impulse and a continuous component with the same amlitude as the adjacent bin
0060 %             'q' = The first output filter gives the power of the impulse at 0Hz (regardless of the 'D' option). 'zq' ensures exact retention of DC component by zi*z
0061 %
0062 %             'd' = input is power spectral density (power per Hz) instead of power
0063 %             'D' = output is power spectral density (power per Hz) instead of power
0064 %
0065 %             's' = single-sided input: do not add power from symmetric negative frequencies (i.e. non-DC/Nyquist inputs have already been doubled)
0066 %             'S' = single-sided output: include power from both positive and negative frequencies (this doubles the non-DC/Nyquist outputs)
0067 %             'w' = size(x,2)=size(xi,1)=n rather than 1+floor(n/2) although the rightmost half of x is all zeros
0068 %
0069 %             'g' = plot filter coefficients as graph
0070 %             'G' = plot filter coefficients as spectrogram image [default if no output arguments present]
0071 %
0072 %           Legacy options: 'y'='xX', 'Y'='x', 'yY'='X', 'u'='dD', 'U'='D'
0073 %
0074 % Outputs:    x(p,k)  a sparse matrix containing the v_filterbank amplitudes
0075 %                    If the il and ih output arguments are included then k=ih-il+1 otherwise k=1+floor(n/2)
0076 %                   Note that, with the 'S' option, the peak filter values equal 2 to account for the energy in the negative FFT frequencies.
0077 %           cf(p)   the v_filterbank centre frequencies in Hz (or in mel/erb/bark/log10 with 'H' option)
0078 %           xi(k,p) [optional] sparse matrix that is an approximate inverse of x
0079 %            il      the lowest fft bin with a non-zero coefficient
0080 %            ih      the highest fft bin with a non-zero coefficient
0081 %
0082 % The input power will be preserved if the options 'xXS' are given
0083 %
0084 % The output of the routine is a sparse filterbank matrix. The vector output of the filterbank can then be obtained
0085 % by pre-multiplying an input power spectrum vector by the filterbank matrix. The input and output vectors can optionally
0086 % be in either the power domain or the power spectral density domain.
0087 % The routine implements the filterbank in two conceptual stages (which are merged in the practical implementation):
0088 %
0089 % Stage 1:
0090 % The discrete input spectrum is converted to a continuous power spectral density using linear interpolation in frequency.
0091 % Each element of the input spectrum influences a frequency interval of width 2d where d is the input frequency bin width.
0092 % The DC component of the input is treated specially in one of three ways: (a) it can be treated as a normal element that
0093 % influences an interval (-d,+d) like the other elements [default]; (b) it can be treated as an impulse at DC ['z' option];
0094 % (c) it can be treated as a mixture of an impulse and a normal component whose value equals that of the adjacent frequency
0095 % bin ['Z' option].
0096 %
0097 % Stage 2:
0098 % The filterbank outputs are calculated by integrating the product of the continuous spectrum and a filter weight that is
0099 % triangular in the frequency domain. Optionally, the first filterbank preserves the DC impulse component of the continuous
0100 % spectrum ['q' option].
0101 %
0102 % References:
0103 %
0104 % [1] S. S. Stevens, J. Volkman, and E. B. Newman. A scale for the measurement
0105 %     of the psychological magnitude of pitch. J. Acoust Soc Amer, 8: 185-190, 1937.
0106 % [2] S. Davis and P. Mermelstein. Comparison of parametric representations for
0107 %     monosyllabic word recognition in continuously spoken sentences.
0108 %     IEEE Trans Acoustics Speech and Signal Processing, 28 (4): 357-366, Aug. 1980.
0109 
0110 % Bugs/Suggestions
0111 % (1) default frequencies won't work if the h option is specified
0112 % (2) low default frequency is invalid if the 'l' option is specified
0113 % (3) Add 'z' option to include a DC output as the first coefficient
0114 % (4) Add 'Z' option to ignore the DC input component
0115 % (5) Add 'i' option to calculate the inverse of x instead
0116 % (6) Add option to choose the domain in which linear interpolation is performed
0117 
0118 %      Copyright (C) Mike Brookes 1997-2024
0119 %      Version: $Id: v_filtbankm.m $
0120 %
0121 %   VOICEBOX is a MATLAB toolbox for speech processing.
0122 %   Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
0123 %
0124 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0125 %   This program is free software; you can redistribute it and/or modify
0126 %   it under the terms of the GNU General Public License as published by
0127 %   the Free Software Foundation; either version 2 of the License, or
0128 %   (at your option) any later version.
0129 %
0130 %   This program is distributed in the hope that it will be useful,
0131 %   but WITHOUT ANY WARRANTY; without even the implied warranty of
0132 %   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
0133 %   GNU General Public License for more details.
0134 %
0135 %   You can obtain a copy of the GNU General Public License from
0136 %   http://www.gnu.org/copyleft/gpl.html or by writing to
0137 %   Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
0138 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0139 
0140 % Notes:
0141 % (1) In the comments, "FFT bin_0" assumes DC = bin 0 whereas "FFT bin_1" means DC = bin 1nfout
0142 % (2) "input" and "output" need to be interchanged if the 'i' option is given
0143 
0144 if nargin<6 || isempty(w)               % if no mode option, w, is specified
0145     w='f';                              % default mode option: 'f' = linear output frequency scale
0146 end
0147 wr=max(any(repmat('lebm',length(w),1)==repmat(w',1,4),1).*(1:4));           % output warping: 0=linear,1=log,2=erbrate,3=bark,4=mel
0148 ww=any(repmat('ncChHxXyYpPzZqdDuUsSgGw',length(w),1)==repmat(w',1,23),1);    % decode all other options
0149 % ww elements: 1=n,2=c,3=C,4=h,5=H,6=x,7=X,8=y,9=Y,10=p,11=P,12=z,13=Z,14=q,15=d,16=D,17=u,18=U,19=s,20=S,21=g,22=G,23=w
0150 % Convert legacy option codes: 'y'='xX', 'Y'='x', 'yY'='X', 'u'='dD', 'U'='D'
0151 ww(6)=ww(6) || (ww(8) ~= ww(9));        % convert 'y' or 'Y' (but not both) to 'x'; extend low frequencies
0152 ww(7)=ww(7) || ww(8);                   % convert 'y' to 'X'; extend high frequencies
0153 ww(15)=ww(15) || ww(17);                % convert 'u' to 'd'; input is also in power spectral density
0154 ww(16)=ww(16) || ww(17) || ww(18);      % convert 'u' or 'U' to 'd'; output is in power spectral density
0155 flhconv=repmat(wr>0 && ~ww(4),1,2);     % flag indicating need to convert filterbank limits from Hz to mel/erb/bark/log10
0156 if nargin < 4 || isempty(fl)
0157     fl=30*(wr==1);                      % lower limit is 0 Hz unless 'l' option specified, in which case it is 30 Hz
0158     flhconv(1)=wr>0;
0159 end
0160 if nargin < 5 || isempty(fh)
0161     fh=0.5*fs;                          % max freq is the nyquist frequency
0162     flhconv(2)=wr>0;
0163 end
0164 %
0165 % Sort out input frequency bins
0166 %
0167 if numel(n)>1
0168     error('non-standard input frequency spacing no longer supported');
0169 else                                % n gives dft length
0170     nf=1+floor(n/2);                % number of input positive-frequency bins from DFT
0171     df=fs/n;                        % input frequency bin spacing
0172 end
0173 %
0174 % Sort out output frequency bins
0175 %
0176 mflh=[fl fh];                       % low and high limits of filterbank triangular filters
0177 if any(flhconv)                     % convert mflh from Hz to mel/erb/... unless already converted via 'h' option
0178     switch wr
0179         case 1                      % 'l' = log scaled
0180             if fl<=0
0181                 error('Low frequency limit must be >0 for ''l'' log10-frequency option');
0182             end
0183             mflh(flhconv)=log10(mflh(flhconv));         % convert frequency limits into log10 Hz
0184         case 2                                          % 'e' = erb-rate scaled
0185             mflh(flhconv)=v_frq2erb(mflh(flhconv));     % convert frequency limits into erb-rate
0186         case 3                                          % 'b' = bark scaled
0187             mflh(flhconv)=v_frq2bark(mflh(flhconv));    % convert frequency limits into bark
0188         case 4                                          % 'm' = mel scaled
0189             mflh(flhconv)=v_frq2mel(mflh(flhconv));     % convert frequency limits into mel
0190     end
0191 end
0192 melrng=mflh(2)-mflh(1);                                 % filterbank range in Hz/mel/erb/...
0193 if isempty(p)
0194     p=ceil(4.6*log10(2*(nf-1)*df));                     % default number of output filters
0195 end
0196 puc=ww(11) || (p<1) && ~ww(10);                         % input p specifies the filter spacing rather than the number of filters
0197 if puc
0198     p=round(melrng/(p*1000))+ww(2)+ww(3)-1+ww(14);      % p now gives the number of filters (excluding DC impulse)
0199 end
0200 melinc=melrng/(p+ww(2)+ww(3)+1-ww(14));                 % inter-filter increment in mel
0201 mflh=mflh+[-ww(2) ww(3)]*melinc;                        % update mflh to include the full width of all filters
0202 %
0203 % Calculate the output centre frequencies in Hz including dummy end points
0204 %
0205 pmq=p-ww(14);                                       % number of filters excluding the one for the DC impulse
0206 cf=mflh(1)+(0:pmq+1)*melinc;                        % centre frequencies in mel/erb/... including dummy ends
0207 cf(2:end)=max(cf(2:end),0);                         % only the first point can be negative [*** doesn't make sense for log scale ***]
0208 switch wr                                           % convert centre frequencies to Hz from mel/erb/...
0209     case 1                                          % 'l' = log scaled
0210         mb=10.^(cf);
0211     case 2                                          % 'e' = erb-rate scaled
0212         mb=v_erb2frq(cf);
0213     case 3                                          % 'b' = bark scaled
0214         mb=v_bark2frq(cf);
0215     case 4                                          % 'm' = mel scaled
0216         mb=v_mel2frq(cf);
0217     otherwise                                       % [default] = linear scaled; no conversion needed
0218         mb=cf;
0219 end
0220 %
0221 % sort out 2-sided input frequencies
0222 %
0223 fin=(-nf:nf)*df;                                    % reflect negative frequencies excluding DC
0224 nfin=length(fin);                                   % length of extended input frequency list         [nfin=2*nf+1]
0225 %
0226 % now sort out the list of output frequencies
0227 %
0228 fout=mb;                                            % output centre frequencies in Hz including dummy values at each end
0229 highex=ww(7) && (fout(end-1)<fin(end));             % extend at high end if 'X' specified and final centre frequency < Nyquist
0230 if ww(6)                                            % ww(6)='x': extend first filter at low end to DC
0231     fout=[0 0 fout(2:end)];                         % ... add two dummy values at DC instead of previous single dummy value
0232 end
0233 if highex                                           % extend last filter at high end to Nyquist
0234     fout=[fout(1:end-1) fin(end) fin(end)];         % ... add two dummy values at Nyquist instead of previous single dummy value
0235 end
0236 fout=min(fout,fs/2);                                % limit output filters to Nyquist frequency
0237 nfout=length(fout);                                 % number of output filters including one or two dummy points at each end
0238 foutin=[fout fin];
0239 nfall=length(foutin);                               % = nfout + nfin
0240 wleft=[0 fout(2:nfout)-fout(1:nfout-1) 0 fin(2:nfin)-fin(1:nfin-1)]; % width of lower triangle attached to each node
0241 wright=[wleft(2:end) 0];                            % width of upper triangle attached to each node
0242 ffact=[0 ones(1,nfout-2) 0 0 ones(1,2*nf-1) 0];     % valid triangle posts
0243 ffact(wleft+wright==0)=0;                           % disable null width triangles (*** probably unnecessary if all frequencies are distinct ***)
0244 [fall,ifall]=sort(foutin);                          % fall is sorted frequencies with fall=foutin(ifall)
0245 jfall=zeros(1,nfall);                               % create inverse index ...
0246 infall=1:nfall;                                     % ...
0247 jfall(ifall)=infall;                                % ... inverse-index satisfying foutin=fall(jfall)
0248 ffact(ifall([1:max(jfall(1),jfall(nfout+1))-2 min(jfall(nfout),jfall(nfall))+2:nfall]))=0;  % zap input nodes that lie outside the output filters
0249 nxto=cumsum(ifall<=nfout);                          % next output node to the right (or equal) to each node
0250 nxti=cumsum(ifall>nfout);                           % number of input nodes to the left (or equal) to each node
0251 nxtr=min(nxti+1+nfout,nfall);                       % next input node to the right of each value (or nfall if none)
0252 nxtr(ifall>nfout)=1+nxto(ifall>nfout);              % next post to the right of opposite input/output type (using sorted indexes)
0253 nxtr=nxtr(jfall);                                   % next post to the right of opposite input/output type (converted to unsorted indices) or if none: nfall or (nfout+1)
0254 %
0255 % The interpolated spectrum at any frequency can be expressed as the sum of the values at the adjacent input bins
0256 % multiplied by triangular weights that decreases from 1 to 0 between the two bins. The value at an output bin
0257 % is equal to the integral of the interpolated spectrum multiplied by a triangular weight that decreases from
0258 % 1 to 0 either side of the output bin. Thus, if all input/output bins are sorted into ascending order, the
0259 % interval between two adjacent bins contains four partial triangles (a.k.a. trapeziums): two "lower" triangles
0260 % that increase with frequency and two "upper" triangles that decrease with frequency. We need to integrate the
0261 % resultant four input-output trapezium products and add the integrals onto the sum for the appropriate output bins.
0262 % Each triangle has a "post" at one end and is zero at the other end; we enumerate the triangle pairs by pairing
0263 % all input and output triangles with the first available triangle of the other type (i.e. output or input) whose
0264 % rightmost node is to the right of the entire first triangle.
0265 %
0266 % The general result for integrating the product of two trapesiums with
0267 % heights (a,b) and (c,d) over a width x is (ad+bc+2bd+2ac)*x/6
0268 %
0269 % integrate product of lower triangles whose posts (and rightmost nodes) are ix1 and jx1
0270 %
0271 msk0=(ffact>0);                                     % posts with a non-zero magnitude
0272 msk=msk0 & (ffact(nxtr)>0);                         % select triangle pairs with both posts having non-zero magnitudes
0273 ix1=infall(msk);                                    % unsorted indices of leftmost post of pair
0274 jx1=nxtr(msk);                                      % unsorted indices of rightmost post of pair
0275 vfgx=foutin(ix1)-foutin(jx1-1);                     % portion of triangle attached to rightmost post that lies to the left of the leftmost post
0276 yx=min(wleft(ix1),vfgx);                            % integration length. Maybe more efficient: dfall=diff(fall); yx=dfall(jfall(ix1)-1)
0277 wx1=ffact(ix1).*ffact(jx1).*yx.*(wleft(ix1).*vfgx-yx.*(0.5*(wleft(ix1)+vfgx)-yx/3))./(wleft(ix1).*wleft(jx1)+(yx==0));
0278 
0279 % integrate product of upper triangles whose posts are ix2 and jx2 and whose rightmost nodes are ix2+1 and jx2+1
0280 
0281 nxtu=max([nxtr(2:end)-1 0],1);                      % post of the upper triangle of opposite type whose rightmost end is to the right of this triangle's rightmost end
0282 msk=msk0 & (ffact(nxtu)>0);                         % select triangle pairs with both posts having non-zero magnitudes
0283 ix2=infall(msk);                                    % unsorted indices of leftmost post of pair
0284 jx2=nxtu(msk);                                      % unsorted indices of rightmost post of pair
0285 vfgx=foutin(ix2+1)-foutin(jx2);                     % length of left triangle to the right of the right post
0286 yx=min(wright(ix2),vfgx);                           % integration length
0287 yx(foutin(jx2+1)<foutin(ix2+1))=0;                  % zap invalid triangles where the rightmost ends are in the wrong order
0288 wx2=ffact(ix2).*ffact(jx2).*yx.^2.*((0.5*(wright(jx2)-vfgx)+yx/3))./(wright(ix2).*wright(jx2)+(yx==0));
0289 
0290 % integrate lower triangle and upper triangle that ends to its right
0291 
0292 nxtu=max(nxtr-1,1);                                 % post of the upper triangle of opposite type whose rightmost end is to the right of this triangle's post
0293 msk=msk0 & (ffact(nxtu)>0);                         % select triangle pairs with both posts having non-zero magnitudes
0294 ix3=infall(msk);                                    % unsorted indices of lower triangle
0295 jx3=nxtu(msk);                                      % unsorted indices of upper triangle
0296 vfgx=foutin(ix3)-foutin(jx3);                       % length of upper triangle to the left of the lower post
0297 yx=min(wleft(ix3),vfgx);                            % integration length
0298 yx(foutin(jx3+1)<foutin(ix3))=0;                    % zap invalid triangles where the rightmost ends are in the wrong order
0299 wx3=ffact(ix3).*ffact(jx3).*yx.*(wleft(ix3).*(wright(jx3)-vfgx)+yx.*(0.5*(wleft(ix3)-wright(jx3)+vfgx)-yx/3))./(wleft(ix3).*wright(jx3)+(yx==0));
0300 
0301 % integrate upper triangle and lower triangle that starts to its right
0302 
0303 nxtu=[nxtr(2:end) 1];
0304 msk=msk0 & (ffact(nxtu)>0);                         % select triangle pairs with both posts having non-zero magnitudes
0305 ix4=infall(msk);                                    % unsorted indices of upper triangle
0306 jx4=nxtu(msk);                                      % unsorted indices of lower triangle
0307 vfgx=foutin(ix4+1)-foutin(jx4-1);                   % length of upper triangle to the left of the lower post
0308 yx=min(wright(ix4),vfgx);                           % integration length
0309 wx4=ffact(ix4).*ffact(jx4).*yx.^2.*(0.5*vfgx-yx/3)./(wright(ix4).*wleft(jx4)+(yx==0));
0310 %
0311 % now assemble the matrix
0312 %
0313 iox=sort([ix1 ix2 ix3 ix4;jx1 jx2 jx3 jx4]);        % iox(1,:) are output posts, iox(2,:) are input posts
0314 msk=iox(2,:)<=(nfall+nfout)/2;                      % find references to negative input frequencies
0315 iox(2,msk)=(nfall+nfout+1)-iox(2,msk);              % convert negative frequencies to positive
0316 %
0317 % Sort out output gains:
0318 % If output is power then output gain is 1; if output is power/Hz then output gain is 1/area of output filter
0319 %
0320 if ww(6)                                            % ww(6)='x': if lowest filter extended to DC, we added a dummy point at 0Hz, so
0321     iox(1,iox(1,:)==2)=3;                           % merge lowest two output nodes
0322 end
0323 if highex                                           % if highest filter extended, we added a dummy point at Nyquist, so
0324     iox(1,iox(1,:)==nfout-1)=nfout-2;               % merge highest two output nodes
0325 end
0326 x=sparse(iox(1,:)-1-ww(6),max(iox(2,:)-nfout-nf,1),[wx1 wx2 wx3 wx4],pmq,nf);   % forward transformation matrix without input/output gains
0327 gout=full(sum(x,2));                                % area of each output integral
0328 goutd=sparse(1:pmq,1:pmq,(gout+(gout==0)).^(-1));   % create sparse diagonal matrix of output gains
0329 gouti=full(sum(x(:,1+ww(12):end),2));                                % area of each output integral excluding DC if 'z' option given
0330     goutid=sparse(1:pmq,1:pmq,(gouti+(gouti==0)).^(-1));   % create sparse diagonal matrix of output gains
0331 %
0332 % Sort out input gains:
0333 % If input is power then input gain is 1/area; if input is power/Hz then input gain is 1
0334 %
0335 gin=fin(3:nfin)-fin(1:nfin-2);                              % full width of input interpolation filters
0336 gin=2*(gin+(gin==0)).^(-1);                                 % input gain equals 1/area
0337 ginsi=repmat(1+ww(19),1,nf-2);                        % 's' option means all inputs except DC and Nyquist have been doubled
0338 ginsd=sparse(1:nf,1:nf,[1-ww(12) ginsi.^(-1) 1]);                        % ... cancel this out with additional input scaling for forward transform
0339 ginsid=sparse(1:nf,1:nf,[2*(1-ww(12)) ginsi 2]);     % and back again for inverse transform
0340 gind=sparse(1:nf,1:nf,gin(end-nf+1:end));                   % input gains
0341 %
0342 % Now create the x and xi matrices
0343 %
0344 switch 2*ww(16)+ww(15)
0345     case 0                          % '': input and output are both power
0346         xi=ginsid*x'*goutid;
0347         x=x*(gind*ginsd);
0348     case 1                          % 'd': input is power/Hz, output is power
0349         xi=(ginsid*gind)*x'*goutid;
0350         x=x*ginsd;
0351     case 2                          % 'D': input is power, output is power/Hz
0352         xi=ginsid*x';
0353         x=goutd*x*(gind*ginsd);
0354     case 3                              % 'dD': input and output are both power/Hz
0355         xi=(ginsid*gind)*x';
0356         x=goutd*x*ginsd;
0357 end
0358 if ww(20)                               % 'S': double outputs to include negative frequency energy
0359     x=2*x;
0360     xi=0.5*xi;
0361 end
0362 if ww(13)                               % 'Z': DC input is an impulse plus a diffuse component
0363     x(:,2)=x(:,2)+x(:,1)*ginsd(2,2);    % power of diffuse component at DC is equal to that opf adjacent bin corrected for 's' option
0364     x(:,1)=0;                           % Eliminate references to DC input in forward transform only
0365 end
0366 if ww(14)                               % 'q': we need an extra output that replicates the DC component
0367     if ww(12)                           % 'z': DC input is an impulse
0368         x=[sparse(1,1,1,1,nf); x];
0369         xi=[sparse(1,1,1,nf,1) xi];
0370     elseif ww(13)                       % 'Z': DC input is an impulse plus a diffuse component
0371         x=[sparse([1 1],[1 2],[1 -ginsd(2,2)],1,nf); x]; % impulse component is DC input minus adjacent bin corrected for 's' option
0372         xi=[sparse(1,1,1,nf,1) xi];
0373     else
0374         x=[sparse(1,nf); x];            % '': DC input is diffuse as normal
0375         xi=[sparse(nf,1) xi];
0376     end
0377 end
0378 %
0379 % sort out the output argument options
0380 %
0381 if ~ww(5)                                   % output cf in Hz instead of mel/erb/...
0382     cf=[zeros(1,ww(14)) mb(2:pmq+1)];       % ... and include an initial 0 if 'q' option (ww(14)==1)
0383 else                                        % keep cf in mel/erb/...
0384     if ww(14)                               % 'q' (ww(14)==1): we need an extra output for the DC component
0385         if wr==1                            % log-scaled so ...
0386             cf=[-Inf cf(2:p)];              % ... DC corresponds to -Inf
0387         else                                % not log-scaled ...
0388             cf=[0 cf(2:p)];                 % ... DC corresponds to 0
0389         end
0390     else                                    % no 'q' option (ww(14)==0) ...
0391         cf=cf(2:p+1);                       % ... just remove dummy end frequencies
0392     end 
0393 end
0394 if ww(1)                                    % round outputs to the centre of gravity bin
0395     sx2=sum(x,2);                           % sum of each row
0396     msk=full(sx2~=0);
0397     vxc=zeros(pmq,1);
0398     vxc(msk)=round((x(msk,:)*(1:nf)')./sx2(msk));   % find centre of gravity of each row
0399     x=sparse(1:pmq,vxc,sx2,pmq,nf);             % put all the weight into the centre of gravity bin
0400 end
0401 il=1; % default range is entire x maxtrix
0402 ih=nf;
0403 if nargout > 3                                  % if il and/or ih output arguments are specified ...
0404     if nargout==4                               % xi has been omitted ...
0405     msk=full(any(x>0,1));                       % find non-zero columns in x
0406     else                                        % xi output included
0407         msk=full(any(x>0,1) | any(xi>0,2)');    % find non-zero columns in x or rows in xi
0408     end
0409     il=find(msk,1);                             % il is first non-zero column
0410     if ~numel(il)                               % if x is all zeros ...
0411         il=1;                                   % ... set il and ih to 1
0412         ih=1;                              
0413     elseif nargout >3
0414         ih=find(msk,1,'last');                  % ih is last non-zero column
0415     end
0416     x=x(:,il:ih);                               % remove redundant columns from x
0417     if nargout==4                               % xi has been omitted ...
0418         xi=il;                                  % shift the il and ih outputs up by one position
0419         il=ih;
0420     else
0421         xi=xi(il:ih,:);                         % remove redundant rows from xi
0422     end
0423 elseif ww(23)                                   % ww(23)='w': use whole dft
0424     x=[x sparse(p,n-nf)];                       % append zeros onto x
0425     xi=[xi; xi(n-nf+1:-1:2,:)];                 % reflect elements other than the DC and Nyquist
0426 end
0427 %
0428 % plot results if no output arguments or 'g','G' options given
0429 %
0430 if ~nargout || ww(21) || ww(22)                 % plot idealized filters
0431     ww(22)=~ww(21);                             % 'G' option is the default unless 'g' is specified
0432     finax=(il-1:ih-1)*df;                       % input frequency axis
0433     newfig=0;
0434     if  ww(21)
0435         plot(finax,x(:,il:ih)'); 
0436         hold on  
0437         plot(finax,sum(x,1),'--k');
0438         v_axisenlarge([-1 -1.05]);
0439         plot(repmat(mb(2:end-1),2,1),get(gca,'ylim'),':k');
0440         hold off
0441         title(['filtbankm: mode = ''' w '''']);
0442         xlabel(['Frequency (' v_xticksi 'Hz)']);
0443         ylabel('Weight');
0444         newfig=1;
0445     end
0446     if  ww(22)
0447         if newfig
0448             figure;
0449         end
0450         imagesc(finax,1:pmq,x);
0451         axis 'xy'
0452         colorbar;
0453         hold on
0454         ylim=get(gca,'ylim');
0455         plot(repmat(mb(2:end-1),2,1),ylim,':w');
0456         hold off
0457         v_cblabel('Weight');
0458         switch wr
0459             case 1
0460                 type='Log-spaced';
0461             case 2
0462                 type='Erb-spaced';
0463             case 3
0464                 type='Bark-spaced';
0465             case 4
0466                 type='Mel-spaced';
0467             otherwise
0468                 type='Linear-spaced';
0469         end
0470         ylabel([type ' Filter']);
0471         xlabel(['Frequency (' v_xticksi 'Hz)']);
0472         title(['filtbankm: mode = ''' w '''']);
0473     end
0474 
0475 end

Generated by m2html © 2003