v_filtbankm

PURPOSE ^

V_FILTBANKM determine matrix for a linear/mel/erb/bark-spaced v_filterbank [X,IL,IH]=(P,N,FS,FL,FH,W)

SYNOPSIS ^

function [x,cf,xi,il,ih]=v_filtbankm(p,n,fs,fl,fh,w)

DESCRIPTION ^

V_FILTBANKM determine matrix for a linear/mel/erb/bark-spaced v_filterbank [X,IL,IH]=(P,N,FS,FL,FH,W)

 Usage:
 (1) Calcuate the Mel-frequency Cepstral Coefficients

        f=v_rfft(s);                                    % v_rfft() returns only 1+floor(n/2) coefficients
         x=v_filtbankm(p,n,fs,0,fs/2,'m');              % n is the fft length, p is the number of filters wanted
         z=log(x*abs(f).^2);                            % multiply x by the power spectrum
         c=dct(z);                                      % take the DCT

 (2) Calcuate the Mel-frequency Cepstral Coefficients efficiently

        f=fft(s);                                      % n is the fft length, p is the number of filters wanted
        [x,cf,na,nb]=v_filtbankm(p,n,fs,0,fs/2,'m');   % na:nb gives the fft bins that are needed
        z=log(x*(f(na:nb)).*conj(f(na:nb)));           % multiply x by the power spectrum
         c=dct(z);                                      % take the DCT

 (3) Plot the calculated filterbanks as a graph

        plot((0:floor(n/2))*fs/n,v_filtbankm(p,n,fs,0,fs/2,'m')')   % fs=sample frequency

 (4) Plot the calculated filterbanks as a spectrogram

        v_filtbankm(p,n,fs,0,fs/2,'m');

 Inputs:
       p   number of filters in v_filterbank or the filter spacing in k-mel/bark/erb (see 'p' and 'P' options) [ceil(4.6*log10(fs))]
        n   length of dft
        fs  sample rate in Hz
        fl  low end of the lowest filter in Hz (or in mel/erb/bark/log10 with 'h' option) [default = 0Hz or, if 'l' option given, 30Hz]
        fh  high end of highest filter in Hz (or in mel/erb/bark/log10 with 'h' option) [default = fs/2]
        w   any sensible combination of the following:

             'b' = bark scale
             'e' = erb-rate scale
             'l' = log10 Hz frequency scale
             'f' = linear frequency scale [default]
             'm' = mel frequency scale

             'n' = round to the nearest FFT bin so each row of x contains only one non-zero entry

             'c' = fl specifies centre of low filters instead of low edge
             'C' = fh specifies centre of high filter instead of high edge
             'h' = fl & fh are in mel/erb/bark/log10 instead of Hz
             'H' = give cf outputs in mel/erb/bark/log10 instead of Hz

              'x' = lowest filter remains at 1 down to 0 frequency
             'X' = highest filter remains at 1 up to nyquist freqency

             'p' = input p specifies the number of filters [default if p>=1]
             'P' = input p specifies the approximate filter spacing in kHz/kmel/... [default if p<1]

             'z' = Treat input power spectrum at 0Hz as an impulse rather than being diffuse
             'Z' = Treat input power spectrum at 0Hz as the sum of an impulse and a continuous component with the same amlitude as the adjacent bin
             'q' = The first output filter gives the power of the impulse at 0Hz (regardless of the 'D' option). 'zq' ensures exact retention of DC component by zi*z

             'd' = input is power spectral density (power per Hz) instead of power
             'D' = output is power spectral density (power per Hz) instead of power

             's' = single-sided input: do not add power from symmetric negative frequencies (i.e. non-DC/Nyquist inputs have already been doubled)
             'S' = single-sided output: include power from both positive and negative frequencies (this doubles the non-DC/Nyquist outputs)
            ['w' = size(x,2)=size(xi,1)=n rather than 1+floor(n/2) although the rightmost half of x is all zeros] not yet implemented

             'g' = plot filter coefficients as graph
             'G' = plot filter coefficients as spectrogram image [default if no output arguments present]

           Legacy options: 'y'='xX', 'Y'='x', 'yY'='X', 'u'='dD', 'U'='D'

 Outputs:    x(p,k)  a sparse matrix containing the v_filterbank amplitudes
                    If the il and ih output arguments are included then k=ih-il+1 otherwise k=1+floor(n/2)
                   Note that, with the 'S' option, the peak filter values equal 2 to account for the energy in the negative FFT frequencies.
           cf(p)   the v_filterbank centre frequencies in Hz (or in mel/erb/bark/log10 with 'H' option)
           xi(k,p) [optional] sparse matrix that is an approximate inverse of x
            il      the lowest fft bin with a non-zero coefficient
            ih      the highest fft bin with a non-zero coefficient

 The input power will be preserved if the options 'xXS' are given

 The output of the routine is a sparse filterbank matrix. The vector output of the filterbank can then be obtained
 by pre-multiplying an input power spectrum vector by the filterbank matrix. The input and output vectors can optionally
 be in either the power domain or the power spectral density domain.
 The routine implements the filterbank in two conceptual stages (which are merged in the practical implementation):

 Stage 1:
 The discrete input spectrum is converted to a continuous power spectral density using linear interpolation in frequency.
 Each element of the input spectrum influences a frequency interval of width 2d where d is the input frequency bin width.
 The DC component of the input is treated specially in one of three ways: (a) it can be treated as a normal element that
 influences an interval (-d,+d) like the other elements [default]; (b) it can be treated as an impulse at DC ['z' option];
 (c) it can be treated as a mixture of an impulse and a normal component whose value equals that of the adjacent frequency
 bin ['Z' option].

 Stage 2:
 The filterbank outputs are calculated by integrating the product of the continuous spectrum and a filter weight that is
 triangular in the frequency domain. Optionally, the first filterbank preserves the DC impulse component of the continuous
 spectrum ['q' option].

 References:

 [1] S. S. Stevens, J. Volkman, and E. B. Newman. A scale for the measurement
     of the psychological magnitude of pitch. J. Acoust Soc Amer, 8: 185-190, 1937.
 [2] S. Davis and P. Mermelstein. Comparison of parametric representations for
     monosyllabic word recognition in continuously spoken sentences.
     IEEE Trans Acoustics Speech and Signal Processing, 28 (4): 357-366, Aug. 1980.

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SOURCE CODE ^

0001 function [x,cf,xi,il,ih]=v_filtbankm(p,n,fs,fl,fh,w)
0002 %V_FILTBANKM determine matrix for a linear/mel/erb/bark-spaced v_filterbank [X,IL,IH]=(P,N,FS,FL,FH,W)
0003 %
0004 % Usage:
0005 % (1) Calcuate the Mel-frequency Cepstral Coefficients
0006 %
0007 %        f=v_rfft(s);                                    % v_rfft() returns only 1+floor(n/2) coefficients
0008 %         x=v_filtbankm(p,n,fs,0,fs/2,'m');              % n is the fft length, p is the number of filters wanted
0009 %         z=log(x*abs(f).^2);                            % multiply x by the power spectrum
0010 %         c=dct(z);                                      % take the DCT
0011 %
0012 % (2) Calcuate the Mel-frequency Cepstral Coefficients efficiently
0013 %
0014 %        f=fft(s);                                      % n is the fft length, p is the number of filters wanted
0015 %        [x,cf,na,nb]=v_filtbankm(p,n,fs,0,fs/2,'m');   % na:nb gives the fft bins that are needed
0016 %        z=log(x*(f(na:nb)).*conj(f(na:nb)));           % multiply x by the power spectrum
0017 %         c=dct(z);                                      % take the DCT
0018 %
0019 % (3) Plot the calculated filterbanks as a graph
0020 %
0021 %        plot((0:floor(n/2))*fs/n,v_filtbankm(p,n,fs,0,fs/2,'m')')   % fs=sample frequency
0022 %
0023 % (4) Plot the calculated filterbanks as a spectrogram
0024 %
0025 %        v_filtbankm(p,n,fs,0,fs/2,'m');
0026 %
0027 % Inputs:
0028 %       p   number of filters in v_filterbank or the filter spacing in k-mel/bark/erb (see 'p' and 'P' options) [ceil(4.6*log10(fs))]
0029 %        n   length of dft
0030 %        fs  sample rate in Hz
0031 %        fl  low end of the lowest filter in Hz (or in mel/erb/bark/log10 with 'h' option) [default = 0Hz or, if 'l' option given, 30Hz]
0032 %        fh  high end of highest filter in Hz (or in mel/erb/bark/log10 with 'h' option) [default = fs/2]
0033 %        w   any sensible combination of the following:
0034 %
0035 %             'b' = bark scale
0036 %             'e' = erb-rate scale
0037 %             'l' = log10 Hz frequency scale
0038 %             'f' = linear frequency scale [default]
0039 %             'm' = mel frequency scale
0040 %
0041 %             'n' = round to the nearest FFT bin so each row of x contains only one non-zero entry
0042 %
0043 %             'c' = fl specifies centre of low filters instead of low edge
0044 %             'C' = fh specifies centre of high filter instead of high edge
0045 %             'h' = fl & fh are in mel/erb/bark/log10 instead of Hz
0046 %             'H' = give cf outputs in mel/erb/bark/log10 instead of Hz
0047 %
0048 %              'x' = lowest filter remains at 1 down to 0 frequency
0049 %             'X' = highest filter remains at 1 up to nyquist freqency
0050 %
0051 %             'p' = input p specifies the number of filters [default if p>=1]
0052 %             'P' = input p specifies the approximate filter spacing in kHz/kmel/... [default if p<1]
0053 %
0054 %             'z' = Treat input power spectrum at 0Hz as an impulse rather than being diffuse
0055 %             'Z' = Treat input power spectrum at 0Hz as the sum of an impulse and a continuous component with the same amlitude as the adjacent bin
0056 %             'q' = The first output filter gives the power of the impulse at 0Hz (regardless of the 'D' option). 'zq' ensures exact retention of DC component by zi*z
0057 %
0058 %             'd' = input is power spectral density (power per Hz) instead of power
0059 %             'D' = output is power spectral density (power per Hz) instead of power
0060 %
0061 %             's' = single-sided input: do not add power from symmetric negative frequencies (i.e. non-DC/Nyquist inputs have already been doubled)
0062 %             'S' = single-sided output: include power from both positive and negative frequencies (this doubles the non-DC/Nyquist outputs)
0063 %            ['w' = size(x,2)=size(xi,1)=n rather than 1+floor(n/2) although the rightmost half of x is all zeros] not yet implemented
0064 %
0065 %             'g' = plot filter coefficients as graph
0066 %             'G' = plot filter coefficients as spectrogram image [default if no output arguments present]
0067 %
0068 %           Legacy options: 'y'='xX', 'Y'='x', 'yY'='X', 'u'='dD', 'U'='D'
0069 %
0070 % Outputs:    x(p,k)  a sparse matrix containing the v_filterbank amplitudes
0071 %                    If the il and ih output arguments are included then k=ih-il+1 otherwise k=1+floor(n/2)
0072 %                   Note that, with the 'S' option, the peak filter values equal 2 to account for the energy in the negative FFT frequencies.
0073 %           cf(p)   the v_filterbank centre frequencies in Hz (or in mel/erb/bark/log10 with 'H' option)
0074 %           xi(k,p) [optional] sparse matrix that is an approximate inverse of x
0075 %            il      the lowest fft bin with a non-zero coefficient
0076 %            ih      the highest fft bin with a non-zero coefficient
0077 %
0078 % The input power will be preserved if the options 'xXS' are given
0079 %
0080 % The output of the routine is a sparse filterbank matrix. The vector output of the filterbank can then be obtained
0081 % by pre-multiplying an input power spectrum vector by the filterbank matrix. The input and output vectors can optionally
0082 % be in either the power domain or the power spectral density domain.
0083 % The routine implements the filterbank in two conceptual stages (which are merged in the practical implementation):
0084 %
0085 % Stage 1:
0086 % The discrete input spectrum is converted to a continuous power spectral density using linear interpolation in frequency.
0087 % Each element of the input spectrum influences a frequency interval of width 2d where d is the input frequency bin width.
0088 % The DC component of the input is treated specially in one of three ways: (a) it can be treated as a normal element that
0089 % influences an interval (-d,+d) like the other elements [default]; (b) it can be treated as an impulse at DC ['z' option];
0090 % (c) it can be treated as a mixture of an impulse and a normal component whose value equals that of the adjacent frequency
0091 % bin ['Z' option].
0092 %
0093 % Stage 2:
0094 % The filterbank outputs are calculated by integrating the product of the continuous spectrum and a filter weight that is
0095 % triangular in the frequency domain. Optionally, the first filterbank preserves the DC impulse component of the continuous
0096 % spectrum ['q' option].
0097 %
0098 % References:
0099 %
0100 % [1] S. S. Stevens, J. Volkman, and E. B. Newman. A scale for the measurement
0101 %     of the psychological magnitude of pitch. J. Acoust Soc Amer, 8: 185-190, 1937.
0102 % [2] S. Davis and P. Mermelstein. Comparison of parametric representations for
0103 %     monosyllabic word recognition in continuously spoken sentences.
0104 %     IEEE Trans Acoustics Speech and Signal Processing, 28 (4): 357-366, Aug. 1980.
0105 
0106 % Bugs/Suggestions
0107 % (1) default frequencies won't work if the h option is specified
0108 % (2) low default frequency is invalid if the 'l' option is specified
0109 % (3) Add 'z' option to include a DC output as the first coefficient
0110 % (4) Add 'Z' option to ignore the DC input component
0111 % (5) Add 'i' option to calculate the inverse of x instead
0112 % (6) Add option to choose the domain in which linear interpolation is performed
0113 
0114 %      Copyright (C) Mike Brookes 1997-2024
0115 %      Version: $Id: v_filtbankm.m $
0116 %
0117 %   VOICEBOX is a MATLAB toolbox for speech processing.
0118 %   Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
0119 %
0120 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0121 %   This program is free software; you can redistribute it and/or modify
0122 %   it under the terms of the GNU General Public License as published by
0123 %   the Free Software Foundation; either version 2 of the License, or
0124 %   (at your option) any later version.
0125 %
0126 %   This program is distributed in the hope that it will be useful,
0127 %   but WITHOUT ANY WARRANTY; without even the implied warranty of
0128 %   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
0129 %   GNU General Public License for more details.
0130 %
0131 %   You can obtain a copy of the GNU General Public License from
0132 %   http://www.gnu.org/copyleft/gpl.html or by writing to
0133 %   Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
0134 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0135 
0136 % Notes:
0137 % (1) In the comments, "FFT bin_0" assumes DC = bin 0 whereas "FFT bin_1" means DC = bin 1nfout
0138 % (2) "input" and "output" need to be interchanged if the 'i' option is given
0139 
0140 if nargin<6 || isempty(w)           % if no mode option, w, is specified
0141     w='f';                          % default mode option: 'f' = linear output frequency scale
0142 end
0143 wr=max(any(repmat('lebm',length(w),1)==repmat(w',1,4),1).*(1:4));           % output warping: 0=linear,1=log,2=erbrate,3=bark,4=mel
0144 ww=any(repmat('ncChHxXyYpPzZqdDuUsSgG',length(w),1)==repmat(w',1,22),1);    % decode all other options
0145 % ww elements: 1=n,2=c,3=C,4=h,5=H,6=x,7=X,8=y,9=Y,10=p,11=P,12=z,13=Z,14=q,15=d,16=D,17=u,18=U,19=s,20=S,21=g,22=G
0146 % Convert legacy option codes: 'y'='xX', 'Y'='x', 'yY'='X', 'u'='dD', 'U'='D'
0147 ww(6)=ww(8) ~= ww(9);               % convert 'y' or 'Y' (but not both) to 'x'; extend low frequencies
0148 ww(7)=ww(8);                        % convert 'y' to 'X'; extend high frequencies
0149 ww(15)=ww(17);                      % convert 'u' to 'd'; input is also in power spectral density
0150 ww(16)=ww(17) || ww(18);            % convert 'u' or 'U' to 'd'; output is in power spectral density
0151 flhconv=repmat(wr>0 && ~ww(4),1,2); % flag indicating need to convert filterbank limits from Hz to mel/erb/bark/log10
0152 if nargin < 4 || isempty(fl)
0153     fl=30*(wr==1);                  % lower limit is 0 Hz unless 'l' option specified, in which case it is 30 Hz
0154     flhconv(1)=wr>0;
0155 end
0156 if nargin < 5 || isempty(fh)
0157     fh=0.5*fs;                      % max freq is the nyquist frequency
0158     flhconv(2)=wr>0;
0159 end
0160 %
0161 % Sort out input frequency bins
0162 %
0163 if numel(n)>1
0164     error('non-standard input frequency spacing no longer supported');
0165 else                                % n gives dft length
0166     nf=1+floor(n/2);                % number of input positive-frequency bins from DFT
0167     df=fs/n;                        % input frequency bin spacing
0168 end
0169 %
0170 % Sort out output frequency bins
0171 %
0172 mflh=[fl fh];                       % low and high limits of filterbank triangular filters
0173 if any(flhconv)                     % convert mflh from Hz to mel/erb/... unless already converted via 'h' option
0174     switch wr
0175         case 1                      % 'l' = log scaled
0176             if fl<=0
0177                 error('Low frequency limit must be >0 for ''l'' log10-frequency option');
0178             end
0179             mflh(flhconv)=log10(mflh(flhconv));         % convert frequency limits into log10 Hz
0180         case 2                                          % 'e' = erb-rate scaled
0181             mflh(flhconv)=v_frq2erb(mflh(flhconv));     % convert frequency limits into erb-rate
0182         case 3                                          % 'b' = bark scaled
0183             mflh(flhconv)=v_frq2bark(mflh(flhconv));    % convert frequency limits into bark
0184         case 4                                          % 'm' = mel scaled
0185             mflh(flhconv)=v_frq2mel(mflh(flhconv));     % convert frequency limits into mel
0186     end
0187 end
0188 melrng=mflh(2)-mflh(1);                                 % filterbank range in Hz/mel/erb/...
0189 if isempty(p)
0190     p=ceil(4.6*log10(2*(nf-1)*df));                     % default number of output filters
0191 end
0192 puc=ww(11) || (p<1) && ~ww(10);                         % input p specifies the filter spacing rather than the number of filters
0193 if puc
0194     p=round(melrng/(p*1000))+ww(2)+ww(3)-1+ww(14);      % p now gives the number of filters (excluding DC impulse)
0195 end
0196 melinc=melrng/(p+ww(2)+ww(3)+1-ww(14));                 % inter-filter increment in mel
0197 mflh=mflh+[-ww(2) ww(3)]*melinc;                        % update mflh to include the full width of all filters
0198 %
0199 % Calculate the output centre frequencies in Hz including dummy end points
0200 %
0201 pmq=p-ww(14);                                       % number of filters excluding the one for the DC impulse
0202 cf=mflh(1)+(0:pmq+1)*melinc;                        % centre frequencies in mel/erb/... including dummy ends
0203 cf(2:end)=max(cf(2:end),0);                         % only the first point can be negative [*** doesn't make sense for log scale ***]
0204 switch wr                                           % convert centre frequencies to Hz from mel/erb/...
0205     case 1                                          % 'l' = log scaled
0206         mb=10.^(cf);
0207     case 2                                          % 'e' = erb-rate scaled
0208         mb=v_erb2frq(cf);
0209     case 3                                          % 'b' = bark scaled
0210         mb=v_bark2frq(cf);
0211     case 4                                          % 'm' = mel scaled
0212         mb=v_mel2frq(cf);
0213     otherwise                                       % [default] = linear scaled; no conversion needed
0214         mb=cf;
0215 end
0216 %
0217 % sort out 2-sided input frequencies
0218 %
0219 fin=(-nf:nf)*df;                                    % reflect negative frequencies excluding DC
0220 nfin=length(fin);                                   % length of extended input frequency list         [nfin=2*nf+1]
0221 %
0222 % now sort out the list of output frequencies
0223 %
0224 fout=mb;                                            % output centre frequencies in Hz including dummy values at each end
0225 lowex=ww(6);                                        % extend to 0 Hz if 'x' given
0226 highex=ww(7) && (fout(end-1)<fin(end));             % extend at high end if 'X' specified and final centre frequency < Nyquist
0227 if lowex                                            % extend first filter at low end to DC
0228     fout=[0 0 fout(2:end)];                         % ... add two dummy values at DC instead of previous single dummy value
0229 end
0230 if highex                                           % extend last filter at high end to Nyquist
0231     fout=[fout(1:end-1) fin(end) fin(end)];         % ... add two dummy values at Nyquist instead of previous single dummy value
0232 end
0233 fout=min(fout,fs/2);                                % limit output filters to Nyquist frequency
0234 nfout=length(fout);                                 % number of output filters including one or two dummy points at each end
0235 foutin=[fout fin];
0236 nfall=length(foutin);                               % = nfout + nfin
0237 wleft=[0 fout(2:nfout)-fout(1:nfout-1) 0 fin(2:nfin)-fin(1:nfin-1)]; % width of lower triangle attached to each node
0238 wright=[wleft(2:end) 0];                            % width of upper triangle attached to each node
0239 ffact=[0 ones(1,nfout-2) 0 0 ones(1,2*nf-1) 0];     % valid triangle posts
0240 ffact(wleft+wright==0)=0;                           % disable null width triangles (*** probably unnecessary if all frequencies are distinct ***)
0241 [fall,ifall]=sort(foutin);                          % fall is sorted frequencies with fall=foutin(ifall)
0242 jfall=zeros(1,nfall);                               % create inverse index ...
0243 infall=1:nfall;                                     % ...
0244 jfall(ifall)=infall;                                % ... inverse-index satisfying foutin=fall(jfall)
0245 ffact(ifall([1:max(jfall(1),jfall(nfout+1))-2 min(jfall(nfout),jfall(nfall))+2:nfall]))=0;  % zap input nodes that lie outside the output filters
0246 nxto=cumsum(ifall<=nfout);                          % next output node to the right (or equal) to each node
0247 nxti=cumsum(ifall>nfout);                           % number of input nodes to the left (or equal) to each node
0248 nxtr=min(nxti+1+nfout,nfall);                       % next input node to the right of each value (or nfall if none)
0249 nxtr(ifall>nfout)=1+nxto(ifall>nfout);              % next post to the right of opposite input/output type (using sorted indexes)
0250 nxtr=nxtr(jfall);                                   % next post to the right of opposite input/output type (converted to unsorted indices) or if none: nfall or (nfout+1)
0251 %
0252 % The interpolated spectrum at any frequency can be expressed as the sum of the values at the adjacent input bins
0253 % multiplied by triangular weights that decreases from 1 to 0 between the two bins. The value at an output bin
0254 % is equal to the integral of the interpolated spectrum multiplied by a triangular weight that decreases from
0255 % 1 to 0 either side of the output bin. Thus, if all input/output bins are sorted into ascending order, the
0256 % interval between two adjacent bins contains four partial triangles (a.k.a. trapeziums): two "lower" triangles
0257 % that increase with frequency and two "upper" triangles that decrease with frequency. We need to integrate the
0258 % resultant four input-output trapezium products and add the integrals onto the sum for the appropriate output bins.
0259 % Each triangle has a "post" at one end and is zero at the other end; we enumerate the triangle pairs by pairing
0260 % all input and output triangles with the first available triangle of the other type (i.e. output or input) whose
0261 % rightmost node is to the right of the entire first triangle.
0262 %
0263 % The general result for integrating the product of two trapesiums with
0264 % heights (a,b) and (c,d) over a width x is (ad+bc+2bd+2ac)*x/6
0265 %
0266 % integrate product of lower triangles whose posts (and rightmost nodes) are ix1 and jx1
0267 %
0268 msk0=(ffact>0);                                     % posts with a non-zero magnitude
0269 msk=msk0 & (ffact(nxtr)>0);                         % select triangle pairs with both posts having non-zero magnitudes
0270 ix1=infall(msk);                                    % unsorted indices of leftmost post of pair
0271 jx1=nxtr(msk);                                      % unsorted indices of rightmost post of pair
0272 vfgx=foutin(ix1)-foutin(jx1-1);                     % portion of triangle attached to rightmost post that lies to the left of the leftmost post
0273 yx=min(wleft(ix1),vfgx);                            % integration length. Maybe more efficient: dfall=diff(fall); yx=dfall(jfall(ix1)-1)
0274 wx1=ffact(ix1).*ffact(jx1).*yx.*(wleft(ix1).*vfgx-yx.*(0.5*(wleft(ix1)+vfgx)-yx/3))./(wleft(ix1).*wleft(jx1)+(yx==0));
0275 
0276 % integrate product of upper triangles whose posts are ix2 and jx2 and whose rightmost nodes are ix2+1 and jx2+1
0277 
0278 nxtu=max([nxtr(2:end)-1 0],1);                      % post of the upper triangle of opposite type whose rightmost end is to the right of this triangle's rightmost end
0279 msk=msk0 & (ffact(nxtu)>0);                         % select triangle pairs with both posts having non-zero magnitudes
0280 ix2=infall(msk);                                    % unsorted indices of leftmost post of pair
0281 jx2=nxtu(msk);                                      % unsorted indices of rightmost post of pair
0282 vfgx=foutin(ix2+1)-foutin(jx2);                     % length of left triangle to the right of the right post
0283 yx=min(wright(ix2),vfgx);                           % integration length
0284 yx(foutin(jx2+1)<foutin(ix2+1))=0;                  % zap invalid triangles where the rightmost ends are in the wrong order
0285 wx2=ffact(ix2).*ffact(jx2).*yx.^2.*((0.5*(wright(jx2)-vfgx)+yx/3))./(wright(ix2).*wright(jx2)+(yx==0));
0286 
0287 % integrate lower triangle and upper triangle that ends to its right
0288 
0289 nxtu=max(nxtr-1,1);                                 % post of the upper triangle of opposite type whose rightmost end is to the right of this triangle's post
0290 msk=msk0 & (ffact(nxtu)>0);                         % select triangle pairs with both posts having non-zero magnitudes
0291 ix3=infall(msk);                                    % unsorted indices of lower triangle
0292 jx3=nxtu(msk);                                      % unsorted indices of upper triangle
0293 vfgx=foutin(ix3)-foutin(jx3);                       % length of upper triangle to the left of the lower post
0294 yx=min(wleft(ix3),vfgx);                            % integration length
0295 yx(foutin(jx3+1)<foutin(ix3))=0;                    % zap invalid triangles where the rightmost ends are in the wrong order
0296 wx3=ffact(ix3).*ffact(jx3).*yx.*(wleft(ix3).*(wright(jx3)-vfgx)+yx.*(0.5*(wleft(ix3)-wright(jx3)+vfgx)-yx/3))./(wleft(ix3).*wright(jx3)+(yx==0));
0297 
0298 % integrate upper triangle and lower triangle that starts to its right
0299 
0300 nxtu=[nxtr(2:end) 1];
0301 msk=msk0 & (ffact(nxtu)>0);                         % select triangle pairs with both posts having non-zero magnitudes
0302 ix4=infall(msk);                                    % unsorted indices of upper triangle
0303 jx4=nxtu(msk);                                      % unsorted indices of lower triangle
0304 vfgx=foutin(ix4+1)-foutin(jx4-1);                   % length of upper triangle to the left of the lower post
0305 yx=min(wright(ix4),vfgx);                           % integration length
0306 wx4=ffact(ix4).*ffact(jx4).*yx.^2.*(0.5*vfgx-yx/3)./(wright(ix4).*wleft(jx4)+(yx==0));
0307 %
0308 % now assemble the matrix
0309 %
0310 iox=sort([ix1 ix2 ix3 ix4;jx1 jx2 jx3 jx4]);        % iox(1,:) are output posts, iox(2,:) are input posts
0311 msk=iox(2,:)<=(nfall+nfout)/2;                      % find references to negative input frequencies
0312 iox(2,msk)=(nfall+nfout+1)-iox(2,msk);              % convert negative frequencies to positive
0313 %
0314 % deal with impluses at DC
0315 %
0316 % if ww(13)                   % 'Z': DC input is an impulse plus a diffuse component
0317 %     iox(2,iox(2,:)==nfout+nf+1)=nfout+nf+2; % change references to the DC input to the adjacent bin
0318 % end
0319 %
0320 % Sort out output gains:
0321 % If output is power then output gain is 1; if output is power/Hz then output gain is 1/area of output filter
0322 %
0323 if lowex                                            % if lowest filter extended to DC, we added a dummy point at 0Hz, so
0324     iox(1,iox(1,:)==2)=3;                           % merge lowest two output nodes
0325 end
0326 if highex                                           % if highest filter extended, we added a dummy point at Nyquist, so
0327     iox(1,iox(1,:)==nfout-1)=nfout-2;               % merge highest two output nodes
0328 end
0329 x=sparse(iox(1,:)-1-lowex,max(iox(2,:)-nfout-nf,1),[wx1 wx2 wx3 wx4],pmq,nf);   % forward transformation matrix without input/output gains
0330 gout=full(sum(x,2));                                % area of each output integral
0331 goutd=sparse(1:pmq,1:pmq,(gout+(gout==0)).^(-1));   % create sparse diagonal matrix of output gains
0332 %
0333 % Sort out input gains:
0334 % If input is power then input gain is 1/area; if input is power/Hz then input gain is 1
0335 %
0336 gin=fin(3:nfin)-fin(1:nfin-2);                              % full width of input interpolation filters
0337 gin=2*(gin+(gin==0)).^(-1);                                 % input gain equals 1/area
0338 ginsi=repmat(1+ww(19),1,nf-2);                        % 's' option means all inputs except DC and Nyquist have been doubled
0339 ginsd=sparse(1:nf,1:nf,[1-ww(12) ginsi.^(-1) 1]);                        % ... cancel this out with additional input scaling for forward transform
0340 ginsid=sparse(1:nf,1:nf,[2*(1-ww(12)) ginsi 2]);     % and back again for inverse transform
0341 gind=sparse(1:nf,1:nf,gin(end-nf+1:end));                   % input gains
0342 %
0343 % Now create the x and xi matrices
0344 %
0345 switch 2*ww(16)+ww(15)
0346     case 0                          % '': input and output are both power
0347         xi=ginsid*x'*goutd;
0348         x=x*(gind*ginsd);
0349     case 1                          % 'd': input is power/Hz, output is power
0350         xi=(ginsid*gind)*x'*goutd;
0351         x=x*ginsd;
0352     case 2                          % 'D': input is power, output is power/Hz
0353         xi=ginsid*x';
0354         x=goutd*x*(gind*ginsd);
0355     case 3                              % 'dD': input and output are both power/Hz
0356         xi=(ginsid*gind)*x';
0357         x=goutd*x*ginsd;
0358 end
0359 if ww(20)                               % 'S': double outputs to include negative frequency energy
0360     x=2*x;
0361     xi=0.5*xi;
0362 end
0363 if ww(13)                               % 'Z': DC input is an impulse plus a diffuse component
0364     x(:,2)=x(:,2)+x(:,1)*ginsd(2,2);    % power of diffuse component at DC is equal to that opf adjacent bin corrected for 's' option
0365     x(:,1)=0;                           % Eliminate references to DC input in forward transform only
0366 end
0367 if ww(14)                               % 'q': we need an extra output that replicates the DC component
0368     if ww(12)                           % 'z': DC input is an impulse
0369         x=[sparse(1,1,1,1,nf); x];
0370         xi=[sparse(1,1,1,nf,1) xi];
0371     elseif ww(13)                       % 'Z': DC input is an impulse plus a diffuse component
0372         x=[sparse([1 1],[1 2],[1 -ginsd(2,2)],1,nf); x]; % impulse component is DC input minus adjacent bin corrected for 's' option
0373         xi=[sparse(1,1,1,nf,1) xi];
0374     else
0375         x=[sparse(1,nf); x];            % '': DC input is diffuse as normal
0376         xi=[sparse(nf,1) xi];
0377     end
0378 end
0379 %
0380 % sort out the output argument options
0381 %
0382 if ~ww(5)                                   % output cf in Hz instead of mel/erb/...
0383     cf=[zeros(1,ww(14)) mb(2:pmq+1)];       % ... and include an initial 0 if 'q' option (ww(14)==1)
0384 else                                        % keep cf in mel/erb/...
0385     if ww(14)                               % 'q' (ww(14)==1): we need an extra output for the DC component
0386         if wr==1                            % log-scaled so ...
0387             cf=[-Inf cf(2:p)];              % ... DC corresponds to -Inf
0388         else                                % not log-scaled ...
0389             cf=[0 cf(2:p)];                 % ... DC corresponds to 0
0390         end
0391     else                                    % no 'q' option (ww(14)==0) ...
0392         cf=cf(2:p+1);                       % ... just remove dummy end frequencies
0393     end 
0394 end
0395 il=1;
0396 ih=nf;
0397 if ww(1)                                    % round outputs to the centre of gravity bin
0398     sx2=sum(x,2);                           % sum of each row
0399     msk=full(sx2~=0);
0400     vxc=zeros(pmq,1);
0401     vxc(msk)=round((x(msk,:)*(1:nf)')./sx2(msk));   % find centre of gravity of each row
0402     x=sparse(1:pmq,vxc,sx2,pmq,nf);             % put all the weight into the centre of gravity bin
0403 end
0404 if nargout > 3                              % if il and/or ih output arguments are specified ...
0405     msk=full(any(x>0,1));                   % find non-zero columns
0406     il=find(msk,1);                         % il is first non-zero column
0407     if ~numel(il)                           % if x is all zeros ...
0408         ih=1;                               % ... set ih to 1 ********   should we also set il to one?
0409     elseif nargout >3
0410         ih=find(msk,1,'last');              % ih is last non-zero column
0411     end
0412     x=x(:,il:ih);                           % remove redundant columns from x
0413     if nargout==4                           % xi has been omitted ...
0414         xi=il;                              % shift the il and ih outputs up by one position
0415         il=ih;
0416     else
0417         xi=x(il:ih,:);                      % remove redundant rows from xi
0418     end
0419 end
0420 % This bit commented out because it should be redundant
0421 % if ww(15) && ww(16)               % if input and output are both in power per Hz
0422 %     sx=sum(x,2);
0423 %     x=x./repmat(sx+(sx==0),1,size(x,2));    % force rows to sum to 1 (should be true anyway)
0424 % end
0425 %
0426 % plot results if no output arguments or 'g','G' options given
0427 %
0428 if ~nargout || ww(21) || ww(22)                 % plot idealized filters
0429     ww(22)=~ww(21);                             % 'G' option is the default unless 'g' is specified
0430     finax=(il-1:ih-1)*df;                       % input frequency axis
0431     newfig=0;
0432     if  ww(21)
0433         plot(finax,x'); 
0434         hold on  
0435         plot(finax,sum(x,1),'--k');
0436         v_axisenlarge([-1 -1.05]);
0437         plot(repmat(mb(2:end-1),2,1),get(gca,'ylim'),':k');
0438         hold off
0439         title(['filtbankm: mode = ''' w '''']);
0440         xlabel(['Frequency (' v_xticksi 'Hz)']);
0441         ylabel('Weight');
0442         newfig=1;
0443     end
0444     if  ww(22)
0445         if newfig
0446             figure;
0447         end
0448         imagesc(finax,1:pmq,x);
0449         axis 'xy'
0450         colorbar;
0451         hold on
0452         ylim=get(gca,'ylim');
0453         plot(repmat(mb(2:end-1),2,1),ylim,':w');
0454         hold off
0455         v_cblabel('Weight');
0456         switch wr
0457             case 1
0458                 type='Log-spaced';
0459             case 2
0460                 type='Erb-spaced';
0461             case 3
0462                 type='Bark-spaced';
0463             case 4
0464                 type='Mel-spaced';
0465             otherwise
0466                 type='Linear-spaced';
0467         end
0468         ylabel([type ' Filter']);
0469         xlabel(['Frequency (' v_xticksi 'Hz)']);
0470         title(['filtbankm: mode = ''' w '''']);
0471     end
0472 
0473 end

Generated by m2html © 2003