V_FILTBANKM determine matrix for a linear/mel/erb/bark-spaced v_filterbank [X,IL,IH]=(P,N,FS,FL,FH,W) Usage: (1) Calcuate the Mel-frequency Cepstral Coefficients f=v_rfft(s); % v_rfft() returns only 1+floor(n/2) coefficients x=v_filtbankm(p,n,fs,0,fs/2,'m'); % n is the fft length, p is the number of filters wanted z=log(x*abs(f).^2); % multiply x by the power spectrum c=dct(z); % take the DCT (2) Calcuate the Mel-frequency Cepstral Coefficients efficiently f=fft(s); % n is the fft length, p is the number of filters wanted [x,cf,na,nb]=v_filtbankm(p,n,fs,0,fs/2,'m'); % na:nb gives the fft bins that are needed z=log(x*(f(na:nb)).*conj(f(na:nb))); % multiply x by the power spectrum c=dct(z); % take the DCT (3) Plot the calculated filterbanks as a graph plot((0:floor(n/2))*fs/n,v_filtbankm(p,n,fs,0,fs/2,'m')') % fs=sample frequency (4) Plot the calculated filterbanks as a spectrogram v_filtbankm(p,n,fs,0,fs/2,'m'); Inputs: p number of filters in v_filterbank or the filter spacing in k-mel/bark/erb (see 'p' and 'P' options) [ceil(4.6*log10(fs))] n length of dft fs sample rate in Hz fl low end of the lowest filter in Hz (or in mel/erb/bark/log10 with 'h' option) [default = 0Hz or, if 'l' option given, 30Hz] fh high end of highest filter in Hz (or in mel/erb/bark/log10 with 'h' option) [default = fs/2] w any sensible combination of the following: 'b' = bark scale 'e' = erb-rate scale 'l' = log10 Hz frequency scale 'f' = linear frequency scale [default] 'm' = mel frequency scale 'n' = round to the nearest FFT bin so each row of x contains only one non-zero entry 'c' = fl specifies centre of low filters instead of low edge 'C' = fh specifies centre of high filter instead of high edge 'h' = fl & fh are in mel/erb/bark/log10 instead of Hz 'H' = give cf outputs in mel/erb/bark/log10 instead of Hz 'x' = lowest filter remains at 1 down to 0 frequency 'X' = highest filter remains at 1 up to nyquist freqency 'p' = input p specifies the number of filters [default if p>=1] 'P' = input p specifies the approximate filter spacing in kHz/kmel/... [default if p<1] 'z' = Treat input power spectrum at 0Hz as an impulse rather than being diffuse 'Z' = Treat input power spectrum at 0Hz as the sum of an impulse and a continuous component with the same amlitude as the adjacent bin 'q' = The first output filter gives the power of the impulse at 0Hz (regardless of the 'D' option). 'zq' ensures exact retention of DC component by zi*z 'd' = input is power spectral density (power per Hz) instead of power 'D' = output is power spectral density (power per Hz) instead of power 's' = single-sided input: do not add power from symmetric negative frequencies (i.e. non-DC/Nyquist inputs have already been doubled) 'S' = single-sided output: include power from both positive and negative frequencies (this doubles the non-DC/Nyquist outputs) ['w' = size(x,2)=size(xi,1)=n rather than 1+floor(n/2) although the rightmost half of x is all zeros] not yet implemented 'g' = plot filter coefficients as graph 'G' = plot filter coefficients as spectrogram image [default if no output arguments present] Legacy options: 'y'='xX', 'Y'='x', 'yY'='X', 'u'='dD', 'U'='D' Outputs: x(p,k) a sparse matrix containing the v_filterbank amplitudes If the il and ih output arguments are included then k=ih-il+1 otherwise k=1+floor(n/2) Note that, with the 'S' option, the peak filter values equal 2 to account for the energy in the negative FFT frequencies. cf(p) the v_filterbank centre frequencies in Hz (or in mel/erb/bark/log10 with 'H' option) xi(k,p) [optional] sparse matrix that is an approximate inverse of x il the lowest fft bin with a non-zero coefficient ih the highest fft bin with a non-zero coefficient The input power will be preserved if the options 'xXS' are given The output of the routine is a sparse filterbank matrix. The vector output of the filterbank can then be obtained by pre-multiplying an input power spectrum vector by the filterbank matrix. The input and output vectors can optionally be in either the power domain or the power spectral density domain. The routine implements the filterbank in two conceptual stages (which are merged in the practical implementation): Stage 1: The discrete input spectrum is converted to a continuous power spectral density using linear interpolation in frequency. Each element of the input spectrum influences a frequency interval of width 2d where d is the input frequency bin width. The DC component of the input is treated specially in one of three ways: (a) it can be treated as a normal element that influences an interval (-d,+d) like the other elements [default]; (b) it can be treated as an impulse at DC ['z' option]; (c) it can be treated as a mixture of an impulse and a normal component whose value equals that of the adjacent frequency bin ['Z' option]. Stage 2: The filterbank outputs are calculated by integrating the product of the continuous spectrum and a filter weight that is triangular in the frequency domain. Optionally, the first filterbank preserves the DC impulse component of the continuous spectrum ['q' option]. References: [1] S. S. Stevens, J. Volkman, and E. B. Newman. A scale for the measurement of the psychological magnitude of pitch. J. Acoust Soc Amer, 8: 185-190, 1937. [2] S. Davis and P. Mermelstein. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans Acoustics Speech and Signal Processing, 28 (4): 357-366, Aug. 1980.
0001 function [x,cf,xi,il,ih]=v_filtbankm(p,n,fs,fl,fh,w) 0002 %V_FILTBANKM determine matrix for a linear/mel/erb/bark-spaced v_filterbank [X,IL,IH]=(P,N,FS,FL,FH,W) 0003 % 0004 % Usage: 0005 % (1) Calcuate the Mel-frequency Cepstral Coefficients 0006 % 0007 % f=v_rfft(s); % v_rfft() returns only 1+floor(n/2) coefficients 0008 % x=v_filtbankm(p,n,fs,0,fs/2,'m'); % n is the fft length, p is the number of filters wanted 0009 % z=log(x*abs(f).^2); % multiply x by the power spectrum 0010 % c=dct(z); % take the DCT 0011 % 0012 % (2) Calcuate the Mel-frequency Cepstral Coefficients efficiently 0013 % 0014 % f=fft(s); % n is the fft length, p is the number of filters wanted 0015 % [x,cf,na,nb]=v_filtbankm(p,n,fs,0,fs/2,'m'); % na:nb gives the fft bins that are needed 0016 % z=log(x*(f(na:nb)).*conj(f(na:nb))); % multiply x by the power spectrum 0017 % c=dct(z); % take the DCT 0018 % 0019 % (3) Plot the calculated filterbanks as a graph 0020 % 0021 % plot((0:floor(n/2))*fs/n,v_filtbankm(p,n,fs,0,fs/2,'m')') % fs=sample frequency 0022 % 0023 % (4) Plot the calculated filterbanks as a spectrogram 0024 % 0025 % v_filtbankm(p,n,fs,0,fs/2,'m'); 0026 % 0027 % Inputs: 0028 % p number of filters in v_filterbank or the filter spacing in k-mel/bark/erb (see 'p' and 'P' options) [ceil(4.6*log10(fs))] 0029 % n length of dft 0030 % fs sample rate in Hz 0031 % fl low end of the lowest filter in Hz (or in mel/erb/bark/log10 with 'h' option) [default = 0Hz or, if 'l' option given, 30Hz] 0032 % fh high end of highest filter in Hz (or in mel/erb/bark/log10 with 'h' option) [default = fs/2] 0033 % w any sensible combination of the following: 0034 % 0035 % 'b' = bark scale 0036 % 'e' = erb-rate scale 0037 % 'l' = log10 Hz frequency scale 0038 % 'f' = linear frequency scale [default] 0039 % 'm' = mel frequency scale 0040 % 0041 % 'n' = round to the nearest FFT bin so each row of x contains only one non-zero entry 0042 % 0043 % 'c' = fl specifies centre of low filters instead of low edge 0044 % 'C' = fh specifies centre of high filter instead of high edge 0045 % 'h' = fl & fh are in mel/erb/bark/log10 instead of Hz 0046 % 'H' = give cf outputs in mel/erb/bark/log10 instead of Hz 0047 % 0048 % 'x' = lowest filter remains at 1 down to 0 frequency 0049 % 'X' = highest filter remains at 1 up to nyquist freqency 0050 % 0051 % 'p' = input p specifies the number of filters [default if p>=1] 0052 % 'P' = input p specifies the approximate filter spacing in kHz/kmel/... [default if p<1] 0053 % 0054 % 'z' = Treat input power spectrum at 0Hz as an impulse rather than being diffuse 0055 % 'Z' = Treat input power spectrum at 0Hz as the sum of an impulse and a continuous component with the same amlitude as the adjacent bin 0056 % 'q' = The first output filter gives the power of the impulse at 0Hz (regardless of the 'D' option). 'zq' ensures exact retention of DC component by zi*z 0057 % 0058 % 'd' = input is power spectral density (power per Hz) instead of power 0059 % 'D' = output is power spectral density (power per Hz) instead of power 0060 % 0061 % 's' = single-sided input: do not add power from symmetric negative frequencies (i.e. non-DC/Nyquist inputs have already been doubled) 0062 % 'S' = single-sided output: include power from both positive and negative frequencies (this doubles the non-DC/Nyquist outputs) 0063 % ['w' = size(x,2)=size(xi,1)=n rather than 1+floor(n/2) although the rightmost half of x is all zeros] not yet implemented 0064 % 0065 % 'g' = plot filter coefficients as graph 0066 % 'G' = plot filter coefficients as spectrogram image [default if no output arguments present] 0067 % 0068 % Legacy options: 'y'='xX', 'Y'='x', 'yY'='X', 'u'='dD', 'U'='D' 0069 % 0070 % Outputs: x(p,k) a sparse matrix containing the v_filterbank amplitudes 0071 % If the il and ih output arguments are included then k=ih-il+1 otherwise k=1+floor(n/2) 0072 % Note that, with the 'S' option, the peak filter values equal 2 to account for the energy in the negative FFT frequencies. 0073 % cf(p) the v_filterbank centre frequencies in Hz (or in mel/erb/bark/log10 with 'H' option) 0074 % xi(k,p) [optional] sparse matrix that is an approximate inverse of x 0075 % il the lowest fft bin with a non-zero coefficient 0076 % ih the highest fft bin with a non-zero coefficient 0077 % 0078 % The input power will be preserved if the options 'xXS' are given 0079 % 0080 % The output of the routine is a sparse filterbank matrix. The vector output of the filterbank can then be obtained 0081 % by pre-multiplying an input power spectrum vector by the filterbank matrix. The input and output vectors can optionally 0082 % be in either the power domain or the power spectral density domain. 0083 % The routine implements the filterbank in two conceptual stages (which are merged in the practical implementation): 0084 % 0085 % Stage 1: 0086 % The discrete input spectrum is converted to a continuous power spectral density using linear interpolation in frequency. 0087 % Each element of the input spectrum influences a frequency interval of width 2d where d is the input frequency bin width. 0088 % The DC component of the input is treated specially in one of three ways: (a) it can be treated as a normal element that 0089 % influences an interval (-d,+d) like the other elements [default]; (b) it can be treated as an impulse at DC ['z' option]; 0090 % (c) it can be treated as a mixture of an impulse and a normal component whose value equals that of the adjacent frequency 0091 % bin ['Z' option]. 0092 % 0093 % Stage 2: 0094 % The filterbank outputs are calculated by integrating the product of the continuous spectrum and a filter weight that is 0095 % triangular in the frequency domain. Optionally, the first filterbank preserves the DC impulse component of the continuous 0096 % spectrum ['q' option]. 0097 % 0098 % References: 0099 % 0100 % [1] S. S. Stevens, J. Volkman, and E. B. Newman. A scale for the measurement 0101 % of the psychological magnitude of pitch. J. Acoust Soc Amer, 8: 185-190, 1937. 0102 % [2] S. Davis and P. Mermelstein. Comparison of parametric representations for 0103 % monosyllabic word recognition in continuously spoken sentences. 0104 % IEEE Trans Acoustics Speech and Signal Processing, 28 (4): 357-366, Aug. 1980. 0105 0106 % Bugs/Suggestions 0107 % (1) default frequencies won't work if the h option is specified 0108 % (2) low default frequency is invalid if the 'l' option is specified 0109 % (3) Add 'z' option to include a DC output as the first coefficient 0110 % (4) Add 'Z' option to ignore the DC input component 0111 % (5) Add 'i' option to calculate the inverse of x instead 0112 % (6) Add option to choose the domain in which linear interpolation is performed 0113 0114 % Copyright (C) Mike Brookes 1997-2024 0115 % Version: $Id: v_filtbankm.m $ 0116 % 0117 % VOICEBOX is a MATLAB toolbox for speech processing. 0118 % Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html 0119 % 0120 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0121 % This program is free software; you can redistribute it and/or modify 0122 % it under the terms of the GNU General Public License as published by 0123 % the Free Software Foundation; either version 2 of the License, or 0124 % (at your option) any later version. 0125 % 0126 % This program is distributed in the hope that it will be useful, 0127 % but WITHOUT ANY WARRANTY; without even the implied warranty of 0128 % MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 0129 % GNU General Public License for more details. 0130 % 0131 % You can obtain a copy of the GNU General Public License from 0132 % http://www.gnu.org/copyleft/gpl.html or by writing to 0133 % Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA. 0134 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0135 0136 % Notes: 0137 % (1) In the comments, "FFT bin_0" assumes DC = bin 0 whereas "FFT bin_1" means DC = bin 1nfout 0138 % (2) "input" and "output" need to be interchanged if the 'i' option is given 0139 0140 if nargin<6 || isempty(w) % if no mode option, w, is specified 0141 w='f'; % default mode option: 'f' = linear output frequency scale 0142 end 0143 wr=max(any(repmat('lebm',length(w),1)==repmat(w',1,4),1).*(1:4)); % output warping: 0=linear,1=log,2=erbrate,3=bark,4=mel 0144 ww=any(repmat('ncChHxXyYpPzZqdDuUsSgG',length(w),1)==repmat(w',1,22),1); % decode all other options 0145 % ww elements: 1=n,2=c,3=C,4=h,5=H,6=x,7=X,8=y,9=Y,10=p,11=P,12=z,13=Z,14=q,15=d,16=D,17=u,18=U,19=s,20=S,21=g,22=G 0146 % Convert legacy option codes: 'y'='xX', 'Y'='x', 'yY'='X', 'u'='dD', 'U'='D' 0147 ww(6)=ww(8) ~= ww(9); % convert 'y' or 'Y' (but not both) to 'x'; extend low frequencies 0148 ww(7)=ww(8); % convert 'y' to 'X'; extend high frequencies 0149 ww(15)=ww(17); % convert 'u' to 'd'; input is also in power spectral density 0150 ww(16)=ww(17) || ww(18); % convert 'u' or 'U' to 'd'; output is in power spectral density 0151 flhconv=repmat(wr>0 && ~ww(4),1,2); % flag indicating need to convert filterbank limits from Hz to mel/erb/bark/log10 0152 if nargin < 4 || isempty(fl) 0153 fl=30*(wr==1); % lower limit is 0 Hz unless 'l' option specified, in which case it is 30 Hz 0154 flhconv(1)=wr>0; 0155 end 0156 if nargin < 5 || isempty(fh) 0157 fh=0.5*fs; % max freq is the nyquist frequency 0158 flhconv(2)=wr>0; 0159 end 0160 % 0161 % Sort out input frequency bins 0162 % 0163 if numel(n)>1 0164 error('non-standard input frequency spacing no longer supported'); 0165 else % n gives dft length 0166 nf=1+floor(n/2); % number of input positive-frequency bins from DFT 0167 df=fs/n; % input frequency bin spacing 0168 end 0169 % 0170 % Sort out output frequency bins 0171 % 0172 mflh=[fl fh]; % low and high limits of filterbank triangular filters 0173 if any(flhconv) % convert mflh from Hz to mel/erb/... unless already converted via 'h' option 0174 switch wr 0175 case 1 % 'l' = log scaled 0176 if fl<=0 0177 error('Low frequency limit must be >0 for ''l'' log10-frequency option'); 0178 end 0179 mflh(flhconv)=log10(mflh(flhconv)); % convert frequency limits into log10 Hz 0180 case 2 % 'e' = erb-rate scaled 0181 mflh(flhconv)=v_frq2erb(mflh(flhconv)); % convert frequency limits into erb-rate 0182 case 3 % 'b' = bark scaled 0183 mflh(flhconv)=v_frq2bark(mflh(flhconv)); % convert frequency limits into bark 0184 case 4 % 'm' = mel scaled 0185 mflh(flhconv)=v_frq2mel(mflh(flhconv)); % convert frequency limits into mel 0186 end 0187 end 0188 melrng=mflh(2)-mflh(1); % filterbank range in Hz/mel/erb/... 0189 if isempty(p) 0190 p=ceil(4.6*log10(2*(nf-1)*df)); % default number of output filters 0191 end 0192 puc=ww(11) || (p<1) && ~ww(10); % input p specifies the filter spacing rather than the number of filters 0193 if puc 0194 p=round(melrng/(p*1000))+ww(2)+ww(3)-1+ww(14); % p now gives the number of filters (excluding DC impulse) 0195 end 0196 melinc=melrng/(p+ww(2)+ww(3)+1-ww(14)); % inter-filter increment in mel 0197 mflh=mflh+[-ww(2) ww(3)]*melinc; % update mflh to include the full width of all filters 0198 % 0199 % Calculate the output centre frequencies in Hz including dummy end points 0200 % 0201 pmq=p-ww(14); % number of filters excluding the one for the DC impulse 0202 cf=mflh(1)+(0:pmq+1)*melinc; % centre frequencies in mel/erb/... including dummy ends 0203 cf(2:end)=max(cf(2:end),0); % only the first point can be negative [*** doesn't make sense for log scale ***] 0204 switch wr % convert centre frequencies to Hz from mel/erb/... 0205 case 1 % 'l' = log scaled 0206 mb=10.^(cf); 0207 case 2 % 'e' = erb-rate scaled 0208 mb=v_erb2frq(cf); 0209 case 3 % 'b' = bark scaled 0210 mb=v_bark2frq(cf); 0211 case 4 % 'm' = mel scaled 0212 mb=v_mel2frq(cf); 0213 otherwise % [default] = linear scaled; no conversion needed 0214 mb=cf; 0215 end 0216 % 0217 % sort out 2-sided input frequencies 0218 % 0219 fin=(-nf:nf)*df; % reflect negative frequencies excluding DC 0220 nfin=length(fin); % length of extended input frequency list [nfin=2*nf+1] 0221 % 0222 % now sort out the list of output frequencies 0223 % 0224 fout=mb; % output centre frequencies in Hz including dummy values at each end 0225 lowex=ww(6); % extend to 0 Hz if 'x' given 0226 highex=ww(7) && (fout(end-1)<fin(end)); % extend at high end if 'X' specified and final centre frequency < Nyquist 0227 if lowex % extend first filter at low end to DC 0228 fout=[0 0 fout(2:end)]; % ... add two dummy values at DC instead of previous single dummy value 0229 end 0230 if highex % extend last filter at high end to Nyquist 0231 fout=[fout(1:end-1) fin(end) fin(end)]; % ... add two dummy values at Nyquist instead of previous single dummy value 0232 end 0233 fout=min(fout,fs/2); % limit output filters to Nyquist frequency 0234 nfout=length(fout); % number of output filters including one or two dummy points at each end 0235 foutin=[fout fin]; 0236 nfall=length(foutin); % = nfout + nfin 0237 wleft=[0 fout(2:nfout)-fout(1:nfout-1) 0 fin(2:nfin)-fin(1:nfin-1)]; % width of lower triangle attached to each node 0238 wright=[wleft(2:end) 0]; % width of upper triangle attached to each node 0239 ffact=[0 ones(1,nfout-2) 0 0 ones(1,2*nf-1) 0]; % valid triangle posts 0240 ffact(wleft+wright==0)=0; % disable null width triangles (*** probably unnecessary if all frequencies are distinct ***) 0241 [fall,ifall]=sort(foutin); % fall is sorted frequencies with fall=foutin(ifall) 0242 jfall=zeros(1,nfall); % create inverse index ... 0243 infall=1:nfall; % ... 0244 jfall(ifall)=infall; % ... inverse-index satisfying foutin=fall(jfall) 0245 ffact(ifall([1:max(jfall(1),jfall(nfout+1))-2 min(jfall(nfout),jfall(nfall))+2:nfall]))=0; % zap input nodes that lie outside the output filters 0246 nxto=cumsum(ifall<=nfout); % next output node to the right (or equal) to each node 0247 nxti=cumsum(ifall>nfout); % number of input nodes to the left (or equal) to each node 0248 nxtr=min(nxti+1+nfout,nfall); % next input node to the right of each value (or nfall if none) 0249 nxtr(ifall>nfout)=1+nxto(ifall>nfout); % next post to the right of opposite input/output type (using sorted indexes) 0250 nxtr=nxtr(jfall); % next post to the right of opposite input/output type (converted to unsorted indices) or if none: nfall or (nfout+1) 0251 % 0252 % The interpolated spectrum at any frequency can be expressed as the sum of the values at the adjacent input bins 0253 % multiplied by triangular weights that decreases from 1 to 0 between the two bins. The value at an output bin 0254 % is equal to the integral of the interpolated spectrum multiplied by a triangular weight that decreases from 0255 % 1 to 0 either side of the output bin. Thus, if all input/output bins are sorted into ascending order, the 0256 % interval between two adjacent bins contains four partial triangles (a.k.a. trapeziums): two "lower" triangles 0257 % that increase with frequency and two "upper" triangles that decrease with frequency. We need to integrate the 0258 % resultant four input-output trapezium products and add the integrals onto the sum for the appropriate output bins. 0259 % Each triangle has a "post" at one end and is zero at the other end; we enumerate the triangle pairs by pairing 0260 % all input and output triangles with the first available triangle of the other type (i.e. output or input) whose 0261 % rightmost node is to the right of the entire first triangle. 0262 % 0263 % The general result for integrating the product of two trapesiums with 0264 % heights (a,b) and (c,d) over a width x is (ad+bc+2bd+2ac)*x/6 0265 % 0266 % integrate product of lower triangles whose posts (and rightmost nodes) are ix1 and jx1 0267 % 0268 msk0=(ffact>0); % posts with a non-zero magnitude 0269 msk=msk0 & (ffact(nxtr)>0); % select triangle pairs with both posts having non-zero magnitudes 0270 ix1=infall(msk); % unsorted indices of leftmost post of pair 0271 jx1=nxtr(msk); % unsorted indices of rightmost post of pair 0272 vfgx=foutin(ix1)-foutin(jx1-1); % portion of triangle attached to rightmost post that lies to the left of the leftmost post 0273 yx=min(wleft(ix1),vfgx); % integration length. Maybe more efficient: dfall=diff(fall); yx=dfall(jfall(ix1)-1) 0274 wx1=ffact(ix1).*ffact(jx1).*yx.*(wleft(ix1).*vfgx-yx.*(0.5*(wleft(ix1)+vfgx)-yx/3))./(wleft(ix1).*wleft(jx1)+(yx==0)); 0275 0276 % integrate product of upper triangles whose posts are ix2 and jx2 and whose rightmost nodes are ix2+1 and jx2+1 0277 0278 nxtu=max([nxtr(2:end)-1 0],1); % post of the upper triangle of opposite type whose rightmost end is to the right of this triangle's rightmost end 0279 msk=msk0 & (ffact(nxtu)>0); % select triangle pairs with both posts having non-zero magnitudes 0280 ix2=infall(msk); % unsorted indices of leftmost post of pair 0281 jx2=nxtu(msk); % unsorted indices of rightmost post of pair 0282 vfgx=foutin(ix2+1)-foutin(jx2); % length of left triangle to the right of the right post 0283 yx=min(wright(ix2),vfgx); % integration length 0284 yx(foutin(jx2+1)<foutin(ix2+1))=0; % zap invalid triangles where the rightmost ends are in the wrong order 0285 wx2=ffact(ix2).*ffact(jx2).*yx.^2.*((0.5*(wright(jx2)-vfgx)+yx/3))./(wright(ix2).*wright(jx2)+(yx==0)); 0286 0287 % integrate lower triangle and upper triangle that ends to its right 0288 0289 nxtu=max(nxtr-1,1); % post of the upper triangle of opposite type whose rightmost end is to the right of this triangle's post 0290 msk=msk0 & (ffact(nxtu)>0); % select triangle pairs with both posts having non-zero magnitudes 0291 ix3=infall(msk); % unsorted indices of lower triangle 0292 jx3=nxtu(msk); % unsorted indices of upper triangle 0293 vfgx=foutin(ix3)-foutin(jx3); % length of upper triangle to the left of the lower post 0294 yx=min(wleft(ix3),vfgx); % integration length 0295 yx(foutin(jx3+1)<foutin(ix3))=0; % zap invalid triangles where the rightmost ends are in the wrong order 0296 wx3=ffact(ix3).*ffact(jx3).*yx.*(wleft(ix3).*(wright(jx3)-vfgx)+yx.*(0.5*(wleft(ix3)-wright(jx3)+vfgx)-yx/3))./(wleft(ix3).*wright(jx3)+(yx==0)); 0297 0298 % integrate upper triangle and lower triangle that starts to its right 0299 0300 nxtu=[nxtr(2:end) 1]; 0301 msk=msk0 & (ffact(nxtu)>0); % select triangle pairs with both posts having non-zero magnitudes 0302 ix4=infall(msk); % unsorted indices of upper triangle 0303 jx4=nxtu(msk); % unsorted indices of lower triangle 0304 vfgx=foutin(ix4+1)-foutin(jx4-1); % length of upper triangle to the left of the lower post 0305 yx=min(wright(ix4),vfgx); % integration length 0306 wx4=ffact(ix4).*ffact(jx4).*yx.^2.*(0.5*vfgx-yx/3)./(wright(ix4).*wleft(jx4)+(yx==0)); 0307 % 0308 % now assemble the matrix 0309 % 0310 iox=sort([ix1 ix2 ix3 ix4;jx1 jx2 jx3 jx4]); % iox(1,:) are output posts, iox(2,:) are input posts 0311 msk=iox(2,:)<=(nfall+nfout)/2; % find references to negative input frequencies 0312 iox(2,msk)=(nfall+nfout+1)-iox(2,msk); % convert negative frequencies to positive 0313 % 0314 % deal with impluses at DC 0315 % 0316 % if ww(13) % 'Z': DC input is an impulse plus a diffuse component 0317 % iox(2,iox(2,:)==nfout+nf+1)=nfout+nf+2; % change references to the DC input to the adjacent bin 0318 % end 0319 % 0320 % Sort out output gains: 0321 % If output is power then output gain is 1; if output is power/Hz then output gain is 1/area of output filter 0322 % 0323 if lowex % if lowest filter extended to DC, we added a dummy point at 0Hz, so 0324 iox(1,iox(1,:)==2)=3; % merge lowest two output nodes 0325 end 0326 if highex % if highest filter extended, we added a dummy point at Nyquist, so 0327 iox(1,iox(1,:)==nfout-1)=nfout-2; % merge highest two output nodes 0328 end 0329 x=sparse(iox(1,:)-1-lowex,max(iox(2,:)-nfout-nf,1),[wx1 wx2 wx3 wx4],pmq,nf); % forward transformation matrix without input/output gains 0330 gout=full(sum(x,2)); % area of each output integral 0331 goutd=sparse(1:pmq,1:pmq,(gout+(gout==0)).^(-1)); % create sparse diagonal matrix of output gains 0332 % 0333 % Sort out input gains: 0334 % If input is power then input gain is 1/area; if input is power/Hz then input gain is 1 0335 % 0336 gin=fin(3:nfin)-fin(1:nfin-2); % full width of input interpolation filters 0337 gin=2*(gin+(gin==0)).^(-1); % input gain equals 1/area 0338 ginsi=repmat(1+ww(19),1,nf-2); % 's' option means all inputs except DC and Nyquist have been doubled 0339 ginsd=sparse(1:nf,1:nf,[1-ww(12) ginsi.^(-1) 1]); % ... cancel this out with additional input scaling for forward transform 0340 ginsid=sparse(1:nf,1:nf,[2*(1-ww(12)) ginsi 2]); % and back again for inverse transform 0341 gind=sparse(1:nf,1:nf,gin(end-nf+1:end)); % input gains 0342 % 0343 % Now create the x and xi matrices 0344 % 0345 switch 2*ww(16)+ww(15) 0346 case 0 % '': input and output are both power 0347 xi=ginsid*x'*goutd; 0348 x=x*(gind*ginsd); 0349 case 1 % 'd': input is power/Hz, output is power 0350 xi=(ginsid*gind)*x'*goutd; 0351 x=x*ginsd; 0352 case 2 % 'D': input is power, output is power/Hz 0353 xi=ginsid*x'; 0354 x=goutd*x*(gind*ginsd); 0355 case 3 % 'dD': input and output are both power/Hz 0356 xi=(ginsid*gind)*x'; 0357 x=goutd*x*ginsd; 0358 end 0359 if ww(20) % 'S': double outputs to include negative frequency energy 0360 x=2*x; 0361 xi=0.5*xi; 0362 end 0363 if ww(13) % 'Z': DC input is an impulse plus a diffuse component 0364 x(:,2)=x(:,2)+x(:,1)*ginsd(2,2); % power of diffuse component at DC is equal to that opf adjacent bin corrected for 's' option 0365 x(:,1)=0; % Eliminate references to DC input in forward transform only 0366 end 0367 if ww(14) % 'q': we need an extra output that replicates the DC component 0368 if ww(12) % 'z': DC input is an impulse 0369 x=[sparse(1,1,1,1,nf); x]; 0370 xi=[sparse(1,1,1,nf,1) xi]; 0371 elseif ww(13) % 'Z': DC input is an impulse plus a diffuse component 0372 x=[sparse([1 1],[1 2],[1 -ginsd(2,2)],1,nf); x]; % impulse component is DC input minus adjacent bin corrected for 's' option 0373 xi=[sparse(1,1,1,nf,1) xi]; 0374 else 0375 x=[sparse(1,nf); x]; % '': DC input is diffuse as normal 0376 xi=[sparse(nf,1) xi]; 0377 end 0378 end 0379 % 0380 % sort out the output argument options 0381 % 0382 if ~ww(5) % output cf in Hz instead of mel/erb/... 0383 cf=[zeros(1,ww(14)) mb(2:pmq+1)]; % ... and include an initial 0 if 'q' option (ww(14)==1) 0384 else % keep cf in mel/erb/... 0385 if ww(14) % 'q' (ww(14)==1): we need an extra output for the DC component 0386 if wr==1 % log-scaled so ... 0387 cf=[-Inf cf(2:p)]; % ... DC corresponds to -Inf 0388 else % not log-scaled ... 0389 cf=[0 cf(2:p)]; % ... DC corresponds to 0 0390 end 0391 else % no 'q' option (ww(14)==0) ... 0392 cf=cf(2:p+1); % ... just remove dummy end frequencies 0393 end 0394 end 0395 il=1; 0396 ih=nf; 0397 if ww(1) % round outputs to the centre of gravity bin 0398 sx2=sum(x,2); % sum of each row 0399 msk=full(sx2~=0); 0400 vxc=zeros(pmq,1); 0401 vxc(msk)=round((x(msk,:)*(1:nf)')./sx2(msk)); % find centre of gravity of each row 0402 x=sparse(1:pmq,vxc,sx2,pmq,nf); % put all the weight into the centre of gravity bin 0403 end 0404 if nargout > 3 % if il and/or ih output arguments are specified ... 0405 msk=full(any(x>0,1)); % find non-zero columns 0406 il=find(msk,1); % il is first non-zero column 0407 if ~numel(il) % if x is all zeros ... 0408 ih=1; % ... set ih to 1 ******** should we also set il to one? 0409 elseif nargout >3 0410 ih=find(msk,1,'last'); % ih is last non-zero column 0411 end 0412 x=x(:,il:ih); % remove redundant columns from x 0413 if nargout==4 % xi has been omitted ... 0414 xi=il; % shift the il and ih outputs up by one position 0415 il=ih; 0416 else 0417 xi=x(il:ih,:); % remove redundant rows from xi 0418 end 0419 end 0420 % This bit commented out because it should be redundant 0421 % if ww(15) && ww(16) % if input and output are both in power per Hz 0422 % sx=sum(x,2); 0423 % x=x./repmat(sx+(sx==0),1,size(x,2)); % force rows to sum to 1 (should be true anyway) 0424 % end 0425 % 0426 % plot results if no output arguments or 'g','G' options given 0427 % 0428 if ~nargout || ww(21) || ww(22) % plot idealized filters 0429 ww(22)=~ww(21); % 'G' option is the default unless 'g' is specified 0430 finax=(il-1:ih-1)*df; % input frequency axis 0431 newfig=0; 0432 if ww(21) 0433 plot(finax,x'); 0434 hold on 0435 plot(finax,sum(x,1),'--k'); 0436 v_axisenlarge([-1 -1.05]); 0437 plot(repmat(mb(2:end-1),2,1),get(gca,'ylim'),':k'); 0438 hold off 0439 title(['filtbankm: mode = ''' w '''']); 0440 xlabel(['Frequency (' v_xticksi 'Hz)']); 0441 ylabel('Weight'); 0442 newfig=1; 0443 end 0444 if ww(22) 0445 if newfig 0446 figure; 0447 end 0448 imagesc(finax,1:pmq,x); 0449 axis 'xy' 0450 colorbar; 0451 hold on 0452 ylim=get(gca,'ylim'); 0453 plot(repmat(mb(2:end-1),2,1),ylim,':w'); 0454 hold off 0455 v_cblabel('Weight'); 0456 switch wr 0457 case 1 0458 type='Log-spaced'; 0459 case 2 0460 type='Erb-spaced'; 0461 case 3 0462 type='Bark-spaced'; 0463 case 4 0464 type='Mel-spaced'; 0465 otherwise 0466 type='Linear-spaced'; 0467 end 0468 ylabel([type ' Filter']); 0469 xlabel(['Frequency (' v_xticksi 'Hz)']); 0470 title(['filtbankm: mode = ''' w '''']); 0471 end 0472 0473 end