V_SSUBMMSE performs speech enhancement using mmse estimate of spectral amplitude or log amplitude [SS,ZO]=(S,FSZ,P) Usage: y=v_ssubmmsev(x,fs); % enhance the speech using default parameters Inputs: si input speech signal fsz sample frequency in Hz Alternatively, the input state from a previous call (see below) pp algorithm parameters [optional] Outputs: ss output enhanced speech gg(t,f,i) selected time-frequency values (see pp.tf below) tt centre of frames (in seconds) ff centre of frequency bins (in Hz) zo output state (or the 2nd argument if gg,tt,ff are omitted) The algorithm operation is controlled by a small number of parameters: pp.of % overlap factor = (fft length)/(frame increment) [2] pp.ti % desired frame increment [0.016 seconds] pp.ri % set to 1 to round ti to the nearest power of 2 samples [0] pp.ta % time const for smoothing SNR estimate [0.396 seconds] pp.gx % maximum posterior SNR as a power ratio [1000 = +30dB] pp.gn % min posterior SNR as a power ratio when estimating prior SNR [1 = 0dB] pp.gz % min posterior SNR as a power ratio [0.001 = -30dB] pp.xn % minimum prior SNR [0] pp.xb % bias compensation factor for prior SNR [1] pp.lg % MMSE target: 0=amplitude, 1=log amplitude, 2=perceptual Bayes [1] pp.tn; % smoothing time constant for noise estimation [0.5 s] pp.le; % VAD threshold: log(p/(1-p)) where p is speech prob in a freq bin; use -Inf to prevent updating [0.15 (=>p=0.54)] pp.tx; % initial noise interval [0.06 s] pp.ne % noise estimation: 0=min statistics, 1=MMSE [0] pp.bt % threshold for binary gain or -1 for continuous gain [-1] pp.mx % input mixture gain [0] pp.rf % round output signal to an exact number of frames [0] pp.tf % selects time-frequency planes to output in the gg() variable ['g'] 'i' = input power spectrum 'I' = input complex spectrum 'n' = noise power spectrum 'z' = "posterior" SNR (i.e. (S+N)/N ) 'x' = "prior" SNR (i.e. S/N ) 'g' = gain 'o' = output power spectrum 'O' = output complex spectrum The applied gain is mx+(1-mx)*optgain where optgain is calculated according to [1] or [2]. If pp.bt>=0 then optgain is first thresholded with pp.bt to produce a binary gain 0 or 1. The default parameters implement the original algorithm in [1,2]. Several parameters relate to the estimation of xi, the so-called "prior SNR", xi=max(a*pp.xb*xu+(1-a)*max(gami-1,pp.gn-1),pp.xn); This is estimated as a smoothed version of 1 less than gami, the "posterior SNR" which is the noisy speech power divided by the noise power. This is clipped to a min of (pp.gn-1), smoothed using a factor "a" which corresponds to a time-constant of pp.ta and then clipped to a minimum of pp.xn. The previous value is taken to be pp.xb*xu where xu is the ratio of the estimated speech amplitude squared to the noise power. The noise estimation uses a VAD from equation (4) in [6] and recursively updates the noise spectrum only in frames that are classified as noise-only. If convenient, you can call v_specsub in chunks of arbitrary size. Thus the following are equivalent: (a) y=v_ssubmmse(s,fs); (b) [y1,z]=v_ssubmmse(s(1:1000),fs); [y2,z]=v_ssubmmse(s(1001:2000),z); y3=v_ssubmmse(s(2001:end),z); y=[y1; y2; y3]; If the number of output arguments is either 2 or 5, the last partial frame of samples will be retained for overlap adding with the output from the next call to v_ssubmmse(). See also v_specsub() for an alternative gain function Refs: [1] Ephraim, Y. & Malah, D. Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator IEEE Trans Acoustics Speech and Signal Processing, 32(6):1109-1121, Dec 1984 [2] Ephraim, Y. & Malah, D. Speech enhancement using a minimum mean-square error log-spectral amplitude estimator IEEE Trans Acoustics Speech and Signal Processing, 33(2):443-445, Apr 1985 [3] Rainer Martin. Noise power spectral density estimation based on optimal smoothing and minimum statistics. IEEE Trans. Speech and Audio Processing, 9(5):504-512, July 2001. [4] O. Cappe. Elimination of the musical noise phenomenon with the ephraim and malah noise suppressor. IEEE Trans Speech Audio Processing, 2 (2): 345-349, Apr. 1994. doi: 10.1109/89.279283. [5] J. Erkelens, J. Jensen, and R. Heusdens. A data-driven approach to optimizing spectral speech enhancement methods for various error criteria. Speech Communication, 49: 530-541, 2007. doi: 10.1016/j.specom.2006.06.012. [6] J. Sohn, N. S. Kim, and W. Sung. A statistical model-based voice activity detection. IEEE Signal Processing Lett., 6 (1): 1-3, 1999. doi: 10.1109/97.736233. [7] Loizou, P. Speech enhancement based on perceptually motivated Bayesian estimators of the speech magnitude spectrum. IEEE Trans. Speech and Audio Processing, 13(5), 857-869, 2005.
0001 function [ss,gg,tt,ff,zo]=v_ssubmmsev(si,fsz,pp) 0002 %V_SSUBMMSE performs speech enhancement using mmse estimate of spectral amplitude or log amplitude [SS,ZO]=(S,FSZ,P) 0003 % 0004 % Usage: y=v_ssubmmsev(x,fs); % enhance the speech using default parameters 0005 % 0006 % Inputs: 0007 % si input speech signal 0008 % fsz sample frequency in Hz 0009 % Alternatively, the input state from a previous call (see below) 0010 % pp algorithm parameters [optional] 0011 % 0012 % Outputs: 0013 % ss output enhanced speech 0014 % gg(t,f,i) selected time-frequency values (see pp.tf below) 0015 % tt centre of frames (in seconds) 0016 % ff centre of frequency bins (in Hz) 0017 % zo output state (or the 2nd argument if gg,tt,ff are omitted) 0018 % 0019 % The algorithm operation is controlled by a small number of parameters: 0020 % 0021 % pp.of % overlap factor = (fft length)/(frame increment) [2] 0022 % pp.ti % desired frame increment [0.016 seconds] 0023 % pp.ri % set to 1 to round ti to the nearest power of 2 samples [0] 0024 % pp.ta % time const for smoothing SNR estimate [0.396 seconds] 0025 % pp.gx % maximum posterior SNR as a power ratio [1000 = +30dB] 0026 % pp.gn % min posterior SNR as a power ratio when estimating prior SNR [1 = 0dB] 0027 % pp.gz % min posterior SNR as a power ratio [0.001 = -30dB] 0028 % pp.xn % minimum prior SNR [0] 0029 % pp.xb % bias compensation factor for prior SNR [1] 0030 % pp.lg % MMSE target: 0=amplitude, 1=log amplitude, 2=perceptual Bayes [1] 0031 % pp.tn; % smoothing time constant for noise estimation [0.5 s] 0032 % pp.le; % VAD threshold: log(p/(1-p)) where p is speech prob in a freq bin; use -Inf to prevent updating [0.15 (=>p=0.54)] 0033 % pp.tx; % initial noise interval [0.06 s] 0034 % pp.ne % noise estimation: 0=min statistics, 1=MMSE [0] 0035 % pp.bt % threshold for binary gain or -1 for continuous gain [-1] 0036 % pp.mx % input mixture gain [0] 0037 % pp.rf % round output signal to an exact number of frames [0] 0038 % pp.tf % selects time-frequency planes to output in the gg() variable ['g'] 0039 % 'i' = input power spectrum 0040 % 'I' = input complex spectrum 0041 % 'n' = noise power spectrum 0042 % 'z' = "posterior" SNR (i.e. (S+N)/N ) 0043 % 'x' = "prior" SNR (i.e. S/N ) 0044 % 'g' = gain 0045 % 'o' = output power spectrum 0046 % 'O' = output complex spectrum 0047 % 0048 % The applied gain is mx+(1-mx)*optgain where optgain is calculated according to [1] or [2]. 0049 % If pp.bt>=0 then optgain is first thresholded with pp.bt to produce a binary gain 0 or 1. 0050 % 0051 % The default parameters implement the original algorithm in [1,2]. 0052 % 0053 % Several parameters relate to the estimation of xi, the so-called "prior SNR", 0054 % 0055 % xi=max(a*pp.xb*xu+(1-a)*max(gami-1,pp.gn-1),pp.xn); 0056 % 0057 % This is estimated as a smoothed version of 1 less than gami, the "posterior SNR" 0058 % which is the noisy speech power divided by the noise power. This is 0059 % clipped to a min of (pp.gn-1), smoothed using a factor "a" which corresponds to a 0060 % time-constant of pp.ta and then clipped to a minimum of pp.xn. The 0061 % previous value is taken to be pp.xb*xu where xu is the ratio of the 0062 % estimated speech amplitude squared to the noise power. 0063 % 0064 % The noise estimation uses a VAD from equation (4) in [6] and recursively updates 0065 % the noise spectrum only in frames that are classified as noise-only. 0066 % 0067 % If convenient, you can call v_specsub in chunks of arbitrary size. Thus the following are equivalent: 0068 % 0069 % (a) y=v_ssubmmse(s,fs); 0070 % 0071 % (b) [y1,z]=v_ssubmmse(s(1:1000),fs); 0072 % [y2,z]=v_ssubmmse(s(1001:2000),z); 0073 % y3=v_ssubmmse(s(2001:end),z); 0074 % y=[y1; y2; y3]; 0075 % 0076 % If the number of output arguments is either 2 or 5, the last partial frame of samples will 0077 % be retained for overlap adding with the output from the next call to v_ssubmmse(). 0078 % 0079 % See also v_specsub() for an alternative gain function 0080 % 0081 % Refs: 0082 % [1] Ephraim, Y. & Malah, D. 0083 % Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator 0084 % IEEE Trans Acoustics Speech and Signal Processing, 32(6):1109-1121, Dec 1984 0085 % [2] Ephraim, Y. & Malah, D. 0086 % Speech enhancement using a minimum mean-square error log-spectral amplitude estimator 0087 % IEEE Trans Acoustics Speech and Signal Processing, 33(2):443-445, Apr 1985 0088 % [3] Rainer Martin. 0089 % Noise power spectral density estimation based on optimal smoothing and minimum statistics. 0090 % IEEE Trans. Speech and Audio Processing, 9(5):504-512, July 2001. 0091 % [4] O. Cappe. 0092 % Elimination of the musical noise phenomenon with the ephraim and malah noise suppressor. 0093 % IEEE Trans Speech Audio Processing, 2 (2): 345-349, Apr. 1994. doi: 10.1109/89.279283. 0094 % [5] J. Erkelens, J. Jensen, and R. Heusdens. 0095 % A data-driven approach to optimizing spectral speech enhancement methods for various error criteria. 0096 % Speech Communication, 49: 530-541, 2007. doi: 10.1016/j.specom.2006.06.012. 0097 % [6] J. Sohn, N. S. Kim, and W. Sung. 0098 % A statistical model-based voice activity detection. 0099 % IEEE Signal Processing Lett., 6 (1): 1-3, 1999. doi: 10.1109/97.736233. 0100 % [7] Loizou, P. 0101 % Speech enhancement based on perceptually motivated Bayesian estimators of the speech magnitude spectrum. 0102 % IEEE Trans. Speech and Audio Processing, 13(5), 857-869, 2005. 0103 0104 % Bugs/suggestions: 0105 % (1) sort out behaviour when si() is a matrix rather than a vector 0106 % 0107 % Copyright (C) Mike Brookes 2004-2011 0108 % Version: $Id: v_ssubmmsev.m 10865 2018-09-21 17:22:45Z dmb $ 0109 % 0110 % VOICEBOX is a MATLAB toolbox for speech processing. 0111 % Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html 0112 % 0113 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0114 % This program is free software; you can redistribute it and/or modify 0115 % it under the terms of the GNU General Public License as published by 0116 % the Free Software Foundation; either version 2 of the License, or 0117 % (at your option) any later version. 0118 % 0119 % This program is distributed in the hope that it will be useful, 0120 % but WITHOUT ANY WARRANTY; without even the implied warranty of 0121 % MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 0122 % GNU General Public License for more details. 0123 % 0124 % You can obtain a copy of the GNU General Public License from 0125 % http://www.gnu.org/copyleft/gpl.html or by writing to 0126 % Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA. 0127 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0128 persistent kk cc 0129 if ~numel(kk) 0130 kk=sqrt(2*pi); % sqrt(8)*Gamma(1.5) - required constant 0131 cc=sqrt(2/pi); %sqrt(2)/Gamma(0.5) 0132 end 0133 if numel(si)>length(si) 0134 error('Input speech signal must be a vector not a matrix'); 0135 end 0136 if isstruct(fsz) 0137 fs=fsz.fs; 0138 qq=fsz.qq; 0139 qp=fsz.qp; 0140 ze=fsz.ze; 0141 s=zeros(length(fsz.si)+length(si(:)),1); % allocate space for speech 0142 s(1:length(fsz.si))=fsz.si; 0143 s(length(fsz.si)+1:end)=si(:); 0144 else 0145 fs=fsz; % sample frequency 0146 s=si(:); 0147 % default algorithm constants 0148 0149 qq.of=2; % overlap factor = (fft length)/(frame increment) 0150 qq.ti=16e-3; % desired frame increment (16 ms) 0151 qq.ri=0; % round ni to the nearest power of 2 0152 qq.ta=0.396; % Time const for smoothing SNR estimate = -tinc/log(0.98) from [1] 0153 qq.gx=1000; % maximum posterior SNR = 30dB 0154 qq.gn=1; % min posterior SNR as a power ratio when estimating prior SNR [1] 0155 qq.gz=0.001; % min posterior SNR as a power ratio [0.001 = -30dB] 0156 qq.xn=0; % minimum prior SNR = -Inf dB 0157 qq.xb=1; % bias compensation factor for prior SNR [1] 0158 qq.lg=1; % use log-domain estimator by default 0159 qq.ne=0; % noise estimation: 0=min statistics, 1=MMSE [0] 0160 qq.bt=-1; % suppress binary masking 0161 qq.mx=0; % no input mixing 0162 qq.tf='g'; % output the gain time-frequency plane by default 0163 qq.rf=0; 0164 qq.tn=0.5; % smoothing constant for noise estimation [500 ms] 0165 qq.le=0.15; % VAD threshold; use -Inf to prevent updating 0166 qq.tx=0.06; % initial noise interval [60 ms] 0167 if nargin>=3 && ~isempty(pp) 0168 qp=pp; % save for v_estnoisem call 0169 qqn=fieldnames(qq); 0170 for i=1:length(qqn) 0171 if isfield(pp,qqn{i}) 0172 qq.(qqn{i})=pp.(qqn{i}); 0173 end 0174 end 0175 else 0176 qp=struct; % make an empty structure 0177 end 0178 end 0179 % derived algorithm constants 0180 if qq.ri 0181 ni=pow2(nextpow2(ti*fs*sqrt(0.5))); 0182 else 0183 ni=round(qq.ti*fs); % frame increment in samples 0184 end 0185 tinc=ni/fs; % true frame increment time 0186 a=exp(-tinc/qq.ta); % SNR smoothing coefficient 0187 gx=qq.gx; % max posterior SNR as a power ratio 0188 gz=qq.gz; % min posterior SNR as a power ratio 0189 xn=qq.xn; % floor for prior SNR, xi 0190 ne=qq.ne; % noise estimation: 0=min statistics, 1=MMSE [0] 0191 gn1=max(qq.gn-1,0); % floor for posterior SNR when estimating prior SNR 0192 le=qq.le; 0193 xb=qq.xb; 0194 tf=qq.tf; 0195 rf=qq.rf || nargout==2 || nargout==5; % round down to an exact number of frames 0196 nd=max(1,round(qq.tx/tinc)); % number of frames for initial noise estimate 0197 an=exp(-tinc/qq.tn); % Noise spectrum smoothing coefficient 0198 0199 % calculate power spectrum in frames 0200 0201 no=round(qq.of); % integer overlap factor 0202 nf=ni*no; % fft length 0203 w=sqrt(hamming(nf+1))'; w(end)=[]; % for now always use sqrt hamming window 0204 w=w/sqrt(sum(w(1:ni:nf).^2)); % normalize to give overall gain of 1 0205 if rf>0 0206 rfm=''; % truncated input to an exact number of frames 0207 else 0208 rfm='r'; 0209 end 0210 [y,tt]=v_enframe(s,w,ni,rfm); 0211 tt=tt/fs; % frame times 0212 yf=v_rfft(y,nf,2); 0213 yp=yf.*conj(yf); % power spectrum of input speech 0214 [nr,nf2]=size(yp); % number of frames 0215 ff=(0:nf2-1)*fs/nf; 0216 if isstruct(fsz) 0217 ndp=fsz.ndp; 0218 dpi=fsz.dpi; 0219 ssv=fsz.ssv; 0220 xu=fsz.xu; % saved unsmoothed SNR 0221 else 0222 dpi=zeros(1,nf2); % noise estimate 0223 ndp=0; % noise estimate based on ndp frames 0224 ssv=zeros(ni*(no-1),1); % dummy saved overlap 0225 xu=1; % dummy unsmoothed SNR from previous frame 0226 end 0227 if ~nr % no data frames 0228 ss=[]; 0229 gg=[]; 0230 else 0231 if ndp<nd 0232 ndx=min(nr,nd-ndp); % number of frames to use 0233 dpi=ndp/(ndp+ndx)*dpi+sum(yp(1:ndx,:),1)/(ndp+ndx); 0234 ndp=ndp+ndx; 0235 end 0236 g=zeros(nr,nf2); % create space for gain matrix 0237 x=zeros(nr,nf2); % create space for prior SNR 0238 dp=zeros(nr,nf2); % create space for noise power spectrum estimate 0239 switch qq.lg 0240 case 0 % use amplitude domain estimator from [1] 0241 for i=1:nr 0242 ypi=yp(i,:); 0243 gami=max(min(ypi./dpi,gx),gz); % gamma = posterior SNR 0244 xi=max(a*xb*xu+(1-a)*max(gami-1,gn1),xn); % prior SNR 0245 if sum(gami.*xi./(1+xi)-log(1+xi))<le*nf2 % noise frame 0246 dpi=dpi*an+(1-an)*ypi; 0247 end 0248 dp(i,:)=dpi; % only required if noise estimate output is requested 0249 v=0.5*xi.*gami./(1+xi); % note that this is 0.5*vk in [1] 0250 gi=(0.277+2*v)./gami; % accurate to 0.02 dB for v>0.5 0251 mv=v<0.5; 0252 if any(mv) 0253 vmv=v(mv); 0254 gi(mv)=kk*sqrt(vmv).*((0.5+vmv).*besseli(0,vmv)+vmv.*besseli(1,vmv))./(gami(mv).*exp(vmv)); 0255 end 0256 g(i,:)=gi; % save gain for later 0257 x(i,:)=xi; % save prior SNR 0258 xu=gami.*gi.^2; % unsmoothed prior SNR 0259 end 0260 case 2 % perceptually motivated estimator from [7] 0261 for i=1:nr 0262 ypi=yp(i,:); 0263 gami=max(min(ypi./dpi,gx),gz); % gamma = posterior SNR 0264 xi=max(a*xb*xu+(1-a)*max(gami-1,gn1),xn); % prior SNR 0265 if sum(gami.*xi./(1+xi)-log(1+xi))<le*nf2 % noise frame 0266 dpi=dpi*an+(1-an)*ypi; 0267 end 0268 v=0.5*xi.*gami./(1+xi); % note that this is 0.5*vk in [7] 0269 gi=cc*sqrt(v).*exp(v)./(gami.*besseli(0,v)); 0270 g(i,:)=gi; % save gain for later 0271 x(i,:)=xi; % save prior SNR 0272 xu=gami.*gi.^2; % unsmoothed prior SNR 0273 end 0274 otherwise % use log domain estimator from [2] 0275 for i=1:nr 0276 ypi=yp(i,:); 0277 gami=max(min(ypi./dpi,gx),gz); % gamma = posterior SNR 0278 xi=max(a*xb*xu+(1-a)*max(gami-1,gn1),xn); % prior SNR 0279 xir=xi./(1+xi); 0280 if sum(gami.*xir-log(1+xi))<le*nf2 % noise frame 0281 dpi=dpi*an+(1-an)*ypi; 0282 end 0283 gi=xir.*exp(0.5*expint(xir.*gami)); 0284 g(i,:)=gi; % save gain for later 0285 x(i,:)=xi; % save prior SNR 0286 xu=gami.*gi.^2; % unsmoothed prior SNR 0287 end 0288 end 0289 if qq.bt>=0 0290 g=g>qq.bt; 0291 end 0292 g=qq.mx+(1-qq.mx)*g; % mix in some of the input 0293 se=(v_irfft((yf.*g).',nf).').*repmat(w,nr,1); % inverse dft and apply output window 0294 ss=zeros(ni*(nr+no-1),no); % space for overlapped output speech 0295 ss(1:ni*(no-1),end)=ssv; 0296 for i=1:no 0297 nm=nf*(1+floor((nr-i)/no)); % number of samples in this set 0298 ss(1+(i-1)*ni:nm+(i-1)*ni,i)=reshape(se(i:no:nr,:)',nm,1); 0299 end 0300 ss=sum(ss,2); 0301 if nargout>2 && ~isempty(tf) 0302 gg=zeros(nr,nf2,length(tf)); % make space 0303 for i=1:length(tf) 0304 switch tf(i) 0305 case 'i' % 'i' = input power spectrum 0306 gg(:,:,i)=yp; 0307 case 'I' % 'i' = input power spectrum 0308 gg(:,:,i)=yf; 0309 case 'n' % 'n' = noise power spectrum 0310 gg(:,:,i)=dp; 0311 case 'z' % 'z' = posterior SNR (i.e. (S+N)/N ) 0312 gg(:,:,i)=gam; 0313 case 'x' % 'x' = prior SNR 0314 gg(:,:,i)=x; 0315 case 'g' % 'g' = gain 0316 gg(:,:,i)=g; 0317 case 'o' % 'o' = output power spectrum 0318 gg(:,:,i)=yp.*g.^2; 0319 case 'O' % 'o' = output power spectrum 0320 gg(:,:,i)=yf.*g; 0321 end 0322 end 0323 end 0324 end 0325 if nargout==2 || nargout==5 0326 if nr 0327 zo.ssv=ss(end-ni*(no-1)+1:end); % save the output tail for next time 0328 ss(end-ni*(no-1)+1:end)=[]; % only output the frames that are completed 0329 else 0330 zo.ssv=ssv; % 0331 end 0332 zo.si=s(length(ss)+1:end); % save the tail end of the input speech signal 0333 zo.fs=fs; % save sample frequency 0334 zo.qq=qq; % save local parameters 0335 zo.qp=qp; % save v_estnoisem parameters 0336 zo.xu=xu; 0337 zo.dpi=dpi; 0338 zo.ndp=ndp; 0339 if nargout==2 0340 gg=zo; % 2nd of two arguments is zo 0341 end 0342 elseif rf==0 0343 ss=ss(1:length(s)); % trim to the correct length if not an exact number of frames 0344 end 0345 if ~nargout && nr>0 0346 ffax=ff/1000; 0347 ax=zeros(4,1); 0348 ax(1)=subplot(223); 0349 imagesc(tt,ffax,20*log10(g)'); 0350 colorbar; 0351 axis('xy'); 0352 title(sprintf('Filter Gain (dB): ta=%.2g',qq.ta)); 0353 xlabel('Time (s)'); 0354 ylabel('Frequency (kHz)'); 0355 0356 ax(2)=subplot(222); 0357 imagesc(tt,ffax,10*log10(yp)'); 0358 colorbar; 0359 axis('xy'); 0360 title('Noisy Speech (dB)'); 0361 xlabel('Time (s)'); 0362 ylabel('Frequency (kHz)'); 0363 0364 ax(3)=subplot(224); 0365 imagesc(tt,ffax,10*log10(yp.*g.^2)'); 0366 colorbar; 0367 axis('xy'); 0368 title('Enhanced Speech (dB)'); 0369 xlabel('Time (s)'); 0370 ylabel('Frequency (kHz)'); 0371 0372 ax(4)=subplot(221); 0373 imagesc(tt,ffax,10*log10(dp)'); 0374 colorbar; 0375 axis('xy'); 0376 title('Noise Estimate (dB)'); 0377 xlabel('Time (s)'); 0378 ylabel('Frequency (kHz)'); 0379 linkaxes(ax); 0380 end