v_ssubmmsev

PURPOSE ^

V_SSUBMMSE performs speech enhancement using mmse estimate of spectral amplitude or log amplitude [SS,ZO]=(S,FSZ,P)

SYNOPSIS ^

function [ss,gg,tt,ff,zo]=v_ssubmmsev(si,fsz,pp)

DESCRIPTION ^

V_SSUBMMSE performs speech enhancement using mmse estimate of spectral amplitude or log amplitude [SS,ZO]=(S,FSZ,P)

 Usage: y=v_ssubmmsev(x,fs);   % enhance the speech using default parameters

 Inputs:
   si      input speech signal
   fsz     sample frequency in Hz
           Alternatively, the input state from a previous call (see below)
   pp      algorithm parameters [optional]

 Outputs:
   ss        output enhanced speech
   gg(t,f,i) selected time-frequency values (see pp.tf below)
   tt        centre of frames (in seconds)
   ff        centre of frequency bins (in Hz)
   zo        output state (or the 2nd argument if gg,tt,ff are omitted)

 The algorithm operation is controlled by a small number of parameters:

        pp.of          % overlap factor = (fft length)/(frame increment) [2]
        pp.ti          % desired frame increment [0.016 seconds]
        pp.ri          % set to 1 to round ti to the nearest power of 2 samples [0]
        pp.ta          % time const for smoothing SNR estimate [0.396 seconds]
        pp.gx          % maximum posterior SNR as a power ratio [1000 = +30dB]
        pp.gn          % min posterior SNR as a power ratio when estimating prior SNR [1 = 0dB]
        pp.gz          % min posterior SNR as a power ratio [0.001 = -30dB]
        pp.xn          % minimum prior SNR [0]
        pp.xb          % bias compensation factor for prior SNR [1]
        pp.lg          % MMSE target: 0=amplitude, 1=log amplitude, 2=perceptual Bayes [1]
        pp.tn;         % smoothing time constant for noise estimation [0.5 s]
        pp.le;         % VAD threshold: log(p/(1-p)) where p is speech prob in a freq bin; use -Inf to prevent updating [0.15 (=>p=0.54)]
        pp.tx;         % initial noise interval [0.06 s]
        pp.ne          % noise estimation: 0=min statistics, 1=MMSE [0]
        pp.bt          % threshold for binary gain or -1 for continuous gain [-1]
        pp.mx          % input mixture gain [0]
        pp.rf          % round output signal to an exact number of frames [0]
        pp.tf          % selects time-frequency planes to output in the gg() variable ['g']
                           'i' = input power spectrum
                           'I' = input complex spectrum
                           'n' = noise power spectrum
                           'z' = "posterior" SNR (i.e. (S+N)/N )
                           'x' = "prior" SNR (i.e. S/N )
                           'g' = gain
                           'o' = output power spectrum
                           'O' = output complex spectrum

 The applied gain is mx+(1-mx)*optgain where optgain is calculated according to [1] or [2].
 If pp.bt>=0 then optgain is first thresholded with pp.bt to produce a binary gain 0 or 1.

 The default parameters implement the original algorithm in [1,2].

 Several parameters relate to the estimation of xi, the so-called "prior SNR",

             xi=max(a*pp.xb*xu+(1-a)*max(gami-1,pp.gn-1),pp.xn);

 This is estimated as a smoothed version of 1 less than gami, the "posterior SNR"
 which is the noisy speech power divided by the noise power. This is
 clipped to a min of (pp.gn-1), smoothed using a factor "a" which corresponds to a
 time-constant of pp.ta and then clipped to a minimum of pp.xn. The
 previous value is taken to be pp.xb*xu where xu is the ratio of the
 estimated speech amplitude squared to the noise power.

 The noise estimation uses a VAD from equation (4) in [6] and recursively updates 
 the noise spectrum only in frames that are classified as noise-only.

 If convenient, you can call v_specsub in chunks of arbitrary size. Thus the following are equivalent:

                   (a) y=v_ssubmmse(s,fs);

                   (b) [y1,z]=v_ssubmmse(s(1:1000),fs);
                       [y2,z]=v_ssubmmse(s(1001:2000),z);
                       y3=v_ssubmmse(s(2001:end),z);
                       y=[y1; y2; y3];

 If the number of output arguments is either 2 or 5, the last partial frame of samples will
 be retained for overlap adding with the output from the next call to v_ssubmmse().

 See also v_specsub() for an alternative gain function

 Refs:
    [1] Ephraim, Y. & Malah, D.
        Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator
        IEEE Trans Acoustics Speech and Signal Processing, 32(6):1109-1121, Dec 1984
    [2] Ephraim, Y. & Malah, D.
        Speech enhancement using a minimum mean-square error log-spectral amplitude estimator
        IEEE Trans Acoustics Speech and Signal Processing, 33(2):443-445, Apr 1985
    [3] Rainer Martin.
        Noise power spectral density estimation based on optimal smoothing and minimum statistics.
        IEEE Trans. Speech and Audio Processing, 9(5):504-512, July 2001.
    [4] O. Cappe.
        Elimination of the musical noise phenomenon with the ephraim and malah noise suppressor.
        IEEE Trans Speech Audio Processing, 2 (2): 345-349, Apr. 1994. doi: 10.1109/89.279283.
    [5] J. Erkelens, J. Jensen, and R. Heusdens.
        A data-driven approach to optimizing spectral speech enhancement methods for various error criteria.
        Speech Communication, 49: 530-541, 2007. doi: 10.1016/j.specom.2006.06.012.
    [6] J. Sohn, N. S. Kim, and W. Sung.
        A statistical model-based voice activity detection.
        IEEE Signal Processing Lett., 6 (1): 1-3, 1999. doi: 10.1109/97.736233.
    [7] Loizou, P.
        Speech enhancement based on perceptually motivated Bayesian estimators of the speech magnitude spectrum.
        IEEE Trans. Speech and Audio Processing, 13(5), 857-869, 2005.

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SOURCE CODE ^

0001 function [ss,gg,tt,ff,zo]=v_ssubmmsev(si,fsz,pp)
0002 %V_SSUBMMSE performs speech enhancement using mmse estimate of spectral amplitude or log amplitude [SS,ZO]=(S,FSZ,P)
0003 %
0004 % Usage: y=v_ssubmmsev(x,fs);   % enhance the speech using default parameters
0005 %
0006 % Inputs:
0007 %   si      input speech signal
0008 %   fsz     sample frequency in Hz
0009 %           Alternatively, the input state from a previous call (see below)
0010 %   pp      algorithm parameters [optional]
0011 %
0012 % Outputs:
0013 %   ss        output enhanced speech
0014 %   gg(t,f,i) selected time-frequency values (see pp.tf below)
0015 %   tt        centre of frames (in seconds)
0016 %   ff        centre of frequency bins (in Hz)
0017 %   zo        output state (or the 2nd argument if gg,tt,ff are omitted)
0018 %
0019 % The algorithm operation is controlled by a small number of parameters:
0020 %
0021 %        pp.of          % overlap factor = (fft length)/(frame increment) [2]
0022 %        pp.ti          % desired frame increment [0.016 seconds]
0023 %        pp.ri          % set to 1 to round ti to the nearest power of 2 samples [0]
0024 %        pp.ta          % time const for smoothing SNR estimate [0.396 seconds]
0025 %        pp.gx          % maximum posterior SNR as a power ratio [1000 = +30dB]
0026 %        pp.gn          % min posterior SNR as a power ratio when estimating prior SNR [1 = 0dB]
0027 %        pp.gz          % min posterior SNR as a power ratio [0.001 = -30dB]
0028 %        pp.xn          % minimum prior SNR [0]
0029 %        pp.xb          % bias compensation factor for prior SNR [1]
0030 %        pp.lg          % MMSE target: 0=amplitude, 1=log amplitude, 2=perceptual Bayes [1]
0031 %        pp.tn;         % smoothing time constant for noise estimation [0.5 s]
0032 %        pp.le;         % VAD threshold: log(p/(1-p)) where p is speech prob in a freq bin; use -Inf to prevent updating [0.15 (=>p=0.54)]
0033 %        pp.tx;         % initial noise interval [0.06 s]
0034 %        pp.ne          % noise estimation: 0=min statistics, 1=MMSE [0]
0035 %        pp.bt          % threshold for binary gain or -1 for continuous gain [-1]
0036 %        pp.mx          % input mixture gain [0]
0037 %        pp.rf          % round output signal to an exact number of frames [0]
0038 %        pp.tf          % selects time-frequency planes to output in the gg() variable ['g']
0039 %                           'i' = input power spectrum
0040 %                           'I' = input complex spectrum
0041 %                           'n' = noise power spectrum
0042 %                           'z' = "posterior" SNR (i.e. (S+N)/N )
0043 %                           'x' = "prior" SNR (i.e. S/N )
0044 %                           'g' = gain
0045 %                           'o' = output power spectrum
0046 %                           'O' = output complex spectrum
0047 %
0048 % The applied gain is mx+(1-mx)*optgain where optgain is calculated according to [1] or [2].
0049 % If pp.bt>=0 then optgain is first thresholded with pp.bt to produce a binary gain 0 or 1.
0050 %
0051 % The default parameters implement the original algorithm in [1,2].
0052 %
0053 % Several parameters relate to the estimation of xi, the so-called "prior SNR",
0054 %
0055 %             xi=max(a*pp.xb*xu+(1-a)*max(gami-1,pp.gn-1),pp.xn);
0056 %
0057 % This is estimated as a smoothed version of 1 less than gami, the "posterior SNR"
0058 % which is the noisy speech power divided by the noise power. This is
0059 % clipped to a min of (pp.gn-1), smoothed using a factor "a" which corresponds to a
0060 % time-constant of pp.ta and then clipped to a minimum of pp.xn. The
0061 % previous value is taken to be pp.xb*xu where xu is the ratio of the
0062 % estimated speech amplitude squared to the noise power.
0063 %
0064 % The noise estimation uses a VAD from equation (4) in [6] and recursively updates
0065 % the noise spectrum only in frames that are classified as noise-only.
0066 %
0067 % If convenient, you can call v_specsub in chunks of arbitrary size. Thus the following are equivalent:
0068 %
0069 %                   (a) y=v_ssubmmse(s,fs);
0070 %
0071 %                   (b) [y1,z]=v_ssubmmse(s(1:1000),fs);
0072 %                       [y2,z]=v_ssubmmse(s(1001:2000),z);
0073 %                       y3=v_ssubmmse(s(2001:end),z);
0074 %                       y=[y1; y2; y3];
0075 %
0076 % If the number of output arguments is either 2 or 5, the last partial frame of samples will
0077 % be retained for overlap adding with the output from the next call to v_ssubmmse().
0078 %
0079 % See also v_specsub() for an alternative gain function
0080 %
0081 % Refs:
0082 %    [1] Ephraim, Y. & Malah, D.
0083 %        Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator
0084 %        IEEE Trans Acoustics Speech and Signal Processing, 32(6):1109-1121, Dec 1984
0085 %    [2] Ephraim, Y. & Malah, D.
0086 %        Speech enhancement using a minimum mean-square error log-spectral amplitude estimator
0087 %        IEEE Trans Acoustics Speech and Signal Processing, 33(2):443-445, Apr 1985
0088 %    [3] Rainer Martin.
0089 %        Noise power spectral density estimation based on optimal smoothing and minimum statistics.
0090 %        IEEE Trans. Speech and Audio Processing, 9(5):504-512, July 2001.
0091 %    [4] O. Cappe.
0092 %        Elimination of the musical noise phenomenon with the ephraim and malah noise suppressor.
0093 %        IEEE Trans Speech Audio Processing, 2 (2): 345-349, Apr. 1994. doi: 10.1109/89.279283.
0094 %    [5] J. Erkelens, J. Jensen, and R. Heusdens.
0095 %        A data-driven approach to optimizing spectral speech enhancement methods for various error criteria.
0096 %        Speech Communication, 49: 530-541, 2007. doi: 10.1016/j.specom.2006.06.012.
0097 %    [6] J. Sohn, N. S. Kim, and W. Sung.
0098 %        A statistical model-based voice activity detection.
0099 %        IEEE Signal Processing Lett., 6 (1): 1-3, 1999. doi: 10.1109/97.736233.
0100 %    [7] Loizou, P.
0101 %        Speech enhancement based on perceptually motivated Bayesian estimators of the speech magnitude spectrum.
0102 %        IEEE Trans. Speech and Audio Processing, 13(5), 857-869, 2005.
0103 
0104 % Bugs/suggestions:
0105 %   (1) sort out behaviour when si() is a matrix rather than a vector
0106 %
0107 %      Copyright (C) Mike Brookes 2004-2011
0108 %      Version: $Id: v_ssubmmsev.m 10865 2018-09-21 17:22:45Z dmb $
0109 %
0110 %   VOICEBOX is a MATLAB toolbox for speech processing.
0111 %   Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
0112 %
0113 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0114 %   This program is free software; you can redistribute it and/or modify
0115 %   it under the terms of the GNU General Public License as published by
0116 %   the Free Software Foundation; either version 2 of the License, or
0117 %   (at your option) any later version.
0118 %
0119 %   This program is distributed in the hope that it will be useful,
0120 %   but WITHOUT ANY WARRANTY; without even the implied warranty of
0121 %   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
0122 %   GNU General Public License for more details.
0123 %
0124 %   You can obtain a copy of the GNU General Public License from
0125 %   http://www.gnu.org/copyleft/gpl.html or by writing to
0126 %   Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
0127 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0128 persistent kk cc
0129 if ~numel(kk)
0130     kk=sqrt(2*pi);      % sqrt(8)*Gamma(1.5) - required constant
0131     cc=sqrt(2/pi);      %sqrt(2)/Gamma(0.5)
0132 end
0133 if numel(si)>length(si)
0134     error('Input speech signal must be a vector not a matrix');
0135 end
0136 if isstruct(fsz)
0137     fs=fsz.fs;
0138     qq=fsz.qq;
0139     qp=fsz.qp;
0140     ze=fsz.ze;
0141     s=zeros(length(fsz.si)+length(si(:)),1); % allocate space for speech
0142     s(1:length(fsz.si))=fsz.si;
0143     s(length(fsz.si)+1:end)=si(:);
0144 else
0145     fs=fsz;     % sample frequency
0146     s=si(:);
0147     % default algorithm constants
0148     
0149     qq.of=2;        % overlap factor = (fft length)/(frame increment)
0150     qq.ti=16e-3;    % desired frame increment (16 ms)
0151     qq.ri=0;        % round ni to the nearest power of 2
0152     qq.ta=0.396;    % Time const for smoothing SNR estimate = -tinc/log(0.98) from [1]
0153     qq.gx=1000;     % maximum posterior SNR = 30dB
0154     qq.gn=1;        % min posterior SNR as a power ratio when estimating prior SNR [1]
0155     qq.gz=0.001;    % min posterior SNR as a power ratio [0.001 = -30dB]
0156     qq.xn=0;        % minimum prior SNR = -Inf dB
0157     qq.xb=1;        % bias compensation factor for prior SNR [1]
0158     qq.lg=1;        % use log-domain estimator by default
0159     qq.ne=0;        % noise estimation: 0=min statistics, 1=MMSE [0]
0160     qq.bt=-1;       % suppress binary masking
0161     qq.mx=0;        % no input mixing
0162     qq.tf='g';      % output the gain time-frequency plane by default
0163     qq.rf=0;
0164     qq.tn=0.5;     % smoothing constant for noise estimation [500 ms]
0165     qq.le=0.15;    % VAD threshold; use -Inf to prevent updating
0166     qq.tx=0.06;    % initial noise interval [60 ms]
0167     if nargin>=3 && ~isempty(pp)
0168         qp=pp;      % save for v_estnoisem call
0169         qqn=fieldnames(qq);
0170         for i=1:length(qqn)
0171             if isfield(pp,qqn{i})
0172                 qq.(qqn{i})=pp.(qqn{i});
0173             end
0174         end
0175     else
0176         qp=struct;  % make an empty structure
0177     end
0178 end
0179 % derived algorithm constants
0180 if qq.ri
0181     ni=pow2(nextpow2(ti*fs*sqrt(0.5)));
0182 else
0183     ni=round(qq.ti*fs);    % frame increment in samples
0184 end
0185 tinc=ni/fs;         % true frame increment time
0186 a=exp(-tinc/qq.ta); % SNR smoothing coefficient
0187 gx=qq.gx;           % max posterior SNR as a power ratio
0188 gz=qq.gz;           % min posterior SNR as a power ratio
0189 xn=qq.xn;           % floor for prior SNR, xi
0190 ne=qq.ne;           % noise estimation: 0=min statistics, 1=MMSE [0]
0191 gn1=max(qq.gn-1,0); % floor for posterior SNR when estimating prior SNR
0192 le=qq.le;
0193 xb=qq.xb;
0194 tf=qq.tf;
0195 rf=qq.rf || nargout==2 || nargout==5;            % round down to an exact number of frames
0196 nd=max(1,round(qq.tx/tinc)); % number of frames for initial noise estimate
0197 an=exp(-tinc/qq.tn); % Noise spectrum smoothing coefficient
0198 
0199 % calculate power spectrum in frames
0200 
0201 no=round(qq.of);                      % integer overlap factor
0202 nf=ni*no;                           % fft length
0203 w=sqrt(hamming(nf+1))'; w(end)=[];  % for now always use sqrt hamming window
0204 w=w/sqrt(sum(w(1:ni:nf).^2));       % normalize to give overall gain of 1
0205 if rf>0
0206     rfm='';                         % truncated input to an exact number of frames
0207 else
0208     rfm='r';
0209 end
0210 [y,tt]=v_enframe(s,w,ni,rfm);
0211 tt=tt/fs;                           % frame times
0212 yf=v_rfft(y,nf,2);
0213 yp=yf.*conj(yf);                    % power spectrum of input speech
0214 [nr,nf2]=size(yp);                  % number of frames
0215 ff=(0:nf2-1)*fs/nf;
0216 if isstruct(fsz)
0217     ndp=fsz.ndp;
0218     dpi=fsz.dpi;
0219     ssv=fsz.ssv;
0220     xu=fsz.xu;                      % saved unsmoothed SNR
0221 else
0222     dpi=zeros(1,nf2);   % noise estimate
0223     ndp=0;              % noise estimate based on ndp frames
0224     ssv=zeros(ni*(no-1),1);            % dummy saved overlap
0225     xu=1;                           % dummy unsmoothed SNR from previous frame
0226 end
0227 if ~nr                                 % no data frames
0228     ss=[];
0229     gg=[];
0230 else
0231     if ndp<nd
0232         ndx=min(nr,nd-ndp);         % number of frames to use
0233         dpi=ndp/(ndp+ndx)*dpi+sum(yp(1:ndx,:),1)/(ndp+ndx);
0234         ndp=ndp+ndx;
0235     end
0236     g=zeros(nr,nf2);                % create space for gain matrix
0237     x=zeros(nr,nf2);                % create space for prior SNR
0238     dp=zeros(nr,nf2);               % create space for noise power spectrum estimate
0239     switch qq.lg
0240         case 0                      % use amplitude domain estimator from [1]
0241             for i=1:nr
0242                 ypi=yp(i,:);
0243                 gami=max(min(ypi./dpi,gx),gz);     % gamma = posterior SNR
0244                 xi=max(a*xb*xu+(1-a)*max(gami-1,gn1),xn);  % prior SNR
0245                 if sum(gami.*xi./(1+xi)-log(1+xi))<le*nf2 % noise frame
0246                     dpi=dpi*an+(1-an)*ypi;
0247                 end
0248                 dp(i,:)=dpi;  % only required if noise estimate output is requested
0249                 v=0.5*xi.*gami./(1+xi);    % note that this is 0.5*vk in [1]
0250                 gi=(0.277+2*v)./gami;     % accurate to 0.02 dB for v>0.5
0251                 mv=v<0.5;
0252                 if any(mv)
0253                     vmv=v(mv);
0254                     gi(mv)=kk*sqrt(vmv).*((0.5+vmv).*besseli(0,vmv)+vmv.*besseli(1,vmv))./(gami(mv).*exp(vmv));
0255                 end
0256                 g(i,:)=gi;              % save gain for later
0257                 x(i,:)=xi;              % save prior SNR
0258                 xu=gami.*gi.^2;         % unsmoothed prior SNR
0259             end
0260         case 2                          % perceptually motivated estimator from [7]
0261             for i=1:nr
0262                 ypi=yp(i,:);
0263                 gami=max(min(ypi./dpi,gx),gz);     % gamma = posterior SNR
0264                 xi=max(a*xb*xu+(1-a)*max(gami-1,gn1),xn);  % prior SNR
0265                 if sum(gami.*xi./(1+xi)-log(1+xi))<le*nf2 % noise frame
0266                     dpi=dpi*an+(1-an)*ypi;
0267                 end
0268                 v=0.5*xi.*gami./(1+xi);    % note that this is 0.5*vk in [7]
0269                 gi=cc*sqrt(v).*exp(v)./(gami.*besseli(0,v));
0270                 g(i,:)=gi;              % save gain for later
0271                 x(i,:)=xi;              % save prior SNR
0272                 xu=gami.*gi.^2;         % unsmoothed prior SNR
0273             end
0274         otherwise                       % use log domain estimator from [2]
0275             for i=1:nr
0276                 ypi=yp(i,:);
0277                 gami=max(min(ypi./dpi,gx),gz);     % gamma = posterior SNR
0278                 xi=max(a*xb*xu+(1-a)*max(gami-1,gn1),xn);  % prior SNR
0279                 xir=xi./(1+xi);
0280                 if sum(gami.*xir-log(1+xi))<le*nf2 % noise frame
0281                     dpi=dpi*an+(1-an)*ypi;
0282                 end
0283                 gi=xir.*exp(0.5*expint(xir.*gami));
0284                 g(i,:)=gi;                 % save gain for later
0285                 x(i,:)=xi;              % save prior SNR
0286                 xu=gami.*gi.^2;         % unsmoothed prior SNR
0287             end
0288     end
0289     if qq.bt>=0
0290         g=g>qq.bt;
0291     end
0292     g=qq.mx+(1-qq.mx)*g;                    % mix in some of the input
0293     se=(v_irfft((yf.*g).',nf).').*repmat(w,nr,1);     % inverse dft and apply output window
0294     ss=zeros(ni*(nr+no-1),no);                      % space for overlapped output speech
0295     ss(1:ni*(no-1),end)=ssv;
0296     for i=1:no
0297         nm=nf*(1+floor((nr-i)/no));         % number of samples in this set
0298         ss(1+(i-1)*ni:nm+(i-1)*ni,i)=reshape(se(i:no:nr,:)',nm,1);
0299     end
0300     ss=sum(ss,2);
0301     if nargout>2 && ~isempty(tf)
0302         gg=zeros(nr,nf2,length(tf));  % make space
0303         for i=1:length(tf)
0304             switch tf(i)
0305                 case 'i'            % 'i' = input power spectrum
0306                     gg(:,:,i)=yp;
0307                 case 'I'            % 'i' = input power spectrum
0308                     gg(:,:,i)=yf;
0309                 case 'n'            % 'n' = noise power spectrum
0310                     gg(:,:,i)=dp;
0311                 case 'z'            % 'z' = posterior SNR (i.e. (S+N)/N )
0312                     gg(:,:,i)=gam;
0313                 case 'x'            % 'x' = prior SNR
0314                     gg(:,:,i)=x;
0315                 case 'g'            % 'g' = gain
0316                     gg(:,:,i)=g;
0317                 case 'o'            % 'o' = output power spectrum
0318                     gg(:,:,i)=yp.*g.^2;
0319                 case 'O'            % 'o' = output power spectrum
0320                     gg(:,:,i)=yf.*g;
0321             end
0322         end
0323     end
0324 end
0325 if nargout==2 || nargout==5
0326     if nr
0327         zo.ssv=ss(end-ni*(no-1)+1:end);    % save the output tail for next time
0328         ss(end-ni*(no-1)+1:end)=[];        % only output the frames that are completed
0329     else
0330         zo.ssv=ssv;  %
0331     end
0332     zo.si=s(length(ss)+1:end);      % save the tail end of the input speech signal
0333     zo.fs=fs;                       % save sample frequency
0334     zo.qq=qq;                       % save local parameters
0335     zo.qp=qp;                       % save v_estnoisem parameters
0336     zo.xu=xu;
0337     zo.dpi=dpi;
0338     zo.ndp=ndp;
0339     if nargout==2
0340         gg=zo;                      % 2nd of two arguments is zo
0341     end
0342 elseif rf==0
0343     ss=ss(1:length(s));             % trim to the correct length if not an exact number of frames
0344 end
0345 if ~nargout && nr>0
0346     ffax=ff/1000;
0347     ax=zeros(4,1);
0348     ax(1)=subplot(223);
0349     imagesc(tt,ffax,20*log10(g)');
0350     colorbar;
0351     axis('xy');
0352     title(sprintf('Filter Gain (dB): ta=%.2g',qq.ta));
0353     xlabel('Time (s)');
0354     ylabel('Frequency (kHz)');
0355     
0356     ax(2)=subplot(222);
0357     imagesc(tt,ffax,10*log10(yp)');
0358     colorbar;
0359     axis('xy');
0360     title('Noisy Speech (dB)');
0361     xlabel('Time (s)');
0362     ylabel('Frequency (kHz)');
0363     
0364     ax(3)=subplot(224);
0365     imagesc(tt,ffax,10*log10(yp.*g.^2)');
0366     colorbar;
0367     axis('xy');
0368     title('Enhanced Speech (dB)');
0369     xlabel('Time (s)');
0370     ylabel('Frequency (kHz)');
0371     
0372     ax(4)=subplot(221);
0373     imagesc(tt,ffax,10*log10(dp)');
0374     colorbar;
0375     axis('xy');
0376     title('Noise Estimate (dB)');
0377     xlabel('Time (s)');
0378     ylabel('Frequency (kHz)');
0379     linkaxes(ax);
0380 end

Generated by m2html © 2003