V_SPECSUBM obsolete speech enhancement algorithm - use v_specsub instead implementation of spectral subtraction algorithm by R Martin (rather slow) algorithm parameters: t* in seconds, f* in Hz, k* dimensionless 1: tg = smoothing time constant for signal power estimate (0.04): high=reverberant, low=musical 2: ta = smoothing time constant for signal power estimate used in noise estimation (0.1) 3: tw = fft window length (will be rounded up to 2^nw samples) 4: tm = length of minimum filter (1.5): high=slow response to noise increase, low=distortion 5: to = time constant for oversubtraction factor (0.08) 6: fo = oversubtraction corner frequency (800): high=distortion, low=musical 7: km = number of minimisation buffers to use (4): high=waste memory, low=noise modulation 8: ks = oversampling constant (4) 9: kn = noise estimate compensation (1.5) 10:kf = subtraction floor (0.02): high=noisy, low=musical 11:ko = oversubtraction scale factor (4): high=distortion, low=musical Refs: (a) R. Martin. Spectral subtraction based on minimum statistics. In Proc EUSIPCO, pages 1182-1185, Edinburgh, Sept 1994. (b) R. Martin. Noise power spectral density estimation based on optimal smoothing and minimum statistics. IEEE Trans. Speech and Audio Processing, 9(5):504-512, July 2001.
0001 function [ss,po]=v_specsubm(s,fs,p) 0002 %V_SPECSUBM obsolete speech enhancement algorithm - use v_specsub instead 0003 % 0004 % implementation of spectral subtraction algorithm by R Martin (rather slow) 0005 % algorithm parameters: t* in seconds, f* in Hz, k* dimensionless 0006 % 1: tg = smoothing time constant for signal power estimate (0.04): high=reverberant, low=musical 0007 % 2: ta = smoothing time constant for signal power estimate 0008 % used in noise estimation (0.1) 0009 % 3: tw = fft window length (will be rounded up to 2^nw samples) 0010 % 4: tm = length of minimum filter (1.5): high=slow response to noise increase, low=distortion 0011 % 5: to = time constant for oversubtraction factor (0.08) 0012 % 6: fo = oversubtraction corner frequency (800): high=distortion, low=musical 0013 % 7: km = number of minimisation buffers to use (4): high=waste memory, low=noise modulation 0014 % 8: ks = oversampling constant (4) 0015 % 9: kn = noise estimate compensation (1.5) 0016 % 10:kf = subtraction floor (0.02): high=noisy, low=musical 0017 % 11:ko = oversubtraction scale factor (4): high=distortion, low=musical 0018 % 0019 % Refs: 0020 % (a) R. Martin. Spectral subtraction based on minimum statistics. In Proc EUSIPCO, pages 1182-1185, Edinburgh, Sept 1994. 0021 % (b) R. Martin. Noise power spectral density estimation based on optimal smoothing and minimum statistics. 0022 % IEEE Trans. Speech and Audio Processing, 9(5):504-512, July 2001. 0023 0024 0025 % Copyright (C) Mike Brookes 2004 0026 % Version: $Id: v_specsubm.m 10865 2018-09-21 17:22:45Z dmb $ 0027 % 0028 % VOICEBOX is a MATLAB toolbox for speech processing. 0029 % Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html 0030 % 0031 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0032 % This program is free software; you can redistribute it and/or modify 0033 % it under the terms of the GNU General Public License as published by 0034 % the Free Software Foundation; either version 2 of the License, or 0035 % (at your option) any later version. 0036 % 0037 % This program is distributed in the hope that it will be useful, 0038 % but WITHOUT ANY WARRANTY; without even the implied warranty of 0039 % MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 0040 % GNU General Public License for more details. 0041 % 0042 % You can obtain a copy of the GNU General Public License from 0043 % http://www.gnu.org/copyleft/gpl.html or by writing to 0044 % Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA. 0045 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0046 0047 if nargin<3 po=[0.04 0.1 0.032 1.5 0.08 400 4 4 1.5 0.02 4].'; else po=p; end 0048 ns=length(s); 0049 ts=1/fs; 0050 ss=zeros(ns,1); 0051 0052 ni=pow2(nextpow2(fs*po(3)/po(8))); 0053 ti=ni/fs; 0054 nw=ni*po(8); 0055 nf=1+floor((ns-nw)/ni); 0056 nm=ceil(fs*po(4)/(ni*po(7))); 0057 0058 win=0.5*hamming(nw+1)/1.08;win(end)=[]; 0059 zg=exp(-ti/po(1)); 0060 za=exp(-ti/po(2)); 0061 zo=exp(-ti/po(5)); 0062 0063 px=zeros(1+nw/2,1); 0064 pxn=px; 0065 os=px; 0066 mb=ones(1+nw/2,po(7))*nw/2; 0067 im=0; 0068 osf=po(11)*(1+(0:nw/2).'*fs/(nw*po(6))).^(-1); 0069 0070 imidx=[13 21]'; 0071 x2im=zeros(length(imidx),nf); 0072 osim=x2im; 0073 pnim=x2im; 0074 pxnim=x2im; 0075 qim=x2im; 0076 0077 for is=1:nf 0078 idx=(1:nw)+(is-1)*ni; 0079 x=v_rfft(s(idx).*win); 0080 x2=x.*conj(x); 0081 0082 pxn=za*pxn+(1-za)*x2; 0083 im=rem(im+1,nm); 0084 if im 0085 mb(:,1)=min(mb(:,1),pxn); 0086 else 0087 mb=[pxn,mb(:,1:po(7)-1)]; 0088 end 0089 pn=po(9)*min(mb,[],2); 0090 %os= oversubtraction factor 0091 os=zo*os+(1-zo)*(1+osf.*pn./(pn+pxn)); 0092 0093 px=zg*px+(1-zg)*x2; 0094 q=max(po(10)*sqrt(pn./x2),1-sqrt(os.*pn./px)); 0095 ss(idx)=ss(idx)+v_irfft(x.*q); 0096 0097 end 0098 if nargout==0 0099 soundsc([s; ss],fs); 0100 end 0101 0102 0103 0104