v_snrseg

PURPOSE ^

V_SNRSEG Measure segmental and global SNR [SEG,GLO]=(S,R,FS,M,TF)

SYNOPSIS ^

function [seg,glo]=v_snrseg(s,r,fs,m,tf)

DESCRIPTION ^

V_SNRSEG Measure segmental and global SNR [SEG,GLO]=(S,R,FS,M,TF)

Usage: (1) seg=v_snrseg(s,r,fs);                  % s & r are noisy and clean signal
       (2) seg=v_snrseg(s,r,fs,'wz');             % no VAD or inerpolation used ['Vq' is default]
       (3) [seg,glo]=v_snrseg(s,r,fs,'Vq',0.03);  % 30 ms frames

 Inputs:    s  test signal
            r  reference signal
           fs  sample frequency (Hz)
            m  mode [default = 'Vq']
                 w = No VAD - use whole file
                 v = use sohn VAD to discard silent portions
                 V = use P.56-based VAD to discard silent portions [default]
                 a = A-weight the signals
                 b = weight signals by BS-468
                 q = use quadratic interpolation to remove delays +- 1 sample
                 z = do not do any alignment
                 p = plot results
           tf  frame increment [0.01]

 Outputs: seg = Segmental SNR in dB
          glo = Global SNR in dB (typically 7 dB greater than SNR-seg)

 This function compares a noisy signal, S, with a clean reference, R, and
 computes the segemntal signal-to-noise ratio (SNR) in dB. The signals,
 which must be of the same length, are split into non-overlapping frames
 of length TF (default 10 ms) and the SNR of each frame in dB is calculated.
 The segmental SNR is the average of these values, i.e.
         SEG = mean(10*log10(sum(Ri^2)/sum((Si-Ri)^2))
 where the mean is over frames and the sum runs over one particular frame.
 Two optional modifications can be made to this basic formula:

    (a) Frames are excluded if there is no significant energy in the R
        signal. The idea is to limit the calculation to frames in which
        speech is active. By default, the v_voicebox function "v_activlev" is
        used to detect the inactive frames (the 'V' mode option).

    (b) In each frame independently, the reference signal is shifted by up
        to +- 1 sample to find the alignment than minimizes the noise
        component (S-R)^2. This shifting accounts for small misalignments
        and/or sample frequency differences between the two signals. For
        larger shifts, you can use the v_voicebox function "v_sigalign".
        Accurate alignemnt is especially important at high SNR values.

 If no M argument is specified, both these modifications will be applied;
 this is equivalent to specifying M='Vq'.

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SOURCE CODE ^

0001 function [seg,glo]=v_snrseg(s,r,fs,m,tf)
0002 %V_SNRSEG Measure segmental and global SNR [SEG,GLO]=(S,R,FS,M,TF)
0003 %
0004 %Usage: (1) seg=v_snrseg(s,r,fs);                  % s & r are noisy and clean signal
0005 %       (2) seg=v_snrseg(s,r,fs,'wz');             % no VAD or inerpolation used ['Vq' is default]
0006 %       (3) [seg,glo]=v_snrseg(s,r,fs,'Vq',0.03);  % 30 ms frames
0007 %
0008 % Inputs:    s  test signal
0009 %            r  reference signal
0010 %           fs  sample frequency (Hz)
0011 %            m  mode [default = 'Vq']
0012 %                 w = No VAD - use whole file
0013 %                 v = use sohn VAD to discard silent portions
0014 %                 V = use P.56-based VAD to discard silent portions [default]
0015 %                 a = A-weight the signals
0016 %                 b = weight signals by BS-468
0017 %                 q = use quadratic interpolation to remove delays +- 1 sample
0018 %                 z = do not do any alignment
0019 %                 p = plot results
0020 %           tf  frame increment [0.01]
0021 %
0022 % Outputs: seg = Segmental SNR in dB
0023 %          glo = Global SNR in dB (typically 7 dB greater than SNR-seg)
0024 %
0025 % This function compares a noisy signal, S, with a clean reference, R, and
0026 % computes the segemntal signal-to-noise ratio (SNR) in dB. The signals,
0027 % which must be of the same length, are split into non-overlapping frames
0028 % of length TF (default 10 ms) and the SNR of each frame in dB is calculated.
0029 % The segmental SNR is the average of these values, i.e.
0030 %         SEG = mean(10*log10(sum(Ri^2)/sum((Si-Ri)^2))
0031 % where the mean is over frames and the sum runs over one particular frame.
0032 % Two optional modifications can be made to this basic formula:
0033 %
0034 %    (a) Frames are excluded if there is no significant energy in the R
0035 %        signal. The idea is to limit the calculation to frames in which
0036 %        speech is active. By default, the v_voicebox function "v_activlev" is
0037 %        used to detect the inactive frames (the 'V' mode option).
0038 %
0039 %    (b) In each frame independently, the reference signal is shifted by up
0040 %        to +- 1 sample to find the alignment than minimizes the noise
0041 %        component (S-R)^2. This shifting accounts for small misalignments
0042 %        and/or sample frequency differences between the two signals. For
0043 %        larger shifts, you can use the v_voicebox function "v_sigalign".
0044 %        Accurate alignemnt is especially important at high SNR values.
0045 %
0046 % If no M argument is specified, both these modifications will be applied;
0047 % this is equivalent to specifying M='Vq'.
0048 
0049 % Bugs/suggestions
0050 % (1) Optionally restrict the bandwidth to the smaller of the two
0051 %     bandwidths either with an extra parameter or automatically determined
0052 
0053 %      Copyright (C) Mike Brookes 2011
0054 %      Version: $Id: v_snrseg.m 10865 2018-09-21 17:22:45Z dmb $
0055 %
0056 %   VOICEBOX is a MATLAB toolbox for speech processing.
0057 %   Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
0058 %
0059 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0060 %   This program is free software; you can redistribute it and/or modify
0061 %   it under the terms of the GNU General Public License as published by
0062 %   the Free Software Foundation; either version 2 of the License, or
0063 %   (at your option) any later version.
0064 %
0065 %   This program is distributed in the hope that it will be useful,
0066 %   but WITHOUT ANY WARRANTY; without even the implied warranty of
0067 %   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
0068 %   GNU General Public License for more details.
0069 %
0070 %   You can obtain a copy of the GNU General Public License from
0071 %   http://www.gnu.org/copyleft/gpl.html or by writing to
0072 %   Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
0073 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0074 
0075 if nargin<4 || ~ischar(m)
0076     m='Vq';
0077 end
0078 if nargin<5 || ~numel(tf)
0079     tf=0.01; % default frame length is 10 ms
0080 end
0081 snmax=100;  % clipping limit for SNR
0082 
0083 % filter the input signals if required
0084 
0085 if any(m=='a')  % A-weighting
0086     [b,a]=v_stdspectrum(2,'z',fs);
0087     s=filter(b,a,s);
0088     r=filter(b,a,r);
0089 elseif any(m=='b') %  BS-468 weighting
0090     [b,a]=v_stdspectrum(8,'z',fs);
0091     s=filter(b,a,s);
0092     r=filter(b,a,r);
0093 end
0094 
0095 mq=~any(m=='z');
0096 nr=min(length(r), length(s));
0097 kf=round(tf*fs); % length of frame in samples
0098 ifr=kf+mq:kf:nr-mq; % ending sample of each frame
0099 ifl=ifr(end);
0100 nf=numel(ifr); % number of frames
0101 if mq % perform interpolation
0102     ssm=reshape(s(2:ifl)-s(3:ifl+1),kf,nf);
0103     ssp=reshape(s(2:ifl)-s(1:ifl-1),kf,nf);
0104     sr=reshape(s(2:ifl)-r(2:ifl),kf,nf);
0105     am=min(max(sum(sr.*ssm,1)./sum(ssm.^2,1),0),1); %optimum dist between s(2:ifl) and s(3:ifl+1)
0106     ap=min(max(sum(sr.*ssp,1)./sum(ssp.^2,1),0),1); %optimum dist between s(2:ifl) and s(1:ifl-1)
0107     ef=min(sum((sr-repmat(am,kf,1).*ssm).^2,1),sum((sr-repmat(ap,kf,1).*ssp).^2,1)); % select the best for each frame
0108 else % no interpolation
0109     ef=sum(reshape((s(1:ifl)-r(1:ifl)).^2,kf,nf),1);
0110 end
0111 rf=sum(reshape(r(mq+1:ifl).^2,kf,nf),1);
0112 em=ef==0; % mask for zero noise frames
0113 rm=rf==0; % mask for zero reference frames
0114 snf=10*log10((rf+rm)./(ef+em));
0115 snf(rm)=-snmax;
0116 snf(em)=snmax;
0117 
0118 % select the frames to include
0119 
0120 if any(m=='w')
0121     vf=true(1,nf); % include all frames
0122 elseif any(m=='v')
0123     vs=v_vadsohn(r,fs,'na');
0124     nvs=length(vs);
0125     [vss,vix]=sort([ifr'; vs(:,2)]);
0126     vjx=zeros(nvs+nf,5);
0127     vjx(vix,1)=(1:nvs+nf)'; % sorted position
0128     vjx(1:nf,2)=vjx(1:nf,1)-(1:nf)'; % prev VAD frame end (or 0 or nvs+1 if none)
0129     vjx(nf+1:end,2)=vjx(nf+1:end,1)-(1:nvs)'; % prev snr frame end (or 0 or nvs+1 if none)
0130     dvs=[vss(1)-mq; vss(2:end)-vss(1:end-1)];  % number of samples from previous frame boundary
0131     vjx(:,3)=dvs(vjx(:,1)); % number of samples from previous frame boundary
0132     vjx(1:nf,4)=vs(min(1+vjx(1:nf,2),nvs),3); % VAD result for samples between prev frame boundary and this one
0133     vjx(nf+1:end,4)=vs(:,3); % VAD result for samples between prev frame boundary and this one
0134     vjx(1:nf,5)=1:nf; % SNR frame to accumulate into
0135     vjx(vjx(nf+1:end,2)>=nf,3)=0;  % zap any VAD frame beyond the last snr frame
0136     vjx(nf+1:end,5)=min(vjx(nf+1:end,2)+1,nf); % SNR frame to accumulate into
0137     vf=full(sparse(1,vjx(:,5),vjx(:,3).*vjx(:,4),1,nf))>kf/2; % accumulate into SNR frames and compare with threshold
0138 else  % default is 'V'
0139     [lev,af,fso,vad]=v_activlev(r,fs);    % do VAD on reference signal
0140     vf=sum(reshape(vad(mq+1:ifl),kf,nf),1)>kf/2; % find frames that are mostly active
0141 end
0142 seg=mean(snf(vf));
0143 glo=10*log10(sum(rf(vf))/sum(ef(vf)));
0144 
0145 if ~nargout || any (m=='p')
0146     subplot(311);
0147     plot((1:length(s))/fs,s);
0148     ylabel('Signal');
0149     title(sprintf('SNR = %.1f dB, SNR_{seg} = %.1f dB',glo,seg));
0150     axh(1)=gca;
0151     subplot(312);
0152     plot((1:length(r))/fs,r);
0153     ylabel('Reference');
0154     axh(2)=gca;
0155     subplot(313);
0156     snv=snf;
0157     snv(~vf)=NaN;
0158     snu=snf;
0159     snu(vf>0)=NaN;
0160     plot([1 nr]/fs,[glo seg; glo seg],':k',((1:nf)*kf+(1-kf)/2)/fs,snv,'-b',((1:nf)*kf+(1-kf)/2)/fs,snu,'-r');
0161     ylabel('Frame SNR');
0162     xlabel('Time (s)');
0163     axh(3)=gca;
0164     linkaxes(axh,'x');
0165 end

Generated by m2html © 2003