Home > voicebox > melcepst.m

# melcepst

## PURPOSE

MELCEPST Calculate the mel cepstrum of a signal C=(S,FS,W,NC,P,N,INC,FL,FH)

## SYNOPSIS

function [c,tc]=melcepst(s,fs,w,nc,p,n,inc,fl,fh)

## DESCRIPTION

```MELCEPST Calculate the mel cepstrum of a signal C=(S,FS,W,NC,P,N,INC,FL,FH)

Simple use: (1) c=melcepst(s,fs)          % calculate mel cepstrum with 12 coefs, 256 sample frames
(2) c=melcepst(s,fs,'E0dD')   % include log energy, 0th cepstral coef, delta and delta-delta coefs

Inputs:
s      speech signal
fs  sample rate in Hz (default 11025)
w   mode string (see below)
nc  number of cepstral coefficients excluding 0'th coefficient [default 12]
p   number of filters in filterbank [default: floor(3*log(fs)) =  approx 2.1 per ocatave]
n   length of frame in samples [default power of 2 < (0.03*fs)]
inc frame increment [default n/2]
fl  low end of the lowest filter as a fraction of fs [default = 0]
fh  high end of highest filter as a fraction of fs [default = 0.5]

w   any sensible combination of the following:

'R'  rectangular window in time domain
'N'     Hanning window in time domain
'M'     Hamming window in time domain (default)

't'  triangular shaped filters in mel domain (default)
'n'  hanning shaped filters in mel domain
'm'  hamming shaped filters in mel domain

'p'     filters act in the power domain
'a'     filters act in the absolute magnitude domain (default)

'0'  include 0'th order cepstral coefficient
'E'  include log energy
'd'     include delta coefficients (dc/dt)
'D'     include delta-delta coefficients (d^2c/dt^2)

'z'  highest and lowest filters taper down to zero (default)
'y'  lowest filter remains at 1 down to 0 frequency and
highest filter remains at 1 up to nyquist freqency

If 'ty' or 'ny' is specified, the total power in the fft is preserved.

Outputs:    c     mel cepstrum output: one frame per row. Log energy, if requested, is the
first element of each row followed by the delta and then the delta-delta
coefficients.
tc    fractional time in samples at the centre of each frame
with the first sample being 1.```

## CROSS-REFERENCE INFORMATION

This function calls:
• enframe ENFRAME split signal up into (overlapping) frames: one per row. [F,T]=(X,WIN,HOP)
• melbankm MELBANKM determine matrix for a mel/erb/bark-spaced filterbank [X,MN,MX]=(P,N,FS,FL,FH,W)
• rdct RDCT Discrete cosine transform of real data Y=(X,N,A,B)
• rfft RFFT Calculate the DFT of real data Y=(X,N,D)
This function is called by:

## SOURCE CODE

```0001 function [c,tc]=melcepst(s,fs,w,nc,p,n,inc,fl,fh)
0002 %MELCEPST Calculate the mel cepstrum of a signal C=(S,FS,W,NC,P,N,INC,FL,FH)
0003 %
0004 %
0005 % Simple use: (1) c=melcepst(s,fs)          % calculate mel cepstrum with 12 coefs, 256 sample frames
0006 %              (2) c=melcepst(s,fs,'E0dD')   % include log energy, 0th cepstral coef, delta and delta-delta coefs
0007 %
0008 % Inputs:
0009 %     s      speech signal
0010 %     fs  sample rate in Hz (default 11025)
0011 %     w   mode string (see below)
0012 %     nc  number of cepstral coefficients excluding 0'th coefficient [default 12]
0013 %     p   number of filters in filterbank [default: floor(3*log(fs)) =  approx 2.1 per ocatave]
0014 %     n   length of frame in samples [default power of 2 < (0.03*fs)]
0015 %     inc frame increment [default n/2]
0016 %     fl  low end of the lowest filter as a fraction of fs [default = 0]
0017 %     fh  high end of highest filter as a fraction of fs [default = 0.5]
0018 %
0019 %        w   any sensible combination of the following:
0020 %
0021 %               'R'  rectangular window in time domain
0022 %                'N'     Hanning window in time domain
0023 %                'M'     Hamming window in time domain (default)
0024 %
0025 %               't'  triangular shaped filters in mel domain (default)
0026 %               'n'  hanning shaped filters in mel domain
0027 %               'm'  hamming shaped filters in mel domain
0028 %
0029 %                'p'     filters act in the power domain
0030 %                'a'     filters act in the absolute magnitude domain (default)
0031 %
0032 %               '0'  include 0'th order cepstral coefficient
0033 %                'E'  include log energy
0034 %                'd'     include delta coefficients (dc/dt)
0035 %                'D'     include delta-delta coefficients (d^2c/dt^2)
0036 %
0037 %               'z'  highest and lowest filters taper down to zero (default)
0038 %               'y'  lowest filter remains at 1 down to 0 frequency and
0039 %                        highest filter remains at 1 up to nyquist freqency
0040 %
0041 %               If 'ty' or 'ny' is specified, the total power in the fft is preserved.
0042 %
0043 % Outputs:    c     mel cepstrum output: one frame per row. Log energy, if requested, is the
0044 %                 first element of each row followed by the delta and then the delta-delta
0045 %                 coefficients.
0046 %           tc    fractional time in samples at the centre of each frame
0047 %                 with the first sample being 1.
0048 %
0049
0050 % BUGS: (1) should have power limit as 1e-16 rather than 1e-6 (or possibly a better way of choosing this)
0051 %           and put into VOICEBOX
0052 %       (2) get rdct to change the data length (properly) instead of doing it explicitly (wrongly)
0053
0054 %      Copyright (C) Mike Brookes 1997
0055 %      Version: \$Id: melcepst.m 10185 2017-10-04 08:20:32Z dmb \$
0056 %
0057 %   VOICEBOX is a MATLAB toolbox for speech processing.
0059 %
0060 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0061 %   This program is free software; you can redistribute it and/or modify
0063 %   the Free Software Foundation; either version 2 of the License, or
0064 %   (at your option) any later version.
0065 %
0066 %   This program is distributed in the hope that it will be useful,
0067 %   but WITHOUT ANY WARRANTY; without even the implied warranty of
0068 %   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
0069 %   GNU General Public License for more details.
0070 %
0071 %   You can obtain a copy of the GNU General Public License from
0072 %   http://www.gnu.org/copyleft/gpl.html or by writing to
0073 %   Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
0074 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0075
0076 if nargin<2 fs=11025; end
0077 if nargin<3 w='M'; end
0078 if nargin<4 nc=12; end
0079 if nargin<5 p=floor(3*log(fs)); end
0080 if nargin<6 n=pow2(floor(log2(0.03*fs))); end
0081 if nargin<9
0082    fh=0.5;
0083    if nargin<8
0084      fl=0;
0085      if nargin<7
0086         inc=floor(n/2);
0087      end
0088   end
0089 end
0090
0091 if isempty(w)
0092    w='M';
0093 end
0094 if any(w=='R')
0095    [z,tc]=enframe(s,n,inc);
0096 elseif any (w=='N')
0097 %    [z,tc]=enframe(s,hanning(n),inc);
0098    [z,tc]=enframe(s,0.5-0.5*cos(2*pi*(1:n)'/(n+1)),inc); % Hanning window
0099 else
0100 %    [z,tc]=enframe(s,hamming(n),inc);
0101    [z,tc]=enframe(s,0.54-0.46*cos(2*pi*(0:n-1)'/(n-1)),inc); % Hamming window
0102 end
0103 f=rfft(z.');
0104 [m,a,b]=melbankm(p,n,fs,fl,fh,w);
0105 pw=f(a:b,:).*conj(f(a:b,:));
0106 pth=max(pw(:))*1E-20;
0107 if any(w=='p')
0108    y=log(max(m*pw,pth));
0109 else
0110    ath=sqrt(pth);
0111    y=log(max(m*abs(f(a:b,:)),ath));
0112 end
0113 c=rdct(y).';
0114 nf=size(c,1);
0115 nc=nc+1;
0116 if p>nc
0117    c(:,nc+1:end)=[];
0118 elseif p<nc
0119    c=[c zeros(nf,nc-p)];
0120 end
0121 if ~any(w=='0')
0122    c(:,1)=[];
0123    nc=nc-1;
0124 end
0125 if any(w=='E')
0126    c=[log(max(sum(pw),pth)).' c];
0127    nc=nc+1;
0128 end
0129
0130 % calculate derivative
0131
0132 if any(w=='D')
0133   vf=(4:-1:-4)/60;
0134   af=(1:-1:-1)/2;
0135   ww=ones(5,1);
0136   cx=[c(ww,:); c; c(nf*ww,:)];
0137   vx=reshape(filter(vf,1,cx(:)),nf+10,nc);
0138   vx(1:8,:)=[];
0139   ax=reshape(filter(af,1,vx(:)),nf+2,nc);
0140   ax(1:2,:)=[];
0141   vx([1 nf+2],:)=[];
0142   if any(w=='d')
0143      c=[c vx ax];
0144   else
0145      c=[c ax];
0146   end
0147 elseif any(w=='d')
0148   vf=(4:-1:-4)/60;
0149   ww=ones(4,1);
0150   cx=[c(ww,:); c; c(nf*ww,:)];
0151   vx=reshape(filter(vf,1,cx(:)),nf+8,nc);
0152   vx(1:8,:)=[];
0153   c=[c vx];
0154 end
0155
0156 if nargout<1
0157    [nf,nc]=size(c);
0158 %    t=((0:nf-1)*inc+(n-1)/2)/fs;
0159    ci=(1:nc)-any(w=='0')-any(w=='E');
0160    imh = imagesc(tc/fs,ci,c.');
0161    axis('xy');
0162    xlabel('Time (s)');
0163    ylabel('Mel-cepstrum coefficient');
0164     map = (0:63)'/63;
0165     colormap([map map map]);
0166     colorbar;
0167 end
0168```

Generated on Mon 06-Aug-2018 14:48:32 by m2html © 2003