V_WRITEHTK write data in HTK format []=(FILE,D,FP,TC)

Inputs:
   FILE = name of file to write (no default extension)
   D    = data to write: one row per frame
   FP   = frame period in seconds
   TC   = type code = the sum of a data type and (optionally) one or more of the listed modifiers
              0  WAVEFORM     Acoustic waveform
              1  LPC          Linear prediction coefficients
              2  LPREFC       LPC Reflection coefficients: -v_lpcar2rf([1 LPC]);LPREFC(1)=[];
              3  LPCEPSTRA    LPC Cepstral coefficients
              4  LPDELCEP     LPC cepstral+delta coefficients (obsolete)
              5  IREFC        LPC Reflection coefficients (16 bit fixed point)
              6  MFCC         Mel frequency cepstral coefficients
              7  FBANK        Log Filter bank energies
              8  MELSPEC      Linear Mel-scaled spectrum
              9  USER         User defined features
             10  DISCRETE     Vector quantised codebook
             11  PLP          Perceptual Linear prediction
             12  ANON
             64  _E  Includes energy terms                  hd(1)
            128  _N  Suppress absolute energy               hd(2)
            256  _D  Include delta coefs                    hd(3)
            512  _A  Include acceleration coefs             hd(4)
           1024  _C  Compressed                             hd(5)
           2048  _Z  Zero mean static coefs                 hd(6)
           4096  _K  CRC checksum (not implemented yet)     hd(7) (ignored)
           8192  _0  Include 0'th cepstral coef             hd(8)
          16384  _V  Attach VQ index                        hd(9)
          32768  _T  Attach delta-delta-delta index         hd(10)
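Because TC is a sum of one base type and any number of modifiers, it can help to see one code built and taken apart again. The following is a minimal sketch, not part of VOICEBOX: the variable names, the 100-by-39 random data and the output file name are illustrative assumptions, and the final call only runs if v_writehtk is on the path.

```matlab
% Build a type code for MFCC with energy, delta and acceleration terms.
% The numeric values are taken from the table above.
tMFCC = 6;  qE = 64;  qD = 256;  qA = 512;
tc = tMFCC + qE + qD + qA;

% Decode it again with standard bit operations
dt = bitand(tc, 63);          % low six bits hold the base data type
hd = bitget(tc, 7:16);        % modifier flags hd(1)..hd(10) in table order

% Hypothetical data: 100 frames of 39 values at a 10 ms frame period
d  = randn(100, 39);
fp = 0.01;
if exist('v_writehtk', 'file')    % skip silently if VOICEBOX is not installed
    v_writehtk(fullfile(tempdir, 'example.mfc'), d, fp, tc);
end
```

Here hd(1), hd(3) and hd(4) are set, which corresponds to the _E, _D and _A rows of the table.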
function v_writehtk(file,d,fp,tc)
%V_WRITEHTK write data in HTK format []=(FILE,D,FP,TC)
%
% Inputs:
%    FILE = name of file to write (no default extension)
%       D = data to write: one row per frame
%      FP = frame period in seconds
%      TC = type code = the sum of a data type and (optionally) one or more of the listed modifiers
%             0  WAVEFORM     Acoustic waveform
%             1  LPC          Linear prediction coefficients
%             2  LPREFC       LPC Reflection coefficients: -v_lpcar2rf([1 LPC]);LPREFC(1)=[];
%             3  LPCEPSTRA    LPC Cepstral coefficients
%             4  LPDELCEP     LPC cepstral+delta coefficients (obsolete)
%             5  IREFC        LPC Reflection coefficients (16 bit fixed point)
%             6  MFCC         Mel frequency cepstral coefficients
%             7  FBANK        Log Filter bank energies
%             8  MELSPEC      Linear Mel-scaled spectrum
%             9  USER         User defined features
%            10  DISCRETE     Vector quantised codebook
%            11  PLP          Perceptual Linear prediction
%            12  ANON
%            64  _E  Includes energy terms                  hd(1)
%           128  _N  Suppress absolute energy               hd(2)
%           256  _D  Include delta coefs                    hd(3)
%           512  _A  Include acceleration coefs             hd(4)
%          1024  _C  Compressed                             hd(5)
%          2048  _Z  Zero mean static coefs                 hd(6)
%          4096  _K  CRC checksum (not implemented yet)     hd(7) (ignored)
%          8192  _0  Include 0'th cepstral coef             hd(8)
%         16384  _V  Attach VQ index                        hd(9)
%         32768  _T  Attach delta-delta-delta index         hd(10)

%   Thanks to Scott Otterson for fixing a bug in writing ultra-long frames
%   and to Mauricio Villegas for fixing the type specifications.

%      Copyright (C) Mike Brookes 2005-2015
%      Version: $Id: v_writehtk.m 10865 2018-09-21 17:22:45Z dmb $
%
%   VOICEBOX is a MATLAB toolbox for speech processing.
%   Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%   This program is free software; you can redistribute it and/or modify
%   it under the terms of the GNU General Public License as published by
%   the Free Software Foundation; either version 2 of the License, or
%   (at your option) any later version.
%
%   This program is distributed in the hope that it will be useful,
%   but WITHOUT ANY WARRANTY; without even the implied warranty of
%   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
%   GNU General Public License for more details.
%
%   You can obtain a copy of the GNU General Public License from
%   http://www.gnu.org/copyleft/gpl.html or by writing to
%   Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

fid=fopen(file,'w','b');
if fid < 0
    error('Cannot write to file %s',file);
end
tc=bitset(tc,13,0);                 % silently ignore a checksum request

[nf,nv]=size(d);
nhb=10;                             % number of suffix codes
ndt=6;                              % number of bits for base type
hb=floor(tc*pow2(-(ndt+nhb):-ndt));
hd=hb(nhb+1:-1:2)-2*hb(nhb:-1:1);   % extract bits from type code
dt=tc-pow2(hb(end),ndt);            % low six bits of tc represent data type
tc=tc-65536*(tc>32767);

if ~dt && (size(d,1)==1)            % if waveform is a row vector
    d=d(:);                         % ... convert it to a column vector
    [nf,nv]=size(d);
end

if hd(5)                            % if compressed
    dx=max(d,[],1);
    dn=min(d,[],1);
    a=ones(1,nv);                   % default compression factors for cols with max=min
    b=dx;
    mk=dx>dn;
    a(mk)=65534./(dx(mk)-dn(mk));   % calculate compression factors for each column
    b(mk)=0.5*(dx(mk)+dn(mk)).*a(mk);
    d=d.*repmat(a,nf,1)-repmat(b,nf,1); % compress the data
    nf=nf+4;                        % adjust frame count to include compression factors
end
fwrite(fid,nf,'int32');             % write frame count
fwrite(fid,round(fp*1.E7),'int32'); % write frame period (in 100 ns units)
if any(dt==[0,5,10]) || hd(5)       % write data as shorts
    if dt==5                        % IREFC has fixed scale factor
        d=d*32767;
        if hd(5)
            error('Cannot use compression with IREFC format');
        end
    end
    nby=nv*2;
    if nby<=32767
        fwrite(fid,nby,'int16');    % write byte count
        fwrite(fid,tc,'int16');     % write type code
        if hd(5)
            fwrite(fid,a,'float32'); % write compression factors
            fwrite(fid,b,'float32');
        end
        fwrite(fid,d.','int16');    % write data array
    end
else
    nby=nv*4;
    if nby<=32767
        fwrite(fid,nby,'int16');    % write byte count
        fwrite(fid,tc,'int16');     % write type code
        fwrite(fid,d.','float32');  % write data array
    end
end
fclose(fid);
if nby>32767
    delete(file);                   % remove file if byte count is rubbish
    error('byte count of frame is %d which exceeds 32767 (is data transposed?)',nby);
end
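
The compression branch (hd(5), modifier _C) scales each column so that its minimum and maximum map to -32767 and +32767 before the values are written as 16-bit integers. The sketch below repeats that arithmetic on a small made-up matrix and then inverts it. It is an illustration under stated assumptions rather than part of the function: the test data are invented, round() stands in for the integer conversion that happens on write, and the reconstruction (c+b)./a is simply the algebraic inverse of the scaling used above.

```matlab
% Made-up data: 3 frames, 2 columns; the second column is constant
d  = [0.0 10; 0.5 10; 2.0 10];
[nf,nv] = size(d);

% Same per-column factors as in the listing above
dx = max(d,[],1);
dn = min(d,[],1);
a  = ones(1,nv);                    % defaults for columns with max == min
b  = dx;
mk = dx>dn;
a(mk) = 65534./(dx(mk)-dn(mk));
b(mk) = 0.5*(dx(mk)+dn(mk)).*a(mk);

% Values as they would be stored (rounding assumed on conversion to int16)
c = round(d.*repmat(a,nf,1)-repmat(b,nf,1));

% Reader-side reconstruction: algebraic inverse of the scaling
r = (c+repmat(b,nf,1))./repmat(a,nf,1);

% Worst-case reconstruction error; bounded by half a step, 0.5/a, per column
err = max(abs(r(:)-d(:)));
```

For a non-constant column the extremes land exactly on -32767 and +32767, so the only loss is the rounding step. A constant column gets a = 1 and b equal to its value, which stores zeros and reconstructs the constant exactly.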