


WRITEHTK write data in HTK format []=(FILE,D,FP,TC)
Inputs:
FILE = name of file to write (no default extension)
D = data to write: one row per frame
FP = frame period in seconds
TC = type code = the sum of a data type and (optionally) one or more of the listed modifiers
0 WAVEFORM Acoustic waveform
1 LPC Linear prediction coefficients
2 LPREFC LPC Reflection coefficients: -lpcar2rf([1 LPC]);LPREFC(1)=[];
3 LPCEPSTRA LPC Cepstral coefficients
4 LPDELCEP LPC cepstral+delta coefficients (obsolete)
5 IREFC LPC Reflection coefficients (16 bit fixed point)
6 MFCC Mel frequency cepstral coefficients
7 FBANK Log Filter bank energies
8 MELSPEC linear Mel-scaled spectrum
9 USER User defined features
10 DISCRETE Vector quantised codebook
11 PLP Perceptual Linear prediction
12 ANON
64 _E Includes energy terms hd(1)
128 _N Suppress absolute energy hd(2)
256 _D Include delta coefs hd(3)
512 _A Include acceleration coefs hd(4)
1024 _C Compressed hd(5)
2048 _Z Zero mean static coefs hd(6)
4096 _K CRC checksum (not implemented yet) hd(7) (ignored)
8192 _0 Include 0'th cepstral coef hd(8)
16384 _V Attach VQ index hd(9)
32768 _T Attach delta-delta-delta index hd(10)
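
As a worked example of the type code arithmetic listed above, the sketch below builds the code for MFCC data with energy, delta and acceleration terms (MFCC_E_D_A in HTK naming) and then splits it back into a base type and modifier flags. The names `base` and `flags`, and the use of `bitand`/`bitget`, are illustrative choices only; they are not how writehtk itself extracts the bits.

```matlab
% Build a type code: base type 6 (MFCC) plus modifiers _E, _D and _A
tc = 6 + 64 + 256 + 512;          % = 838

% Split it again (illustrative names, not part of writehtk)
base  = bitand(tc, 63);           % low six bits hold the data type
flags = bitget(tc, 7:16);         % flags(k) corresponds to hd(k) in the list above
```

With these values `base` is 6 and `flags` is [1 0 1 1 0 0 0 0 0 0], i.e. hd(1), hd(3) and hd(4) are set. A call such as writehtk('feats.mfc',d,0.01,tc) would then write d with a 10 ms frame period; the file name here is hypothetical.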

function writehtk(file,d,fp,tc)
%WRITEHTK write data in HTK format []=(FILE,D,FP,TC)
%
% Inputs:
%    FILE = name of file to write (no default extension)
%       D = data to write: one row per frame
%      FP = frame period in seconds
%      TC = type code = the sum of a data type and (optionally) one or more of the listed modifiers
%             0  WAVEFORM     Acoustic waveform
%             1  LPC          Linear prediction coefficients
%             2  LPREFC       LPC Reflection coefficients:  -lpcar2rf([1 LPC]);LPREFC(1)=[];
%             3  LPCEPSTRA    LPC Cepstral coefficients
%             4  LPDELCEP     LPC cepstral+delta coefficients (obsolete)
%             5  IREFC        LPC Reflection coefficients (16 bit fixed point)
%             6  MFCC         Mel frequency cepstral coefficients
%             7  FBANK        Log Filter bank energies
%             8  MELSPEC      linear Mel-scaled spectrum
%             9  USER         User defined features
%            10  DISCRETE     Vector quantised codebook
%            11  PLP          Perceptual Linear prediction
%            12  ANON
%            64  _E  Includes energy terms                  hd(1)
%           128  _N  Suppress absolute energy               hd(2)
%           256  _D  Include delta coefs                    hd(3)
%           512  _A  Include acceleration coefs             hd(4)
%          1024  _C  Compressed                             hd(5)
%          2048  _Z  Zero mean static coefs                 hd(6)
%          4096  _K  CRC checksum (not implemented yet)     hd(7) (ignored)
%          8192  _0  Include 0'th cepstral coef             hd(8)
%         16384  _V  Attach VQ index                        hd(9)
%         32768  _T  Attach delta-delta-delta index         hd(10)

%   Thanks to Scott Otterson for fixing a bug in writing ultra-long frames.

%      Copyright (C) Mike Brookes 2005
%      Version: $Id: writehtk.m 713 2011-10-16 14:45:43Z dmb $
%
%   VOICEBOX is a MATLAB toolbox for speech processing.
%   Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%   This program is free software; you can redistribute it and/or modify
%   it under the terms of the GNU General Public License as published by
%   the Free Software Foundation; either version 2 of the License, or
%   (at your option) any later version.
%
%   This program is distributed in the hope that it will be useful,
%   but WITHOUT ANY WARRANTY; without even the implied warranty of
%   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
%   GNU General Public License for more details.
%
%   You can obtain a copy of the GNU General Public License from
%   http://www.gnu.org/copyleft/gpl.html or by writing to
%   Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

fid=fopen(file,'w','b');
if fid < 0
    error(sprintf('Cannot write to file %s',file));
end
tc=bitset(tc,13,0);                 % silently ignore a checksum request

[nf,nv]=size(d);
nhb=10;                             % number of suffix codes
ndt=6;                              % number of bits for base type
hb=floor(tc*pow2(-(ndt+nhb):-ndt));
hd=hb(nhb+1:-1:2)-2*hb(nhb:-1:1);   % extract bits from type code
dt=tc-pow2(hb(end),ndt);            % low six bits of tc represent data type
tc=tc-65536*(tc>32767);

if ~dt & (size(d,1)==1)             % if waveform is a row vector
    d=d(:);                         % ... convert it to a column vector
    [nf,nv]=size(d);
end

if hd(5)                            % if compressed
    dx=max(d,[],1);
    dn=min(d,[],1);
    a=ones(1,nv);                   % default compression factors for cols with max=min
    b=dx;
    mk=dx>dn;
    a(mk)=65534./(dx(mk)-dn(mk));   % calculate compression factors for each column
    b(mk)=0.5*(dx(mk)+dn(mk)).*a(mk);
    d=d.*repmat(a,nf,1)-repmat(b,nf,1);  % compress the data
    nf=nf+4;                        % adjust frame count to include compression factors
end
fwrite(fid,nf,'long');              % write frame count
fwrite(fid,round(fp*1.E7),'long');  % write frame period (in 100 ns units)
if any(dt==[0,5,10]) | hd(5)        % write data as shorts
    if dt==5                        % IREFC has fixed scale factor
        d=d*32767;
        if hd(5)
            error('Cannot use compression with IREFC format');
        end
    end
    nby=nv*2;
    if nby<=32767
        fwrite(fid,nby,'short');    % write byte count
        fwrite(fid,tc,'short');     % write type code
        if hd(5)
            fwrite(fid,a,'float');  % write compression factors
            fwrite(fid,b,'float');
        end
        fwrite(fid,d.','short');    % write data array
    end
else
    nby=nv*4;
    if nby<=32767
        fwrite(fid,nby,'short');    % write byte count
        fwrite(fid,tc,'short');     % write type code
        fwrite(fid,d.','float');    % write data array
    end
end
fclose(fid);
if nby>32767
    delete(file);                   % remove file if byte count is rubbish
    error(sprintf('byte count of frame is %d which exceeds 32767 (is data transposed?)',nby));
end
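
The compression branch of the listing (the hd(5) case) scales each column so that its minimum and maximum map to -32767 and +32767 before the data is written as 16-bit integers. The sketch below repeats that arithmetic on a small made-up matrix and then inverts it. The explicit `round` stands in for the conversion to 16-bit integers, and the inverse (x+b)/a is the assumed decoding step on the reading side; neither line appears in writehtk itself.

```matlab
% Small made-up data set: three frames, two coefficients per frame
d  = [0 10; 1 20; 4 40];
nf = size(d,1);

% Per-column scale and offset, as in the compressed branch of writehtk
dx = max(d,[],1);
dn = min(d,[],1);
a  = 65534 ./ (dx - dn);            % assumes max > min in every column
b  = 0.5*(dx + dn) .* a;

% Quantise (round() is an assumption standing in for the 16-bit conversion)
q = round(d .* repmat(a,nf,1) - repmat(b,nf,1));

% Assumed inverse used when reading the file back
r = (q + repmat(b,nf,1)) ./ repmat(a,nf,1);
```

Every entry of `q` lies in the range -32767 to 32767, and `r` differs from `d` by at most half a quantisation step, 0.5./a, in each column.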