V_REGEXFILES searches for files matching a pattern, optionally recursively  tok=v_regexfiles(regex,root,m)

 Usage: (1) v_regexfiles('\.m$',[],'r')    % find all files *.m in current folder tree

 Inputs:
     regex   regular expression giving the pattern to match (not including root path);
             matching is case-insensitive
     root    path to initial folder [default: current folder]
     m       'n' non-recursive search [default]
             'r' recursive search through tree starting at root

 Outputs:
     tok     cell array listing the file paths sorted alphabetically (not including the root path)

 Regular expressions:
     Each character matches itself except for +?.*^$()[]{}|\
     Precede these by \ to avoid their special meanings
     Use '/' for the separator in file paths

     .             Any single character, including white space
     [xyz]         Any character contained within the brackets: x or y or z
     [^xyz]        Any character not contained within the brackets: anything but x or y or z
     [x-z]         Any character in the range of x through z
     \s            Any white-space character; equivalent to [ \f\n\r\t\v]
     \S            Any non-whitespace character; equivalent to [^ \f\n\r\t\v]
     \w            Any alphabetic, numeric, or underscore character; equivalent to [a-zA-Z_0-9]
     \W            Any character that is not alphabetic, numeric, or underscore; equivalent to [^a-zA-Z_0-9]
     \d            Any numeric digit; equivalent to [0-9]
     \D            Any nondigit character; equivalent to [^0-9]
     \oN or \o{N}  Character of octal value N
     \xN or \x{N}  Character of hexadecimal value N

     (...)         Group
     cat|dog       Alternatives 'cat' or 'dog'
     ^             match start of full file name (if first character)
     $             match end of full file name (if last character)
     \<            match start of word
     \>            match end of word
     ?             match preceding character 0 or 1 times
     *             match preceding character or group >=0 times
     +             match preceding character or group >=1 times
     {m}           match preceding character or group exactly m times
     {m,}          match preceding character or group >=m times
     {m,n}         match preceding character or group >=m and <=n times
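The anchoring rules above are easiest to see on concrete strings. The sketch below does not call v_regexfiles itself; it applies the built-in regexpi, which is the matcher the source code uses, to a hypothetical list of relative paths of the kind the function tests (no root prefix, '/' as separator). The path list and the helper name isMatch are illustrative assumptions.

```matlab
% Hypothetical relative paths, written the way v_regexfiles sees them:
% no root prefix and '/' as the folder separator.
paths = {'readme.txt'; 'src/main.m'; 'src/util/helper.M'; 'data/FDAC1.wav'};

% regexpi returns the start indices of all matches, or [] if there are none,
% so a non-empty result means "this path matches".
isMatch = @(p, re) ~isempty(regexpi(p, re));

% '\.m$' selects paths ending in ".m"; regexpi ignores case, so ".M" also matches.
mfiles = paths(cellfun(@(p) isMatch(p, '\.m$'), paths));

% '^' anchors at the start of the whole relative path, not of the file name,
% so this pattern does not match a file that sits inside a subfolder.
anchored = cellfun(@(p) isMatch(p, '^FD.*\.wav$'), paths);

% To match the file name at any depth, allow either the start of the path
% or a '/' immediately before it.
anyDepth = paths(cellfun(@(p) isMatch(p, '(^|/)FD[^/]*\.wav$'), paths));

disp(mfiles);     % src/main.m and src/util/helper.M
disp(anchored');  % all zeros
disp(anyDepth);   % data/FDAC1.wav
```

The second pattern is worth noting when searching recursively: a pattern starting with '^' only finds files in the root folder itself unless it also accounts for the leading folder names.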
function tok=v_regexfiles(regex,root,m)
%V_REGEXFILES searches for files matching a pattern, optionally recursively  tok=v_regexfiles(regex,root,m)
%
% Usage: (1) v_regexfiles('\.m$',[],'r')    % find all files *.m in current folder tree
%
% Inputs:
%     regex   regular expression giving the pattern to match (not including root path);
%             matching is case-insensitive
%     root    path to initial folder [default: current folder]
%     m       'n' non-recursive search [default]
%             'r' recursive search through tree starting at root
%
% Outputs:
%     tok     cell array listing the file paths sorted alphabetically (not including the root path)
%
% Regular expressions:
%     Each character matches itself except for +?.*^$()[]{}|\
%     Precede these by \ to avoid their special meanings
%     Use '/' for the separator in file paths
%     .             Any single character, including white space
%     [xyz]         Any character contained within the brackets: x or y or z
%     [^xyz]        Any character not contained within the brackets: anything but x or y or z
%     [x-z]         Any character in the range of x through z
%     \s            Any white-space character; equivalent to [ \f\n\r\t\v]
%     \S            Any non-whitespace character; equivalent to [^ \f\n\r\t\v]
%     \w            Any alphabetic, numeric, or underscore character; equivalent to [a-zA-Z_0-9]
%     \W            Any character that is not alphabetic, numeric, or underscore; equivalent to [^a-zA-Z_0-9]
%     \d            Any numeric digit; equivalent to [0-9]
%     \D            Any nondigit character; equivalent to [^0-9]
%     \oN or \o{N}  Character of octal value N
%     \xN or \x{N}  Character of hexadecimal value N
%
%     (...)         Group
%     cat|dog       Alternatives 'cat' or 'dog'
%     ^             match start of full file name (if first character)
%     $             match end of full file name (if last character)
%     \<            match start of word
%     \>            match end of word
%     ?             match preceding character 0 or 1 times
%     *             match preceding character or group >=0 times
%     +             match preceding character or group >=1 times
%     {m}           match preceding character or group exactly m times
%     {m,}          match preceding character or group >=m times
%     {m,n}         match preceding character or group >=m and <=n times

%      Copyright (C) Mike Brookes 2010
%      Version: $Id: v_regexfiles.m 10865 2018-09-21 17:22:45Z dmb $
%
%   VOICEBOX is a MATLAB toolbox for speech processing.
%   Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%   This program is free software; you can redistribute it and/or modify
%   it under the terms of the GNU General Public License as published by
%   the Free Software Foundation; either version 2 of the License, or
%   (at your option) any later version.
%
%   This program is distributed in the hope that it will be useful,
%   but WITHOUT ANY WARRANTY; without even the implied warranty of
%   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
%   GNU General Public License for more details.
%
%   You can obtain a copy of the GNU General Public License from
%   http://www.gnu.org/copyleft/gpl.html or by writing to
%   Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% root='Z:/dmb/data\speech/timit/TRAIN/DR1';
% regex='^FD.*\.wav$';
if nargin<3 || isempty(m)
    m='n';                              % default: non-recursive search
end
if nargin<2 || isempty(root)
    root='./';                          % default: current folder
end
if nargin<1 || isempty(regex)
    regex='.*';                         % default: match every file
end
root(root=='\')='/';                    % use forward slash everywhere
if ~isempty(root) && root(end)=='/'
    root(end)=[];                       % remove a final '/'
end
dirlist{1}='';                          % list of sub directories to process (e.g. '/xx/yy')
ntok=0;
tok=cell(0);
rec=any(m=='r');                        % recursive search
while ~isempty(dirlist)
    dd=dir([root dirlist{1}]);
    for i=1:length(dd)
        name=dd(i).name;
        full=[dirlist{1} '/' name];
        if dd(i).isdir
            if rec && name(1)~='.'      % ignore directories starting with '.'
                dirlist{end+1}=full;    % queue this sub directory for later processing
            end
        else
            full(1)=[];                 % remove leading '/'
            if ~isempty(regexpi(full,regex))  % case-insensitive match against the relative path
                ntok=ntok+1;
                tok{ntok,1}=full;
            end
        end
    end
    dirlist(1)=[];                      % remove this directory from the list
end
tok=sort(tok);
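A small end-to-end check can make the difference between the 'n' and 'r' modes concrete. The sketch below assumes that v_regexfiles is on the MATLAB path (for example, with VOICEBOX installed); the folder layout and file names are hypothetical and are created in a temporary folder that is removed at the end.

```matlab
% Build a throwaway folder tree:  <root>/a.m, <root>/b.txt, <root>/sub/c.m
root = tempname;                          % unique temporary folder name (not yet created)
mkdir(fullfile(root, 'sub'));             % creates both <root> and <root>/sub
names = {'a.m', 'b.txt', fullfile('sub', 'c.m')};
for k = 1:numel(names)
    fid = fopen(fullfile(root, names{k}), 'w');   % create an empty file
    fclose(fid);
end

% Non-recursive search (the default): only files directly inside root are tested.
flat = v_regexfiles('\.m$', root);

% Recursive search: files in sub folders are returned with '/'-separated
% paths relative to root.
deep = v_regexfiles('\.m$', root, 'r');

disp(flat);   % expected: a.m
disp(deep);   % expected: a.m and sub/c.m

rmdir(root, 's');                         % remove the temporary tree
```

Because the returned paths omit the root, a caller that needs to open the files would join them back on, for example with fullfile(root, deep{k}).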