
regexfiles

PURPOSE

REGEXFILES recursively searches for files matching a pattern: tok=regexfiles(regex,root)

SYNOPSIS

function tok=regexfiles(regex,root)

DESCRIPTION

REGEXFILES recursively searches for files matching a pattern: tok=regexfiles(regex,root)

 Usage:  (1) regexfiles('\.m$')      % find all files *.m in current folder tree

 Inputs:
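         regex  regular expression matched, ignoring case, against each file path
                relative to root [default: '.*', which matches every file]
         root   path to initial folder [default: current folder]
                (subfolders whose names start with '.' are not searched)

 Outputs:
          tok   column cell array listing the matching file paths relative to root,
                using '/' as the separator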
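
The following is a minimal usage sketch rather than part of the toolbox. It builds a small throwaway folder tree (the folder and file names are hypothetical), calls regexfiles on it if VOICEBOX happens to be on the MATLAB path, and prepends the root to the returned relative paths.

```matlab
% Build a small throwaway folder tree (all names here are hypothetical).
root = tempname;
mkdir(fullfile(root, 'sub'));
fclose(fopen(fullfile(root, 'a.m'), 'w'));
fclose(fopen(fullfile(root, 'sub', 'b.m'), 'w'));
fclose(fopen(fullfile(root, 'sub', 'c.txt'), 'w'));

if exist('regexfiles', 'file') == 2           % assumes VOICEBOX is on the path
    tok = regexfiles('\.m$', root);           % paths relative to root, '/' separated
    full_paths = strcat(root, '/', tok);      % prepend the root to get usable paths
    disp(full_paths);
end
rmdir(root, 's');                             % remove the throwaway tree
```

Because the returned paths exclude the root, the same pattern can be reused unchanged when the data is moved to a different root folder.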

 Regular expressions:
    Each character matches itself except for +?.*^$()[]{}|\
    Precede these by \ to avoid their special meanings
    Use '/' for the separator in file paths
         .       Any single character, including white space
         [xyz]   Any character contained within the brackets: x or y or z
         [^xyz]  Any character not contained within the brackets: anything but x or y or z
         [x-z]   Any character in the range of x through z
         \s      Any white-space character; equivalent to [ \f\n\r\t\v]
         \S      Any non-whitespace character; equivalent to [^ \f\n\r\t\v]
         \w      Any alphabetic, numeric, or underscore character; equivalent to [a-zA-Z_0-9]
         \W      Any character that is not alphabetic, numeric, or underscore; equivalent to [^a-zA-Z_0-9]
         \d      Any numeric digit; equivalent to [0-9]
         \D      Any nondigit character; equivalent to [^0-9]
         \oN or \o{N}  Character of octal value N
         \xN or \x{N}  Character of hexadecimal value N

         (...)   Group 
         cat|dog Alternatives 'cat' or 'dog'
         ^       match start of full file name (if first character)
         $       match end of full file name (if last character)
         \<      match start of word
         \>      match end of word
         ?       match preceding character or group 0 or 1 times
         *       match preceding character or group >=0 times
         +       match preceding character or group >=1 times
         {m}     match preceding character or group exactly m times
         {m,}    match preceding character or group >=m times
         {m,n}   match preceding character or group >=m and <=n times
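
As an illustration of the syntax above, the sketch below applies a few patterns to a hypothetical list of relative paths of the kind regexfiles tests (root removed, '/' separators). It uses the built-in regexp with the 'ignorecase' option to mimic the case-insensitive matching; the path list and the helper name hits are assumptions made for this example only.

```matlab
% Hypothetical relative paths (root removed, '/' as the separator).
paths = {'readme.txt'; 'src/main.m'; 'src/util/FD01.WAV'; ...
         'src/util/fd02.wav.bak'; 'data/x1.mat'};

% Helper: logical mask of the paths that match a pattern, ignoring case.
hits = @(pat) ~cellfun(@isempty, regexp(paths, pat, 'once', 'ignorecase'));

m_files   = paths(hits('\.m$'));             % '\.' is a literal dot, '$' anchors the end
wav_files = paths(hits('(^|/)FD.*\.wav$'));  % file name starting with FD in any folder
in_src    = paths(hits('^src/'));            % '^' anchors the start of the relative path
two_ext   = paths(hits('\.(m|mat)$'));       % alternatives inside a group
```

Note that an unanchored pattern such as 'wav' would also select 'src/util/fd02.wav.bak', which is why the examples anchor the extension with '$'.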

SOURCE CODE

function tok=regexfiles(regex,root)
%REGEXFILES recursively searches for files matching a pattern: tok=regexfiles(regex,root)
%
% Usage:  (1) regexfiles('\.m$')      % find all files *.m in current folder tree
%
% Inputs:
%         regex  regular expression matched, ignoring case, against each file path
%                relative to root [default: '.*', which matches every file]
%         root   path to initial folder [default: current folder]
%                (subfolders whose names start with '.' are not searched)
%
% Outputs:
%          tok   column cell array listing the matching file paths relative to root,
%                using '/' as the separator
%
% Regular expressions:
%    Each character matches itself except for +?.*^$()[]{}|\
%    Precede these by \ to avoid their special meanings
%    Use '/' for the separator in file paths
%         .       Any single character, including white space
%         [xyz]   Any character contained within the brackets: x or y or z
%         [^xyz]  Any character not contained within the brackets: anything but x or y or z
%         [x-z]   Any character in the range of x through z
%         \s      Any white-space character; equivalent to [ \f\n\r\t\v]
%         \S      Any non-whitespace character; equivalent to [^ \f\n\r\t\v]
%         \w      Any alphabetic, numeric, or underscore character; equivalent to [a-zA-Z_0-9]
%         \W      Any character that is not alphabetic, numeric, or underscore; equivalent to [^a-zA-Z_0-9]
%         \d      Any numeric digit; equivalent to [0-9]
%         \D      Any nondigit character; equivalent to [^0-9]
%         \oN or \o{N}  Character of octal value N
%         \xN or \x{N}  Character of hexadecimal value N
%
%         (...)   Group
%         cat|dog Alternatives 'cat' or 'dog'
%         ^       match start of full file name (if first character)
%         $       match end of full file name (if last character)
%         \<      match start of word
%         \>      match end of word
%         ?       match preceding character or group 0 or 1 times
%         *       match preceding character or group >=0 times
%         +       match preceding character or group >=1 times
%         {m}     match preceding character or group exactly m times
%         {m,}    match preceding character or group >=m times
%         {m,n}   match preceding character or group >=m and <=n times

%      Copyright (C) Mike Brookes 2010
%      Version: $Id: regexfiles.m 10141 2017-09-27 09:31:04Z dmb $
%
%   VOICEBOX is a MATLAB toolbox for speech processing.
%   Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%   This program is free software; you can redistribute it and/or modify
%   it under the terms of the GNU General Public License as published by
%   the Free Software Foundation; either version 2 of the License, or
%   (at your option) any later version.
%
%   This program is distributed in the hope that it will be useful,
%   but WITHOUT ANY WARRANTY; without even the implied warranty of
%   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
%   GNU General Public License for more details.
%
%   You can obtain a copy of the GNU General Public License from
%   http://www.gnu.org/copyleft/gpl.html or by writing to
%   Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% root='Z:/dmb/data\speech/timit/TRAIN/DR1';
% regex='^FD.*\.wav$';
if nargin<2 || isempty(root)
    root='./'; % default to the current folder
end
if isempty(regex)
    regex='.*'; % an empty pattern matches every file
end
root(root=='\')='/'; % use forward slash everywhere
if ~isempty(root) && root(end)=='/'
    root(end)=[]; % remove a final '/'
end
dirlist{1}=''; % list of sub directories to process (e.g. '/xx/yy')
ntok=0; % number of matching files found so far
tok=cell(0);
while ~isempty(dirlist)
    dd=dir([root dirlist{1}]); % list the first directory in the queue
    for i=1:length(dd)
        name=dd(i).name;
        full=[dirlist{1} '/' name]; % path relative to root, with a leading '/'
        if dd(i).isdir
            if name(1)~='.'   % ignore directories starting with '.'
                dirlist{end+1}=full; % queue the subdirectory for later processing
            end
        else
            full(1)=[]; % remove leading '/'
            if ~isempty(regexpi(full,regex)) % case-insensitive match on the relative path
                ntok=ntok+1;
                tok{ntok,1}=full;
            end
        end
    end
    dirlist(1)=[];  % remove this directory from the list
end

Generated on Tue 10-Oct-2017 08:30:10 by m2html © 2003