Recent versions of Ruby support braces in globbing if you pass the File::FNM_EXTGLOB option.
From the 2.2.0 documentation:
File.fnmatch('c{at,ub}s', 'cats', File::FNM_EXTGLOB) #=> true # { } is supported on FNM_EXTGLOB
However, the 1.9.3 documentation says it isn't supported in 1.9.3:
File.fnmatch('c{at,ub}s', 'cats') #=> false # { } isn't supported
(Also, trying to use File::FNM_EXTGLOB gave a NameError.)
Is there any way to glob using braces in Ruby 1.9.3, such as a third-party gem?
The strings I want to match against are from S3, not a local file system, so I can't just ask the operating system to do the globbing as far as I know.
I'm in the process of packaging up a Ruby Backport for braces globbing support. Here are the essential parts of that solution:
module File::Constants
FNM_EXTGLOB = 0x10
end
class << File
def fnmatch_with_braces_glob(pattern, path, flags = 0)
regex = glob_convert(pattern, flags)
return regex && path.match(regex).to_s == path
end
def fnmatch_with_braces_glob?(pattern, path, flags = 0)
return fnmatch_with_braces_glob(pattern, path, flags)
end
private
def glob_convert(pattern, flags)
brace_exp = (flags & File::FNM_EXTGLOB) != 0
pathnames = (flags & File::FNM_PATHNAME) != 0
dot_match = (flags & File::FNM_DOTMATCH) != 0
no_escape = (flags & File::FNM_NOESCAPE) != 0
casefold = (flags & File::FNM_CASEFOLD) != 0
syscase = (flags & File::FNM_SYSCASE) != 0
special_chars = ".*?\\[\\]{},.+()|$^\\\\" + (pathnames ? "/" : "")
special_chars_regex = Regexp.new("[#{special_chars}]")
if pattern.length == 0 || !pattern.index(special_chars_regex)
return Regexp.new(pattern, casefold || syscase ? Regexp::IGNORECASE : 0)
end
# Convert glob to regexp and escape regexp characters
length = pattern.length
start = 0
brace_depth = 0
new_pattern = ""
char = "/"
loop do
path_start = !dot_match && char[-1] == "/"
index = pattern.index(special_chars_regex, start)
if index
new_pattern += pattern[start...index] if index > start
char = pattern[index]
snippet = case char
when "?" then path_start ? (pathnames ? "[^./]" : "[^.]") : ( pathnames ? "[^/]" : ".")
when "." then "\\."
when "{" then (brace_exp && (brace_depth += 1) >= 1) ? "(?:" : "{"
when "}" then (brace_exp && (brace_depth -= 1) >= 0) ? ")" : "}"
when "," then (brace_exp && brace_depth > 0) ? "|" : "," # a comma is an alternation only inside braces
when "/" then "/"
when "\\"
if !no_escape && index + 1 < length # make sure a character follows the backslash
next_char = pattern[index += 1]
special_chars.include?(next_char) ? "\\#{next_char}" : next_char
else
"\\\\"
end
when "*"
if index+1 < length && pattern[index+1] == "*"
char += "*"
if pathnames && index+2 < length && pattern[index+2] == "/"
char += "/"
index += 2
"(?:(?:#{path_start ? '[^.]' : ''}[^\/]*?\\#{File::SEPARATOR})(?:#{!dot_match ? '[^.]' : ''}[^\/]*?\\#{File::SEPARATOR})*?)?"
else
index += 1
"(?:#{path_start ? '[^.]' : ''}(?:[^\\#{File::SEPARATOR}]*?\\#{File::SEPARATOR}?)*?)?"
end
else
path_start ? (pathnames ? "(?:[^./][^/]*?)?" : "(?:[^.].*?)?") : (pathnames ? "[^/]*?" : ".*?")
end
when "["
# Handle character set inclusion / exclusion
start_index = index
end_index = pattern.index(']', start_index+1)
while end_index && pattern[end_index-1] == "\\"
end_index = pattern.index(']', end_index+1)
end
if end_index
index = end_index
char_set = pattern[start_index..end_index]
char_set.delete!('/') if pathnames
char_set[1] = '^' if char_set[1] == '!'
(char_set == "[]" || char_set == "[^]") ? "" : char_set
else
"\\["
end
else
"\\#{char}"
end
new_pattern += snippet
else
if start < length
snippet = pattern[start..-1]
new_pattern += snippet
end
end
break if !index
start = index + 1
end
begin
return Regexp.new("\\A#{new_pattern}\\z", casefold || syscase ? Regexp::IGNORECASE : 0)
rescue
return nil
end
end
end
This solution takes into account the various flags accepted by File::fnmatch and converts the glob pattern into an equivalent Regexp. With this solution, these tests can be run successfully:
File.fnmatch('c{at,ub}s', 'cats', File::FNM_EXTGLOB)
#=> true
File.fnmatch('file{*.doc,*.pdf}', 'filename.doc')
#=> false
File.fnmatch('file{*.doc,*.pdf}', 'filename.doc', File::FNM_EXTGLOB)
#=> true
File.fnmatch('f*l?{[a-z].doc,[0-9].pdf}', 'filex.doc', File::FNM_EXTGLOB)
#=> true
File.fnmatch('**/.{pro,}f?l*', 'home/.profile', File::FNM_EXTGLOB | File::FNM_DOTMATCH)
#=> true
The fnmatch_with_braces_glob method (and its ? variant) will be patched in place of fnmatch, so that Ruby 2.0.0-compliant code will work with earlier Ruby versions as well. For clarity, the code shown above omits some performance improvements, argument checking, and the Backports feature detection and patch-in code; these will be included in the actual submission to the project.
I'm still testing some edge cases and heavily optimizing performance; it should be ready to submit very soon. Once it's available in an official Backports release, I'll update the status here.
Note that Dir::glob support will be coming at the same time, as well.
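To make the glob-to-Regexp idea in this answer easier to follow, here is a much-reduced sketch. It is not the backport code: simple_glob_to_regexp is a hypothetical helper that handles only *, ? and {a,b} groups, ignores all fnmatch flags, and assumes the braces in the pattern are balanced.

```ruby
# Hypothetical, simplified helper (not part of Backports).
# Supports only "*", "?" and "{a,b}" alternation; no flags, no "[...]" sets.
# Assumes balanced braces; an unclosed "{" makes Regexp.new raise RegexpError.
def simple_glob_to_regexp(pattern)
  depth = 0
  parts = []
  pattern.each_char do |ch|
    case ch
    when "*"
      parts << ".*"                       # any run of characters
    when "?"
      parts << "."                        # exactly one character
    when "{"
      depth += 1
      parts << "(?:"                      # open a non-capturing group
    when "}"
      if depth > 0
        depth -= 1
        parts << ")"                      # close the group
      else
        parts << Regexp.escape(ch)        # stray "}" stays literal
      end
    when ","
      parts << (depth > 0 ? "|" : ",")    # alternation only inside braces
    else
      parts << Regexp.escape(ch)          # everything else is literal
    end
  end
  Regexp.new("\\A#{parts.join}\\z")       # anchor so the whole string must match
end

regex = simple_glob_to_regexp('file{*.doc,*.pdf}')
puts !!(regex =~ 'filename.doc')
puts !!(regex =~ 'filename.txt')
```

Anchoring with \A and \z plays the same role as comparing the matched text against the whole path in the fuller implementation above.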
That was a fun Ruby exercise!
No idea if this solution is robust enough for you, but here goes:
class File
class << self
def fnmatch_extglob(pattern, path, flags=0)
explode_extglob(pattern).any?{|exploded_pattern|
fnmatch(exploded_pattern,path,flags)
}
end
def explode_extglob(pattern)
if match=pattern.match(/\{([^{}]+)}/) then
subpatterns = match[1].split(',',-1)
subpatterns.map{|subpattern| explode_extglob(match.pre_match+subpattern+match.post_match)}.flatten
else
[pattern]
end
end
end
end
Better testing is needed, but it seems to work fine for simple cases:
[2] pry(main)> File.explode_extglob('c{at,ub}s')
=> ["cats", "cubs"]
[3] pry(main)> File.explode_extglob('c{at,ub}{s,}')
=> ["cats", "cat", "cubs", "cub"]
[4] pry(main)> File.explode_extglob('{a,b,c}{d,e,f}{g,h,i}')
=> ["adg", "adh", "adi", "aeg", "aeh", "aei", "afg", "afh", "afi", "bdg", "bdh", "bdi", "beg", "beh", "bei", "bfg", "bfh", "bfi", "cdg", "cdh", "cdi", "ceg", "ceh", "cei", "cfg", "cfh", "cfi"]
[5] pry(main)> File.explode_extglob('{a,b}c*')
=> ["ac*", "bc*"]
[6] pry(main)> File.fnmatch('c{at,ub}s', 'cats')
=> false
[7] pry(main)> File.fnmatch_extglob('c{at,ub}s', 'cats')
=> true
[8] pry(main)> File.fnmatch_extglob('c{at,ub}s*', 'catsssss')
=> true
Tested with Ruby 1.9.3, 2.1.5, and 2.2.1.
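Since the question is about matching key strings rather than local files, the same expansion idea can be applied to a plain array of strings. The following is only a sketch: expand_braces and filter_keys are hypothetical standalone helpers (they do not reopen File), and the key names are made up for illustration.

```ruby
# Hypothetical standalone variant of the explode_extglob idea above.
# Expands the first brace group that contains no nested braces, then recurses.
def expand_braces(pattern)
  match = pattern.match(/\{([^{}]+)\}/)
  return [pattern] unless match
  match[1].split(",", -1).flat_map do |alternative|
    expand_braces(match.pre_match + alternative + match.post_match)
  end
end

# Keep the keys that match at least one of the expanded patterns.
# No filesystem access is involved: File.fnmatch only compares strings.
def filter_keys(keys, pattern, flags = 0)
  globs = expand_braces(pattern)
  keys.select { |key| globs.any? { |glob| File.fnmatch(glob, key, flags) } }
end

# Made-up key names, standing in for a listing fetched from remote storage.
keys = ["logs/2015/app.log", "logs/2015/app.txt", "images/cat.png"]
puts filter_keys(keys, "logs/*/app.{log,txt}").inspect
```

Expanding the pattern once and reusing the resulting list avoids re-parsing the braces for every key.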