TL;DR
In a .gitignore
, patterns with leading or embedded slashes are treated specially. They are different from patterns that have no leading or embedded slash. So you may need **/config/development/*
here because of the two embedded slashes.
Summary
To answer your questions in order:
Yes.
You'll have to ask whoever wrote those ignore files.
As I noted in a comment, the suppositions in Are leading asterisks "**/" redundant in .gitignore path matching syntax? are wrong; the accepted answer there is not applicable to this case.
An explanation of this last item seems appropriate here.
Long
For no obvious reason, Git's rule about whether a .gitignore
pattern matches some file name found during some directory-tree-walk has a peculiar wrinkle. If a pattern does not have an embedded slash, it's treated one way. If it does have an embedded slash, it's treated a different, second way. To really understand this, we need to define a few terms.
As a side note, I tend to like the term directory when talking about the OS-provided entities. If you prefer the term folder, you can substitute that in mentally—they're essentially the same thing. If you're familiar with C or Python or related languages, though, you'll know about opendir
and/or readdir
and/or os.listdir
, and functions like os.walk
, most of which also use the word "directory" to describe these things.
Defining glob patterns
Let's start with the .gitignore
entries, which are made up of extended glob patterns. The term glob pattern is pretty well defined at this Wikipedia page, but we could use a bit more.
The most basic form of glob just has *
, ?
, and [...]
meta-characters. A single question mark matches one character in a file name. An asterisk matches any number of characters (including zero characters), and a square-bracketed string matches any of the characters inside the brackets.1 Note that this kind of simple, basic glob is applied only to files within a single directory. Whatever entity is working with this kind of glob, that entity reads a list of file names—probably from an actual directory—and then selects those names within that directory that match that glob expression.
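As a rough illustration of this basic, single-directory kind of glob, here is a small sketch using Python's standard fnmatch module. The list of names is made up for the example, and fnmatch is not the matcher Git itself uses; it just happens to implement the same three meta-characters.

```python
from fnmatch import fnmatchcase

# Hypothetical listing of one directory: bare names only, no paths.
names = ["main.c", "main.o", "util.o", "a.py", "b.py", "c.py", "ab.py"]

def select(pattern, names):
    """Return the names from this one directory that match the glob."""
    return [n for n in names if fnmatchcase(n, pattern)]

print(select("*.o", names))      # asterisk: any run of characters, including none
print(select("?.py", names))     # question mark: exactly one character
print(select("[ab].py", names))  # brackets: one of the listed characters
```

Note that ab.py is not selected by ?.py, since the question mark stands for exactly one character.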
Obviously, the next level up is to add directories to this kind of simple glob. For instance, we might write dir/*
to mean all files within the directory named dir
. This is not very complicated, though it brings up a question we ignored with the simplest case: does a glob pattern match a directory name? That is, what if dir/sub
is itself a directory—does dir/*
match it? For that matter, does *
match dir
? The typical answer is that, yes, this does match, and as long as we're sticking with dir/*
that just means that dir/sub
gets matched (as a directory).
Extended globs vary a lot. Bash has its own special extended glob syntax, using globstar
to enable **
and extglob
to enable even more. What **
itself means varies: some implementations require it to match at least one directory, but allow any number of directory levels. Other implementations allow **
to match no directories, so that **/sub
matches dir/sub
but also just plain sub
. Git's **
largely behaves this last way: it matches zero or more directories, according to the gitignore documentation.
1Note that despite the resemblance, glob patterns are not at all the same as regular expressions, where typically .
means any single character—the equivalent of ?
in glob—and *
is a suffix operator, meaning zero-or-more of whatever came before. Hence in regular expressions, .*
means zero-or-more of any character. R.E. square brackets usually allow for both ranges and inversion, e.g., [^a-z]
means anything not in a
through z
, while shell glob patterns traditionally spell inversion with a leading ! instead, as in [!a-z].
Git stores files via Git's index
In an important way, Git doesn't care about directories. In particular, Git commits store files, rather than storing directories full of files. The files simply have path names that look like they occupy directories. The OS demands that the directory dir
exist, so that dir/sub
can exist; dir/sub
in turn must be a directory so that dir/sub/file
can exist. But as far as Git is concerned, Git just needs to store content to go into a file that will be named dir/sub/file
. When it comes time to write that content into that file, Git will simply create dir
and dir/sub
if needed, at that time. The presence or absence of the directories is irrelevant.
This is why you can't store an empty directory in a Git repository: Git stores the contents of files under file names in each commit. With no files, there's nothing to store, so empty directories just are not present in a commit.
Nonetheless, while Git stores only files, Git must use the OS-provided directory-reading services in order to find the files you have put into your work-tree. Git will then copy those files (or more precisely, their contents, associated with their full path names such as dir/sub/file
) into Git's index when you prepare a new commit. The index holds each file's name, mode (100644
or 100755
), and the hash ID of the Git-ified content. That's what will go into the next commit you make. (When you git checkout
some existing commit, Git fills the index from that commit, so that the index initially matches the commit.)
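To make the "name, mode, hash ID" triple a bit more concrete, here is a minimal sketch of a toy index as a Python dictionary. The blob_id helper and the dictionary layout are my own illustration, not Git's on-disk index format, and the sketch assumes a repository using the default SHA-1 object format, where a blob's ID is the SHA-1 of a short header followed by the file's content.

```python
import hashlib

def blob_id(content: bytes) -> str:
    """Hash ID Git assigns to file content (SHA-1 object format assumed)."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# Toy index: full path name -> (mode, hash ID). No directories appear here,
# only files whose names happen to contain slashes.
index = {
    "dir/sub/file": ("100644", blob_id(b"hello world\n")),
    "empty.txt": ("100644", blob_id(b"")),
}

for path, (mode, oid) in sorted(index.items()):
    print(mode, oid, path)
```

Nothing in this structure can represent dir or dir/sub on their own, which is the same reason an empty directory cannot be committed.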
Walking a directory tree
As we just saw, Git has to open and read each directory in your work-tree, starting with the top level of the work-tree itself. The result of calling os.listdir
(Python) or opendir
and readdir
(C) is a list of names: file and sub-directory names within the directory that Git just told the OS to enumerate. A bit more work (calling lstat
) gets the rest of the information required, and now Git knows whether the name dir
refers to an ordinary file, or to a directory.
Given the name of a directory, Git is generally going to have to open and read that directory as well. So Git will open and read dir
and find the name sub
, and discover that sub
is a directory. Git will then open and read dir/sub
and find the name file
, and that file
names a file. This process of opening and reading, recursively, each directory within a directory, is called walking the directory tree. That's what the Python os.walk
function does, for instance.
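Here is a minimal hand-written walk in Python, as a sketch of the idea rather than of Git's actual C code. The walk function and the temporary dir/sub/file tree are made up for the example. Note how each directory listing yields only a bare name, and the walker has to glue the path together itself.

```python
import os
import tempfile

def walk(top, prefix=""):
    """Yield (relative_path, is_dir) for everything found below top."""
    with os.scandir(top) as it:
        entries = sorted(it, key=lambda e: e.name)
    for entry in entries:
        rel = prefix + entry.name      # the listing gave us only the bare name
        if entry.is_dir(follow_symlinks=False):
            yield rel, True
            # Recurse, remembering the path we have assembled so far.
            yield from walk(entry.path, rel + "/")
        else:
            yield rel, False

with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "dir", "sub"))
    open(os.path.join(top, "dir", "sub", "file"), "w").close()
    found = list(walk(top))

print(found)
```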
Standard C does not have a function for walking a tree, so Git implements it all by hand, as it were. This starts to matter in a moment, but for now, think of it this way: by walking the tree, Git finds all the directories and all the files in the repository. Absent .gitignore
, Git throws away all the directory names, keeps all the file names—using their full paths from the top—and then, at least for an "all" add operation, puts all those names and updated contents into the index, ready for the next commit.
There are several things to know about this:
The walking process is inherently recursive. That is, upon finding a directory, we must open and read the directory, handling each entry. If the entry is itself a directory, we must open and read that directory, and so on.
Meanwhile, each entry in a directory is just a name: we—or Git—must assemble the path as we go. If we're working on dir
and come across sub
, the full name is now dir/sub
. If we're working on dir/sub
and come across file
, the full name is now dir/sub/file
. But dir
just lists sub
, and sub
itself just lists file
. It's up to us / Git to remember the path.
The walking process is slow, relatively speaking. Git wants to be fast!
All of these introduce some of the complexities in .gitignore
rules.
Gitignore files may exist at each level and list names and/or glob patterns
At the top level you can have a very simple .gitignore
file:
# ignore files named *.o and *.pyc
*.o
*.pyc
Now Git can walk through your work-tree, finding files in each level of directory. If the file's name—as expressed in that directory, at whatever level—matches any of these simple glob patterns, and the full path name of that file is not already in the index, Git will pretend that the file does not exist: it won't get automatically added, and git status
won't complain about it being untracked.
But what if we want to prevent the files dir/foo
and dir/sub/foo
from going into the index, while not protecting against foo
in the top level? Then we can tell Git: only ignore foo
when it's contained within dir
. There is an easy way to express this: create the file dir/.gitignore
. File names listed here are ignored when they're found by reading dir
or any of its sub-directories:
.gitignore:
*.o
*.pyc
dir/.gitignore:
foo
Now, during the walk, when Git opens and reads dir
, it notices that there is a dir/.gitignore
. It applies the rules there to all files found during this recursive traversal: they apply to files in dir
and files in dir/sub
, but not to files in the top level, nor (if there's a top level other/
directory) to files in there.
Leading and embedded slashes avoid recursive matches
But what if we want to ignore only dir/foo
, not dir/sub/foo
, and not other/foo
or /foo
? Now we have a different problem, and Git provides two solutions. One of them is to write /foo
as the entry in dir/.gitignore
:
.gitignore:
*.o
*.pyc
dir/.gitignore:
/foo
This ignores only dir/foo
, not dir/sub/foo
. It contains a leading slash, which tells Git: Don't apply this to sub-directories.
Another way to express this is to put this right into the top level .gitignore
, which removes the need to have a dir/.gitignore
at all:
*.o
*.pyc
dir/foo
This contains an embedded slash. When Git is doing a directory walk, it naturally finds file names stripped of their paths—it finds foo
, not dir/foo
, inside dir
when walking through dir
. So this kind of pattern is handled separately, after putting together the full path name.
So, this is the source of the first two special rules about slashes in names or patterns in .gitignore
files:
- A leading slash means match only this name or simple glob in this directory.
- An embedded slash means match only this full path name or (extended) glob relative to this directory.
Note that the second case covers the first one: both will work correctly, matching only paths within this directory, once the relative path names are put together (i.e., after sub
's foo
is turned into dir/sub/foo
). But we need the first case because a bare name or glob pattern, such as foo
or *.pyc
, would apply to this directory and all of its sub-directories. We could handle dir/foo
by moving up to the top level and ignoring dir/foo
directly, but if we want to ignore /bar
without ignoring dir/bar
and dir/sub/bar
, we have only the top level .gitignore
for this path.
This means you can invoke the full-path match—well, "full" relative to the directory in which the .gitignore
itself lives—using a leading slash, an embedded slash, or both. In general, if you create the .gitignore
file as close as possible to the file, you'll need the leading slash rule. If you use higher level .gitignore
files, the embedded-slash rule suffices.
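The two slash rules can be modeled with a few lines of Python. This is a toy sketch under stated assumptions, not Git's matcher: the ignored function and its base parameter are hypothetical names of mine, and the sketch leaves out trailing slashes, **, negation, and the fact that ignoring a directory also hides everything beneath it.

```python
from fnmatch import fnmatchcase

def ignored(pattern, path, base=""):
    """Toy model: does `pattern`, read from the ignore file in directory
    `base`, match `path`? Both are relative to the top of the work-tree;
    base is "" for the top-level .gitignore, "dir" for dir/.gitignore."""
    if base:
        if not path.startswith(base + "/"):
            return False                 # this ignore file does not cover the path
        path = path[len(base) + 1:]      # make the path relative to the ignore file
    if "/" in pattern:
        # Leading or embedded slash: anchored. Compare component by
        # component, so that * never crosses a directory boundary.
        p_parts = pattern.lstrip("/").split("/")
        f_parts = path.split("/")
        return len(p_parts) == len(f_parts) and all(
            fnmatchcase(f, p) for f, p in zip(f_parts, p_parts))
    # No slash at all: compare the bare name, at any depth.
    return fnmatchcase(path.rsplit("/", 1)[-1], pattern)

print(ignored("foo", "dir/sub/foo", base="dir"))   # True: bare name floats
print(ignored("/foo", "dir/sub/foo", base="dir"))  # False: leading slash anchors
print(ignored("dir/foo", "dir/foo"))               # True: embedded slash, top level
print(ignored("dir/foo", "a/dir/foo"))             # False: embedded slash anchors too
```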
(The embedded slash rule might actually be a bug. The wording in the gitignore documentation suggests that dir/sub
is meant to ignore a/dir/sub
as well, and that you would have to write /dir/sub
to not ignore a/dir/sub
. But testing shows that it behaves the way I describe here:
$ git status -s -uall
?? a/dir/sub/file2
?? dir/sub/file
$ echo dir/sub > .gitignore
$ git status -s -uall
?? .gitignore
?? a/dir/sub/file2
$ git --version
git version 2.20.1
Note that ignoring dir/sub
made file
disappear, but a/dir/sub/file2
is still reported as untracked.)
Trailing slashes are different
Remember that we said that the tree-walk is slow, relatively speaking. It's pretty common to find a Git repository where, in the work-tree, we deliberately add an entire vendor SDK or other packaged thing—maybe taken from the repository as a single tarball, or maybe extracted in some method outside Git entirely—and never want to commit any of the files from inside this packaged thing, whatever it is. Having Git walk through every level of that package, once it's unarchived, is just a waste of time.
To this end, if Git doesn't already have an index entry listing, say, dir/sub/vendor/file
, and—during one of its ambles through directory trees—comes across the directory named vendor
in dir/sub
, you can tell Git: Don't bother to look inside this vendor/
directory at all. One way to express this is to use what we already know:
.gitignore:
*.o
*.pyc
dir/sub/vendor
or:
.gitignore:
*.o
*.pyc
dir/sub/.gitignore:
/vendor
We already know what the leading slash is for here: it makes sure we only ignore vendor
in dir/sub
. That's also already the case for the top level .gitignore
.
However, what if we want to skip all directories named vendor
, without skipping any files named vendor
? Here, we can use the trailing slash syntax:
.gitignore:
*.o
*.pyc
vendor/
This vendor/
looks like dir/sub
in some ways. But the slash here is not embedded, it's trailing. So this slash does not turn on the full-path-only code. Instead, it tells Git: During your tree-walk, when you come across something named vendor
, and it's a directory, don't bother recursing into it. The trailing slash is first removed from this string, leaving vendor
as the item to match. That has neither a leading slash nor an embedded one, so it's matched at any sub-level of this level of the walk. But the pattern did originally have a trailing slash, so it's matched only if what's actually in the tree is a directory.
Of course, we can also just say vendor
, or v*r
, or any other thing that matches vendor
, if we're willing to ignore files as well. Or we can write v*r/
if we want to ignore all directories whose base-name—the part without the full path—matches v*r
.
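A toy Python model of the trailing-slash handling might look like the following. The matches function is a hypothetical helper of mine that considers only a top-level ignore file; the point is the order of operations: strip the trailing slash first, remember that the entry must be a directory, and only then test for a leading or embedded slash.

```python
from fnmatch import fnmatchcase

def matches(pattern, path, is_dir):
    """Toy model of one top-level ignore pattern against one walked path."""
    if pattern.endswith("/"):
        if not is_dir:
            return False          # trailing slash: directories only
        pattern = pattern[:-1]    # strip it before looking for other slashes
    if "/" in pattern:
        # Leading or embedded slash: anchored, component-wise comparison.
        p_parts = pattern.lstrip("/").split("/")
        f_parts = path.split("/")
        return len(p_parts) == len(f_parts) and all(
            fnmatchcase(f, p) for f, p in zip(f_parts, p_parts))
    # No slash left: match the base name at any depth.
    return fnmatchcase(path.rsplit("/", 1)[-1], pattern)

print(matches("vendor/", "dir/sub/vendor", is_dir=True))   # True: a directory, any depth
print(matches("vendor/", "tools/vendor", is_dir=False))    # False: a file named vendor
print(matches("vendor", "tools/vendor", is_dir=False))     # True: no trailing slash
print(matches("dir/sub/vendor", "a/dir/sub/vendor", is_dir=True))  # False: anchored
```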
Un-ignoring a previous ignore rule, and the problem with ignored directories
Any entry in .gitignore
that starts with !
overrides a previous ignore rule that matched the same path. Note, however, that for this to occur, Git needs to have opened and read the containing directory during its tree-walk. If an earlier ignore rule allows Git to ignore a directory, Git does that during the tree-walk phase, before it ever sees the names inside.
That is, if there's any rule that matches vendor
at any point, and that rule says do ignore this, and vendor
is a directory, Git won't open vendor
and read its contents. It won't see vendor/file1
, vendor/file2
, and so on. Those names will never be brought under the should we ignore this name microscope, neither in their base-name file1
format, nor in their dir/sub/vendor/file1
full-path format.
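The pruning effect is easy to see in a toy walker. This is a sketch under simplifying assumptions: the tree is a nested dictionary rather than a real file system, only a single top-level rule list is modeled, and anchored patterns are compared with plain fnmatch, whose * can cross slashes (Git's cannot). The verdict and untracked names are mine.

```python
from fnmatch import fnmatchcase

def verdict(rules, path, is_dir):
    """Return True if `path` is ignored; the last matching rule wins."""
    ignored = False
    for rule in rules:
        negate = rule.startswith("!")
        pattern = rule[1:] if negate else rule
        if pattern.endswith("/"):
            if not is_dir:
                continue                  # directory-only rule, this is a file
            pattern = pattern[:-1]
        if "/" in pattern:
            hit = fnmatchcase(path, pattern.lstrip("/"))   # simplified anchoring
        else:
            hit = fnmatchcase(path.rsplit("/", 1)[-1], pattern)
        if hit:
            ignored = not negate
    return ignored

def untracked(tree, rules, prefix=""):
    """Walk a nested dict (dict = directory, None = file), pruning as we go."""
    out = []
    for name, child in sorted(tree.items()):
        path = prefix + name
        is_dir = isinstance(child, dict)
        if verdict(rules, path, is_dir):
            continue                      # an ignored directory is never opened
        if is_dir:
            out += untracked(child, rules, path + "/")
        else:
            out.append(path)
    return out

tree = {"vendor": {"keep.txt": None, "junk.bin": None}, "main.c": None}

print(untracked(tree, ["vendor/", "!vendor/keep.txt"]))   # ['main.c']
print(untracked(tree, ["vendor/*", "!vendor/keep.txt"]))  # ['main.c', 'vendor/keep.txt']
```

In the first rule set the un-ignore rule would have matched vendor/keep.txt, but the walker never gets far enough to ask. In the second, vendor itself is not ignored, only its contents are, so the directory is opened and the un-ignore rule gets its chance.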
Conclusions: what you should know about .gitignore
A leading slash has an anchoring effect. The anchor is at the same level as the .gitignore
file. (If the ignore file is outside the work-tree—e.g., is in $HOME/.gitignore
or .git/info/exclude
—the anchoring level is the top level of the work-tree.)
Embedded slashes—but not a trailing slash—turn on the anchoring effect too, despite the documentation's vague implied hint otherwise. This might be a bug, but Git has behaved this way consistently through many releases (so maybe it's a documentation bug).
Double-star glob matching (**/whatever
) contains an embedded slash, almost by definition. The only two double-star globs that do not have an embedded slash are **/
and **
, neither of which is likely to be used in practice. Embedded slashes anchor names, but the double-star allows zero or more directory levels here, so that the anchoring has no inhibitory effect. The leading double-star is required if you want this kind of free-floating match behavior on a name that, without the leading **/
, would also contain an embedded slash.
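Here is a small Python sketch of why the leading double-star matters for a pattern such as the config/development/* one from the question. It models only anchored patterns from a top-level ignore file, treats a ** component as "zero or more directory levels", and does not try to reproduce Git's special handling of a trailing /**. The function names and the db.yml file name are made up for the illustration.

```python
from fnmatch import fnmatchcase

def match_parts(pat, parts):
    """Match pattern components against path components, one level at a time."""
    if not pat:
        return not parts
    if pat[0] == "**":
        # Let ** swallow zero, one, two, ... leading components.
        return any(match_parts(pat[1:], parts[i:])
                   for i in range(len(parts) + 1))
    return (bool(parts)
            and fnmatchcase(parts[0], pat[0])
            and match_parts(pat[1:], parts[1:]))

def anchored(pattern, path):
    """Toy model of a slash-containing pattern in the top-level .gitignore."""
    return match_parts(pattern.lstrip("/").split("/"), path.split("/"))

# Embedded slashes anchor the pattern to the top level...
print(anchored("config/development/*", "config/development/db.yml"))         # True
print(anchored("config/development/*", "app/config/development/db.yml"))     # False
# ...and a leading **/ lets it float again, including at depth zero.
print(anchored("**/config/development/*", "app/config/development/db.yml"))  # True
print(anchored("**/config/development/*", "config/development/db.yml"))      # True
```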
Un-ignore rules require that Git open and read a directory. If you want to un-ignore some file deep in the directory tree, you need either that none of its containing directories is ignored, or that something else forces Git to scan the deep subdirectory. That is, if you have a file named long/path/to/important/file
and you want that file to be stored in each commit, you'll need that name to get into Git's index, so that Git will store it in the next commit.
Files that exist in the index are, by definition, not ignored. Ignore entries apply only to files that aren't in the index, but are in the work-tree.
The index (always) exists, and it holds file names that—because the OS insists—actually appear inside directories. So if the index has a long/path/to/important/file
, Git will check to see if long/path/to/important/file
is still there and has or has not been modified. But if you've ignored long
, or long/path/to/important
, or something along the way here, Git won't read the directory.2 If you somehow accidentally remove long/path/to/important/file
from the index while ignoring the directory long/path/to/important
, Git won't add the file back again by itself, nor will it warn you that the work-tree file has become an untracked file.
2You can add a file that would otherwise be ignored using git add -f
, and you can have a set of files in directories that aren't ignored, add some of those files to the index, then modify .gitignore
to ignore their containing directories. All of these result in files in the index that would not have gotten there by a more direct, or less forceful (add -f
), method. These are the cases I consider concerning: they are not wrong but they fall afoul of this last bullet-point.