I discovered this quite by accident while looking for a file with a number in the name. When I type:
dir *number*
(where number represents any number from 0 to 9 and with no spaces between the asterisks and the number)
at the cmd.exe command prompt, it returns various files that do not appear to fit the search criteria. What's weird is that, depending on the directory, some numbers will work and others will not. For example, in a directory associated with a website, I type the following:
dir *4*
and what is returned is:
 Directory of C:\Ampps\www\includes\pages

04/30/2012  03:55 PM               153 inventory_list_retrieve.php
06/18/2012  11:17 AM             6,756 ix.html
06/19/2012  01:47 PM           257,501 jquery.1.7.1.js
               3 File(s)        264,410 bytes
               0 Dir(s)  362,280,906,752 bytes free
That just doesn't make any sense to me. Any clue?
The question is posed on Stack Overflow because the DIR command is often combined with FOR in batch programs. This strange DIR behavior would seem to make batch programs that use the DIR command potentially unreliable.
Edit (additional note): Though much time has passed, I discovered another quirk with this that almost cost me a lot of work. I wanted to delete all .htm files in a particular directory tree. I realized just before doing it that *.htm matches .html files as well. Also, *.man matches .manifest, and there are probably others. Deleting all .html files in that particular directory would have been upsetting, to say the least.
It seems that the dir command also searches the short (8.3-style) file names under the hood.
When I call
dir *1*
the listing includes a gDEBugger-5_8.msi file, which apparently does not have any 1 character in it.
Everything becomes clear when I use the /X switch with the dir command, which makes dir display the 8.3 file names alongside the long ones, as in:
dir /X *1*
According to dir's help, /X displays the short names generated for non-8dot3 file names.
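Yep. You'll see that it also searches through short names if you add the /x switch to dir (the /x switch is for short names).
To filter file names, pipe the output of dir through FIND or FINDSTR rather than relying on the wildcard alone.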
A quote from RBerteig's answer:
The above is true even for the FOR command, which is very nasty.
That is, a FOR loop over a wildcard pattern will also match against the short names. The solution, again, is to use FIND or FINDSTR to filter the names more reliably.
Note - change %A to %%A if using the command within a batch file.
Combining FOR with FINDSTR can be a general purpose method to safely use any command that runs into problems with short file names. Simply replace ECHO with the problem command such as COPY or DEL.
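The same filter-then-act idea can be sketched outside cmd.exe. The Python sketch below is only an illustration, not part of the original answer: the helper names select_by_long_name and demo are made up, and the regular expression \.htm$ stands in for whatever filter you would hand to FINDSTR. Because the directory is listed first and the filter is applied to the long names afterwards, short names never take part in the match.

```python
import re
import tempfile
from pathlib import Path


def select_by_long_name(directory, regex):
    """Return the sorted names of files whose long name matches regex."""
    rx = re.compile(regex, re.IGNORECASE)
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and rx.search(p.name)
    )


def demo():
    # Build a throwaway directory with names like those in the question.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a.htm", "b.html", "c.manifest"):
            (Path(tmp) / name).write_text("")
        # Only names that really end in .htm are selected.
        return select_by_long_name(tmp, r"\.htm$")


if __name__ == "__main__":
    print(demo())
```

The selected names can then be passed to whatever action is needed (copy, delete, and so on), just as ECHO is swapped for the real command in the FOR loop.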
Wild cards at the command prompt are matched against both the long file name and the short "8.3" name if one is present. This can produce surprises.
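A rough model of that rule, as a Python sketch. This is an approximation for illustration only: fnmatch is not the Windows matcher, the function names are made up, and the short names in FILES are assumed values of the kind Windows typically generates, not taken from a real listing.

```python
from fnmatch import fnmatchcase


def matches_like_cmd(pattern, long_name, short_name=None):
    """True if pattern matches the long name or, when present, the short name."""
    pat = pattern.upper()
    candidates = [long_name]
    if short_name:
        candidates.append(short_name)
    # Compare case-insensitively by upper-casing both sides.
    return any(fnmatchcase(name.upper(), pat) for name in candidates)


# (long name, assumed short name); None means no short name is present.
FILES = [
    ("gDEBugger-5_8.msi", "GDEBUG~1.MSI"),
    ("index.html", "INDEX~1.HTM"),
    ("notes.txt", None),
]


def listing(pattern):
    """Long names of the entries in FILES that the pattern would select."""
    return [long for long, short in FILES if matches_like_cmd(pattern, long, short)]


print(listing("*1*"))    # both files whose assumed short name contains ~1
print(listing("*.htm"))  # index.html, via its assumed .HTM short extension
```

Under these assumptions, neither long name contains a 1 or ends in .htm, yet both patterns select files, which is the same kind of surprise described in the question.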
To see the short names, use the /X option to the DIR command.
Note that this behavior is not in any way specific to the DIR command, and can lead to other (often unpleasant) surprises when a wild card matches more than expected on any command, such as DEL.
Unlike in *nix shells, replacement of a file pattern with the list of matching names is implemented within each command and not implemented by the shell itself. This can mean that different commands could implement different wild card pattern rules, but in practice this is quite rare as Windows provides API calls to search a directory for files that match a pattern and most programs use those calls in the obvious way. For programs written in C or C++ using the "usual" tools, that expansion is provided "for free" by the C runtime library, using the Windows API.
The Windows API in question is FindFirstFile() and its close relatives FindFirstFileEx(), FindNextFile(), and FindClose().
Oddly, although the documentation for FindFirstFile() describes its lpFileName parameter as "directory or path, and the file name, which can include wildcard characters, for example, an asterisk (*) or a question mark (?)" it never actually defines what the * and ? characters mean.
The exact meaning of the file pattern has history in the CP/M operating system dating from the early 1970s that strongly influenced (some might say "was directly copied" in place of "influenced" here) the design of MSDOS. This has resulted in a number of "interesting" artifacts and behaviors. Some of this at the DOS end of the spectrum is described at this blog post from 2007 where Raymond Chen describes exactly how file patterns were implemented in DOS.