At which point does `forfiles` enumerate a directo

Posted 2019-07-14 05:44

Question:

The command forfiles is intended to enumerate a directory and apply one or more commands to each item. With the /S option, the same can be accomplished for a full directory tree.

What happens when the content of the enumerated directory (tree) is changed by the command(s) in the body of the forfiles command?

Suppose we have the directory D:\data with the following content:

file1.txt
file2.txt
file3.txt

The output of forfiles /P "D:\data" /M "*.txt" /C "cmd /C echo @file" will obviously reflect the above list.

However, what is the output of forfiles when a command in the body modifies the content of the directory? For instance, what if one of the files in the list, say file3.txt, is deleted before it is actually iterated? Or what if a new file, like file4.txt, is created before the loop completes?

How does forfiles /S behave in such a situation? Suppose there are several sub-directories sub1, sub2, sub3, each containing the above list of files. forfiles /S is currently iterating through sub2; sub1 has already been processed, but sub3 has not. If the contents of sub1 and sub3 are changed at that point, what will be enumerated? I guess the change to sub1 won't be recognised, but what about sub3?

I am mainly interested in the behaviour of forfiles since Windows Vista.

Note:
I already posted a very similar question about the for command. However, since forfiles is not a built-in command and has got a completely different syntax I decided to post a separate question instead of extending the scope of the other one.

Answer 1:

forfiles will fail to continue enumeration of a renamed folder with "ERROR: The system cannot find the file specified." once you try to use an @-variable which resolves to a nonexistent file. There will be no error for a deleted file, and it will see a newly added file if its name follows the currently processed one in the enumeration order in use (I've tested it with the default alphabetic sort in ascending order). So evidently it doesn't build the entire list of files prior to execution of the command, but enumerates them one by one after the custom command completes.

Depending on what exactly you need to do with forfiles, the reliable solutions would be parsing the output of dir /s /b or of robocopy in list-only mode. That way you can ensure the list is generated prior to any change.

  • for /f "delims=" %%a in ('dir "d:\data\*.txt" /s /b') do .......
    Suitable for simple enumerations

  • for /f "tokens=*" %%a in ('robocopy /L /njh /njs /ndl ........') do ...
    Suitable for more complicated scenarios like limiting the date span, may require using additional parsing and /v in non-straightforward cases.
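The same snapshot-first idea can be sketched outside of batch. The following is a minimal Python illustration, not part of the original answer: the helper names snapshot_files and demo are made up for this example, and it works in a temporary directory rather than D:\data. It builds the complete file list before the loop body runs, so changes made inside the loop cannot alter what gets iterated.

```python
import os
import tempfile


def snapshot_files(root, suffix=".txt"):
    """Return a sorted list of matching paths, fully built before any processing."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def demo():
    """Mutate the directory while iterating over a snapshot of it."""
    with tempfile.TemporaryDirectory() as root:
        for n in (1, 2, 3):
            open(os.path.join(root, f"file{n}.txt"), "w").close()

        visited = []
        for path in snapshot_files(root):  # the list is complete at this point
            name = os.path.basename(path)
            if name == "file2.txt":
                # Same modification as in the question: drop file3, add file4.
                os.remove(os.path.join(root, "file3.txt"))
                open(os.path.join(root, "file4.txt"), "w").close()
            # A snapshot entry may no longer exist, so record that as well.
            visited.append((name, os.path.exists(path)))
        return visited


if __name__ == "__main__":
    for name, still_exists in demo():
        print(name, still_exists)
```

Because the snapshot can contain stale entries, the loop body should check that each path still exists before acting on it, just as the batch variants would need an "if exist" test.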



Answer 2:

I did some tests with forfiles -- here are the results...

Purpose & Scope

The test cases herein are intended to determine whether or not forfiles completes enumeration of a given directory (tree) before it starts iterating over the (sub-)items.

The list below shows what modes are covered by the tests herein:

  • file pattern (/M) is always *.txt;
  • file pattern (/M) matches files only, but not directories;
  • there is always a root search path given (/P);
  • in the body (/C) only internal cmd.exe commands (prefixed with cmd /C) are used;
  • non-recursive operation iterating over a few and also a hundred files;
  • recursive operation (/S) iterating over only a few directories;
  • recursive operation (/S) iterating over directory hierarchy with a depth of one level;
  • file age filter (/D) is not used at all;
  • the directory (tree) content is modified only once, during a certain iteration;
  • files (contents) are not modified, so size and date/time changes are not detected;

Test Setup

All tests are performed on an NTFS-formatted disk. (This might be the reason that all files are enumerated by forfiles in alphabetic order.)
The operating system is Microsoft Windows 7 64-bit (version 6.1.7601).

Pre-Requisites

The directory trees required by the individual test steps must be set up in advance, before the respective command lines are executed.
There must not be any other files or directories present in the used root directory D:\Data.

forfiles /S, recursive

For the test case herein, the following directory tree has to be established:

D:\Data\
+---sub1\
|       file1.txt
|       file2.txt
|       file3.txt
+---sub2\
|       file1.txt
|       file2.txt
|       file3.txt
+---sub3\
|       file1.txt
|       file2.txt
|       file3.txt
+---sub4\
|       file1.txt
|       file2.txt
|       file3.txt
+---sub5\
        file1.txt
        file2.txt
        file3.txt

I used the following lines of code to set it up:

@(pushd D:\Data
md sub1 & pushd sub1 & rem.> file1.txt & rem.> file2.txt & rem.> file3.txt & del file4.txt & popd
md sub2 & pushd sub2 & rem.> file1.txt & rem.> file2.txt & rem.> file3.txt & del file4.txt & popd
md sub3 & pushd sub3 & rem.> file1.txt & rem.> file2.txt & rem.> file3.txt & del file4.txt & popd
md sub4 & pushd sub4 & rem.> file1.txt & rem.> file2.txt & rem.> file3.txt & del file4.txt & popd
md sub5 & pushd sub5 & rem.> file1.txt & rem.> file2.txt & rem.> file3.txt & del file4.txt & popd
rd /S /Q sub6
popd) > nul 2>&1

My intention is to wait until item file2.txt in folder sub3 is iterated, and then to perform the following tasks:

  • in sub3, which is currently iterated over,
    • delete file1.txt (already iterated);
    • delete file3.txt (not yet iterated);
    • create file4.txt (new item, so not yet iterated);
  • delete container sub1 (already iterated);
  • delete container sub4 (not yet iterated);
  • change content of sub2 (already iterated) by renaming file2.txt to file4.txt;
  • change content of sub5 (not yet iterated) by renaming file2.txt to file4.txt;
  • create container sub6 (new item, so not yet iterated), create file4.txt inside;

For all iterated items, the full path is echoed to the command prompt.


If the enumeration is accomplished prior to iterating over all the items, the original directory tree is expected to be output, so none of the modifications should be displayed.

Now let's see what happens; this is the command line to execute:

forfiles /S /P "D:\Data" /M "*.txt" /C "cmd /C (if @relpath==\".\sub3\file2.txt\" (del file1.txt & del file3.txt & rem.> file4.txt & rd /S /Q ..\sub1 & rd /S /Q ..\sub4 & ren ..\sub2\file2.txt file4.txt & ren ..\sub5\file2.txt file4.txt & md ..\sub6 & rem.> ..\sub6\file4.txt)) & echo @path"

The output is the following:

"D:\Data\sub1\file1.txt"
"D:\Data\sub1\file2.txt"
"D:\Data\sub1\file3.txt"
"D:\Data\sub2\file1.txt"
"D:\Data\sub2\file2.txt"
"D:\Data\sub2\file3.txt"
"D:\Data\sub3\file1.txt"
"D:\Data\sub3\file2.txt"
"D:\Data\sub3\file3.txt"
ERROR: The system cannot find the file specified.
"D:\Data\sub5\file1.txt"
"D:\Data\sub5\file3.txt"
"D:\Data\sub5\file4.txt"

Clearly, this is not the original directory tree.
It seems that the directories in the tree are enumerated prior to iteration, but the content of each directory is enumerated only when the iteration arrives there. (This is true at least for the tiny tree at hand; however, it is possible that the directories of a huge tree with a high hierarchy depth might not be fully enumerated prior to iteration.)
The deletion of sub1 and the modification of the content of sub2 are not noticed. When sub4 is reached, an error is returned, because sub4 was deleted while sub3 was being iterated. The modification of the content of sub5 is detected. sub6, which was created while sub3 was being iterated, is not recognised at all.
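The interpretation above can be checked against a small simulation. The following Python sketch is only a hypothetical model of the observed behaviour, not a description of how forfiles is actually implemented: it assumes that the list of sub-directories is captured once up front and that the file list of each directory is read in full when the walk arrives there. The names walk_model and run_test are invented for this illustration.

```python
def walk_model(tree, on_item):
    """Walk an in-memory tree (dir name -> set of file names).

    Assumed model: the directory list is snapshotted before iteration,
    while each directory's content is read when the walk reaches it.
    """
    output = []
    for d in sorted(tree):  # snapshot of the directory names
        if d not in tree:  # the directory vanished in the meantime
            output.append("ERROR")
            continue
        for f in sorted(tree[d]):  # content as it is on arrival
            on_item(d, f)
            output.append(f"{d}\\{f}")
    return output


def run_test():
    """Replay the modifications of the recursive test case."""
    files = {"file1.txt", "file2.txt", "file3.txt"}
    tree = {f"sub{n}": set(files) for n in range(1, 6)}

    def on_item(d, f):
        if (d, f) == ("sub3", "file2.txt"):
            tree["sub3"] -= {"file1.txt", "file3.txt"}
            tree["sub3"].add("file4.txt")
            del tree["sub1"]
            del tree["sub4"]
            for s in ("sub2", "sub5"):
                tree[s].remove("file2.txt")  # models the rename
                tree[s].add("file4.txt")
            tree["sub6"] = {"file4.txt"}

    return walk_model(tree, on_item)


if __name__ == "__main__":
    print("\n".join(run_test()))
```

Under these assumptions the model yields the same sequence as the real output above: sub1 to sub3 unchanged, an error in place of sub4, the renamed file in sub5, and no trace of sub6.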


forfiles, non-recursive

For forfiles without the /S option, a flat directory tree is used:

D:\Data\
    file1.txt
    file2.txt
    file3.txt

This is created using the following code snippet:

@(pushd D:\Data
rem.> file1.txt & rem.> file2.txt & rem.> file3.txt & del file4.txt
popd) > nul 2>&1

For the test, the command line in the forfiles body checks whether the current file is file2.txt; if so, file1.txt and file3.txt are deleted, and new file4.txt is created. The current file is echoed to the command prompt.

The command line to execute is:

forfiles /P "D:\Data" /M "*.txt" /C "cmd /C (if @fname==\"file2\" (del file1.txt & del file3.txt & rem.> file4.txt)) & echo @file"

The output is:

"file1.txt"
"file2.txt"
"file3.txt"

This suggests that the entire directory content has been enumerated prior to iterating over the files.
However, to verify this assumption, let's perform some more intensive tests.


This time, we use a hundred files:

D:\Data\
    file0.txt
    file1.txt
    file2.txt
    ...
    file99.txt

Those are created with this code:

@(pushd D:\Data
del file100.txt & del file999.txt
for /L %%N in (0,1,99) do (echo.%%N> file%%N.txt)
popd) > nul 2>&1

In this experiment, we rename file99.txt to file999.txt as soon as file1.txt is iterated.

The command line to execute is:

forfiles /P "D:\Data" /M "*.txt" /C "cmd /C (if @fname==\"file1\" (ren file99.txt file999.txt)) & echo @file"

The output is:

"file0.txt"
"file1.txt"
"file10.txt"
"file11.txt"
...
"file98.txt"
"file999.txt"

We receive a list of 100 files which reflects the renaming, meaning that we do not get the original file list. Hence the enumeration is not completed before the iteration starts.


Here we use the above 100 files again.

In this experiment, we rename file99.txt to file100.txt as soon as file1.txt is iterated.

The command line to execute is:

forfiles /P "D:\Data" /M "*.txt" /C "cmd /C (if @fname==\"file1\" (ren file99.txt file100.txt)) & echo @file"

The output is:

"file0.txt"
"file1.txt"
"file10.txt"
"file11.txt"
...
"file98.txt"

So now we receive a list of only 99 files, containing neither file99.txt nor file100.txt. It seems that the last files are enumerated after the renaming, but file100.txt is not shown because it would violate the alphabetical order (it should appear right after file10.txt, but the files around that position seem to have already been enumerated).


Again we use the above 100 files.

In this experiment, we rename file0.txt to file999.txt as soon as file1.txt is iterated.

The command line to execute is:

forfiles /P "D:\Data" /M "*.txt" /C "cmd /C (if @fname==\"file1\" (ren file0.txt file999.txt)) & echo @file"

The output is:

"file0.txt"
"file1.txt"
"file10.txt"
"file11.txt"
...
"file98.txt"
"file99.txt"
"file999.txt"

So now we receive a list of as many as 101 files, containing both file0.txt and file999.txt. It seems that file0.txt had already been enumerated prior to its renaming, but the last files had not, so file999.txt also appears in the list.


Conclusion

Evidently, forfiles does not enumerate the entire directory (tree) prior to iterating over all (matching) items.
There seems to be a sort of buffer into which some items are enumerated, and as soon as the iteration needs more data, enumeration continues with the next portion, and so on, until the end is reached.
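This buffer hypothesis can be modelled in a few lines. The Python sketch below is an illustration under stated assumptions, not the actual forfiles implementation: it assumes that names are delivered in alphabetical order, that a fixed number of names is fetched at a time, and that each refill resumes after the last name already buffered. The chunk size of 16 is arbitrary (the real buffer size is unknown), and buffered_enumerate and experiment are hypothetical names.

```python
from bisect import bisect_right


def buffered_enumerate(directory, on_item, chunk=16):
    """Iterate a mutable set of names in sorted order, one chunk at a time.

    Assumed model: each refill reads the current directory state and
    continues after the last name that was already buffered.
    """
    seen = []
    last = ""  # resume point; every real name sorts after ""
    while True:
        names = sorted(directory)
        start = bisect_right(names, last)
        buffer = names[start:start + chunk]
        if not buffer:
            break
        last = buffer[-1]
        for name in buffer:
            on_item(name)  # may rename files in the directory
            seen.append(name)
    return seen


def experiment(old, new):
    """Rename old to new while file1.txt is being processed."""
    directory = {f"file{n}.txt" for n in range(100)}

    def on_item(name):
        if name == "file1.txt":
            directory.remove(old)
            directory.add(new)

    return buffered_enumerate(directory, on_item)


if __name__ == "__main__":
    print(len(experiment("file99.txt", "file999.txt")))
    print(len(experiment("file99.txt", "file100.txt")))
    print(len(experiment("file0.txt", "file999.txt")))
```

With these assumptions the model reproduces the item counts of the three hundred-file experiments (100, 99 and 101). It does not explain why the three-file test showed no change at all, other than that such a small directory would fit into a single chunk.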