How to copy certain files (w/o folder hierarchy),

Posted 2019-01-17 01:39

Question:

I need to copy all *.doc files (but not folders whose names match *.doc) from a network folder \\server\source, including files in all nested folders, to a local folder C:\destination without preserving the nested folder hierarchy: all files should go directly into C:\destination, and no nested folders should be created there. If several files with the same name exist in different subfolders of \\server\source, only the first one found should be copied, and it should never be overwritten afterwards. All conflicting files found later should be skipped. There could be many such cases, and the skipped files should not be transferred over the network, because that would take too much time. Here is my attempt to implement it in PowerShell:

cp \\server\source\* -Recurse -Include *.doc -Container:$false -Destination C:\destination

There are at least two problems with this command:

  • It copies folders whose names match *.doc too.
  • In case of conflicting names any file found later is transferred over the network and overwrites the previous one.

Can you suggest how to fix these problems?
Implementations using copy, xcopy, robocopy, cscript or *.bat, *.cmd are also welcome.
The local OS is Windows 8 and the file system is NTFS.

Answer 1:

I would produce the list of files first and validate as you go through the list.

Something like this:

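$srcDir  = '\\server\source'
$destDir = 'C:\destination'
# List matching files only; directories named *.doc are filtered out.
$files = Get-ChildItem $srcDir -Recurse -Filter *.doc | Where-Object { -not $_.PSIsContainer }
$files | ForEach-Object {
    $target = Join-Path $destDir $_.Name
    # Copy only if no file with this name has reached the destination yet.
    if (-not [System.IO.File]::Exists($target)) {
        Copy-Item $_.FullName $target
    }
}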

So, use Get-ChildItem to list the files in the source folder that match the filter, and pipe the result through Where-Object to strip out directories.

Then go through each file in a foreach loop and check whether the file name (Name, not FullName) already exists in the destination, using the Exists method of the System.IO.File .NET class.

If it doesn't, copy the file using only its original file name (dropping the original path).

Use the -WhatIf option on the copy when testing, so that it only displays what it would do, in case the result is not what you wanted :-)
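
As a rough sketch of the same idea with the dry-run switch built in, the loop could be wrapped in a function. The name Copy-FlatFirst and its parameters are hypothetical and do not come from the answer; the sketch assumes that declaring SupportsShouldProcess is enough for -WhatIf to reach the inner Copy-Item, which holds for a function defined in the same script or session.

```powershell
# Hypothetical helper (not from the answer): flattens matching files into one
# folder and keeps the first file seen for each name.
function Copy-FlatFirst {
    [CmdletBinding(SupportsShouldProcess = $true)]
    param(
        [Parameter(Mandatory = $true)][string]$Source,
        [Parameter(Mandatory = $true)][string]$Destination,
        [string]$Filter = '*.doc'
    )
    Get-ChildItem -Path $Source -Recurse -Filter $Filter |
        Where-Object { -not $_.PSIsContainer } |
        ForEach-Object {
            $target = Join-Path $Destination $_.Name
            # Skip names already present, so later duplicates are never read from the source.
            if (-not (Test-Path -LiteralPath $target -PathType Leaf)) {
                # -WhatIf or -Confirm given to the function is inherited by Copy-Item.
                Copy-Item -LiteralPath $_.FullName -Destination $target
            }
        }
}
```

Calling Copy-FlatFirst -Source '\\server\source' -Destination 'C:\destination' -WhatIf should then only print the planned copies. Note that a dry run lists every duplicate as well, because nothing actually lands in the destination for the existence check to find.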



Answer 2:

The previous answers seem rather overcomplicated to me, unless I'm misunderstanding something. This should work:

Get-ChildItem "\\server\source\" -Filter *.doc -Recurse | Where-Object { -not ($_.PSIsContainer -or (Test-Path (Join-Path "C:\Destination" $_.Name))) } | Copy-Item -Destination "C:\Destination"

None of the built-in commands - copy, xcopy, or robocopy - will do what you want on their own, but there's a utility called xxcopy that will, conveniently available at http://www.xxcopy.com. It has a number of built-in options specifically for flattening directory trees into a single directory. The following will do what you described:

xxcopy "\\server\source\*.doc" "C:\Destination" /SGFO

However, xxcopy also has other options for handling duplicate file names besides copying only the first one encountered, such as adding the source directory name to the file name, or adding sequential numerical identifiers to all but the first one, or to all but the newest or oldest. See this page for details: http://www.xxcopy.com/xxcopy16.htm



Answer 3:

# Get all *.doc files under \\server\source
Get-ChildItem -Path \\server\source *.doc -Recurse |
    # Filter out directories
    Where-Object { -not $_.PsIsContainer } | 
    # Add property for destination
    Add-Member ScriptProperty -Name Destination -Value { Join-Path 'C:\destination' $this.Name } -PassThru |
    # Filter out files that exist on the destination
    Where-Object { -not (Test-Path -Path $_.Destination -PathType Leaf) } |
    # Copy. 
    Copy-Item


Answer 4:

Why use foreach when you already have a pipeline? Calculated properties for the win!

Get-ChildItem -Recurse -Path '\\Server\Path' -Filter '*.doc' |
    Where-Object { -not $_.PSIsContainer } |
    Group-Object Name |
    Select-Object @{Name='Path'; Expression={$_.Group[0].FullName}}, @{Name='Destination'; Expression={'C:\Destination\{0}' -f $_.Name}} |
    Copy-Item
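
To see what the grouping step hands to Copy-Item, here is a small sketch that runs the same Group-Object and calculated-property stages on stand-in objects instead of real files. The subfolders x and y and the file names are made up for illustration; only the Name and FullName properties are assumed, since those are the ones the pipeline reads.

```powershell
# Stand-in objects for three found files, two of which share a name (hypothetical paths).
$found = @(
    [pscustomobject]@{ Name = 'a.doc'; FullName = '\\server\source\x\a.doc' }
    [pscustomobject]@{ Name = 'b.doc'; FullName = '\\server\source\x\b.doc' }
    [pscustomobject]@{ Name = 'a.doc'; FullName = '\\server\source\y\a.doc' }
)

# One group per distinct file name; each group becomes one Path/Destination pair.
$plan = $found |
    Group-Object Name |
    Select-Object @{Name='Path'; Expression={ $_.Group[0].FullName }},
                  @{Name='Destination'; Expression={ 'C:\Destination\{0}' -f $_.Name }}

# Show the copy plan without copying anything.
$plan | Format-Table -AutoSize
```

The plan has one row per distinct name, so Copy-Item receives each destination only once. Unlike the Test-Path variants, this does not check for files that are already in the destination from an earlier run.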


Answer 5:

$docFiles = Get-ChildItem -Path "\\server\source" -Recurse | Where-Object { -not $_.PSIsContainer -and ($_.Name -like "*.doc" -or $_.Name -like "*.doc?") } | Sort-Object -Property Name -Unique
$docFiles | ForEach-Object { Copy-Item -Path $_.FullName -Destination "C:\destination" }

The first line collects every *.doc and *.doc? file (so the newer Office .docx format is included as well), excluding directories; Sort-Object -Property Name -Unique then keeps a single file per name.
The second line copies each item from the source to the destination (the folder C:\destination must already exist).
In general, I suggest splitting a command over multiple lines, because that makes the code easier to write (in this case the first task is to get the files and the second is to copy them).