How to strip illegal characters before trying to s

2019-01-17 17:01发布

问题:

I was able to find how to use the GetInvalidFileNameChars() method in a PowerShell script. However, it seems to also filter out whitespace (which is what I DON'T want).

EDIT: Maybe I'm not asking this clearly enough. I want the below function to INCLUDE the spaces that already existing in filenames. Currently, the script filters out spaces.

Function Remove-InvalidFileNameChars {

param([Parameter(Mandatory=$true,
    Position=0,
    ValueFromPipeline=$true,
    ValueFromPipelineByPropertyName=$true)]
    [String]$Name
)

return [RegEx]::Replace($Name, "[{0}]" -f ([RegEx]::Escape([String][System.IO.Path]::GetInvalidFileNameChars())), '')}

回答1:

Casting the character array to System.String actually seems to join the array elements with spaces, meaning that

[string][System.IO.Path]::GetInvalidFileNameChars()

does the same as

[System.IO.Path]::GetInvalidFileNameChars() -join ' '

when you actually want

[System.IO.Path]::GetInvalidFileNameChars() -join ''

As @mjolinor mentioned (+1), this is caused by the output field separator ($OFS).

Evidence:

PS C:\> [RegEx]::Escape([string][IO.Path]::GetInvalidFileNameChars())
"\ \ \|\  \ ☺\ ☻\ ♥\ ♦\ ♣\ ♠\ \\ \t\ \n\ ♂\ \f\ \r\ ♫\ ☼\ ►\ ◄\ ↕\ ‼\ ¶\ §\ ▬\ ↨\ ↑\ ↓\ →\ ←\ ∟\ ↔\ ▲\ ▼\ :\ \*\ \?\ \\\ /
PS C:\> [RegEx]::Escape(([IO.Path]::GetInvalidFileNameChars() -join ' '))
"\ \ \|\  \ ☺\ ☻\ ♥\ ♦\ ♣\ ♠\ \\ \t\ \n\ ♂\ \f\ \r\ ♫\ ☼\ ►\ ◄\ ↕\ ‼\ ¶\ §\ ▬\ ↨\ ↑\ ↓\ →\ ←\ ∟\ ↔\ ▲\ ▼\ :\ \*\ \?\ \\\ /
PS C:\> [RegEx]::Escape(([IO.Path]::GetInvalidFileNameChars() -join ''))
"\| ☺☻♥♦\t\n♂\f\r♫☼►◄↕‼¶§▬↨↑↓→←∟↔▲▼:\*\?\\/
PS C:\> $OFS=''
PS C:\> [RegEx]::Escape([string][IO.Path]::GetInvalidFileNameChars())
"\| ☺☻♥♦\t\n♂\f\r♫☼►◄↕‼¶§▬↨↑↓→←∟↔▲▼:\*\?\\/

Change your function to something like this:

Function Remove-InvalidFileNameChars {
  param(
    [Parameter(Mandatory=$true,
      Position=0,
      ValueFromPipeline=$true,
      ValueFromPipelineByPropertyName=$true)]
    [String]$Name
  )

  $invalidChars = [IO.Path]::GetInvalidFileNameChars() -join ''
  $re = "[{0}]" -f [RegEx]::Escape($invalidChars)
  return ($Name -replace $re)
}

and it should do what you want.



回答2:

I suspect it has to do with non-display characters being coerced to [string] for the regex operation (and ending up expressed as spaces).

See if this doesn't work better:

([char[]]$name | where { [IO.Path]::GetinvalidFileNameChars() -notcontains $_ }) -join ''

That will do a straight char comparison, and seems to be more reliable (embedded spaces are not removed).

$name = 'abc*\ def.txt'
([char[]]$name | where { [IO.Path]::GetinvalidFileNameChars() -notcontains $_ }) -join ''

abc def.txt

Edit - I believe @Ansgar is correct about the space being caused by casting the character array to string. The space is being introduced by $OFS.



回答3:

I wanted spaces to replace all the illegal characters so space is replaced with space

$Filename = $ADUser.SamAccountName
[IO.Path]::GetinvalidFileNameChars() | ForEach-Object {$Filename = $Filename.Replace($_," ")}
$Filename = "folder\" + $Filename.trim() + ".txt"


回答4:

Please try this one-liner with the same underlying function.

to match

'?Some "" File Name <:.txt' -match ("[{0}]"-f (([System.IO.Path]::GetInvalidFileNameChars()|%{[regex]::Escape($_)}) -join '|'))

to replace

'?Some "" File Name <:.txt' -replace ("[{0}]"-f (([System.IO.Path]::GetInvalidFileNameChars()|%{[regex]::Escape($_)}) -join '|')),'_'



回答5:

My current favourite way to accomplish this is:

$Path.Split([IO.Path]::GetInvalidFileNameChars()) -join '_'

This replaces all invalid characters with _ and is very human readable.



回答6:

[System.IO.Path]::GetInvalidFileNameChars() returns an array of invalid chars. If it is returning the space character for you (which it does not do for me), you could always iterate over the array and remove it.

> $chars = @()
> foreach ($c in [System.IO.Path]::GetInvalidFileNameChars())
  {
     if ($c -ne ' ')
     {
        $chars += $c
     }
  }

Then you can use $chars as you would have used the output from GetInvalidFileNameChars().