Issue using an array containing criteria with wild

Posted 2020-02-01 17:39

Question:

I've been trying to run a PowerShell script to split a file in two.

I've got a couple of regular arrays that work just fine. The third array contains wildcards for each of the criteria, and that doesn't work at all.

I've tried -in/-notin, -like/-notlike, -contains/-notcontains and -match/-notmatch, but I'm not getting the results I want.

$NonAutoStructure = @("Not_Found", "UK Training Centre", "IRISH Training Centre", "Head Office", "UK Newmedica")
$AutoJournalDescriptions = @("STORE TRANFrom *",  "*SALES BANKED*")#, "*/* CREDIT" , "BANKING DIFF*BQ*" , "*/* MASTERCARD/VISA")  
$InactiveStores = @("4410", "0996", "1015", "5996")


$NonAutoJournalCompanies = {$_.Description -notcontains $AutoJournalDescriptions} 
$AutoJournalCompanies = {$_.Description -contains $AutoJournalDescriptions}
#$NonAutoJournalCompanies = {$_.structure -in $NonAutoStructure -or $_.Company -in $InactiveStores -and  $_.Amount -ne "0.00"}
#$AutoJournalCompanies = {$_.structure -notin $NonAutoStructure-and $_.Company -notin $InactiveStores -and  $_.Amount -ne "0.00"}

$UNREC_S0 | Where-Object $NonAutoJournalCompanies | Export-Csv \\774512-LRBSPT01\*****$\uardata\rt1\BankRec\Test\step1\TestNonAutoJournal.txt -notype
$UNREC_S0 | Where-Object $AutoJournalCompanies | Export-Csv \\774512-LRBSPT01\*****$\uardata\rt1\BankRec\Test\step1\TestAutoJournal.txt -notype
$UNREC_S0 | Where-Object $ZeroValuelines | Export-Csv \\774512-LRBSPT01\*****$\uardata\rt1\BankRec\Test\step1\TestZeroLines.txt -notype

The array I have issues with is $AutoJournalDescriptions. I can only get it working if the array contains a single criterion. Otherwise, it seems to ignore them all. Here it only contains a couple, but the criteria after the # should be included too. I'm trying to include and exclude these criteria as part of the (Non)AutoJournalCompanies files so that all data is preserved, but separated, and can then be directed towards different process streams.

Perhaps I'm simply trying to use a function that isn't meant to work that way...? I've been searching for a solution all day to no avail. I could type all those criteria individually in the file production criteria, but that makes it heavy to read and cumbersome to maintain. I would prefer to enrich/modify the array when changes are required.

I hope that all makes sense. I'm pretty new to PowerShell.

Many thanks,

Antoine

Answer 1:

  • In order to match against wildcard patterns (such as *SALES BANKED*), you need the -like operator; by contrast, -contains performs equality comparisons (implicit -eq against each array element).

  • While these operators (along with others, such as -eq and -match) support an array of input values[1], the comparison operand (typically, the RHS) must be a scalar (single value) - you cannot compare the input array against multiple values at the same time.
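
To make the two points above concrete, here is a minimal sketch (the variable names are illustrative, not taken from the question) showing that -contains tests for equality only, that -like needs a single pattern on the right-hand side, and that an array on the left-hand side is filtered:

```powershell
$patterns = 'STORE TRANFrom *', '*SALES BANKED*'
$text     = 'A SALES BANKED baz'

# -contains asks "is this exact value an element of the collection?";
# the wildcard characters in $patterns are NOT interpreted.
$containsText  = $patterns -contains $text              # $false
$containsExact = $patterns -contains '*SALES BANKED*'   # $true (plain equality)

# -like interprets ONE wildcard pattern (a scalar) on the right-hand side.
$likeResult = $text -like '*SALES BANKED*'              # $true

# With an array on the left-hand side, -like acts as a filter
# and returns the matching elements.
$filtered = 'A SALES BANKED baz', 'bar', 'STORE TRANFrom foo' -like 'STORE TRANFrom *'
# -> 'STORE TRANFrom foo'
```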


In your scenario, your best bet is to use regexes (regular expressions) rather than wildcard expressions, and to combine them into a single regex with the alternation operator (|), so you can use a single -match operation to test for multiple patterns:

# Sample input
$UNREC_S0  = [pscustomobject] @{ Description = 'A SALES BANKED baz' }, 
             [pscustomobject] @{ Description = 'bar' }, 
             [pscustomobject] @{ Description = 'STORE TRANFrom foo' }, 
             [pscustomobject] @{ Description = 'unrelated' }

# The filtering criteria: *regexes* to match against the descriptions,
# combined into a single regex with the alternation operator, '|'
$AutoJournalDescriptions = '^STORE TRANFrom ', 'SALES BANKED' -join '|'

# Construct script blocks to use with `Where-Object` below.
$NonAutoJournalCompanies = { $_.Description -notmatch $AutoJournalDescriptions } 
$AutoJournalCompanies =    { $_.Description -match $AutoJournalDescriptions}

$UNREC_S0 | Where-Object $NonAutoJournalCompanies | Export-Csv \\774512-LRBSPT01\*****$\uardata\rt1\BankRec\Test\step1\TestNonAutoJournal.txt -notype
# ...

The above yields the following CSV data, showing that only the descriptions not matching the regexes were exported:

"Description"
"bar"
"unrelated"

Note how regex ^STORE TRANFrom corresponds to wildcard expression STORE TRANFrom *, and SALES BANKED to *SALES BANKED*.

The wildcard * operator - which normally corresponds to .* in a regex - isn't needed in the regexes here, because the -match operator implicitly performs substring matching (whereas wildcard matching with -like matches against the whole input string).
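
If you would rather keep maintaining the criteria as wildcard patterns, the translation described above can be automated. The following is only a sketch: ConvertTo-CombinedRegex is a hypothetical helper name, not a built-in command, and it assumes the patterns use only * and ? (wildcard character sets such as [a-z] are not handled):

```powershell
# Hypothetical helper: translate simple wildcard patterns (only * and ?)
# into a single regex, anchored so that each alternative must match the
# whole input, as -like would.
function ConvertTo-CombinedRegex {
  param([string[]] $Wildcard)
  $regexes = foreach ($w in $Wildcard) {
    # Escape everything for literal use, then turn the escaped wildcard
    # characters back into their regex equivalents.
    '^' + ([regex]::Escape($w) -replace '\\\*', '.*' -replace '\\\?', '.') + '$'
  }
  $regexes -join '|'
}

$combined = ConvertTo-CombinedRegex 'STORE TRANFrom *', '*SALES BANKED*'

$descriptions = 'A SALES BANKED baz', 'bar', 'STORE TRANFrom foo', 'unrelated'
$auto    = $descriptions -match    $combined   # descriptions matching any pattern
$nonAuto = $descriptions -notmatch $combined   # everything else
```

The anchors make the resulting regex longer than the hand-written one, but they preserve the whole-string semantics of the original wildcard patterns.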


Optional reading: Filtering an array of string values by an array of substrings or patterns:

If you formulate your criteria as regexes (regular expressions), you can use the Select-String cmdlet, which does support multiple comparison operands:

# Sample input
$descriptions = 'A SALES BANKED baz', 'bar', 'STORE TRANFrom foo', 'unrelated'

# The filtering criteria: *regexes* to match against the descriptions.
$descriptionRegexes = '^STORE TRANFrom ', 'SALES BANKED'

($descriptions | Select-String -Pattern $descriptionRegexes).Line

Note: You can also use this technique for finding literal substrings, by adding the -SimpleMatch switch (the substrings are still passed to -Pattern), but note that substrings are then matched anywhere in each input string, without being able to restrict matching to, say, the start of the string.

The above outputs the following (a 2-element array):

A SALES BANKED baz
STORE TRANFrom foo

You can use a similar approach by combining the individual regexes into a single one with the alternation (|) operator, which enables use of the -match operator:

# Sample input
$descriptions = 'A SALES BANKED baz', 'bar', 'STORE TRANFrom foo', 'unrelated'

# The filtering criteria: *regexes* to match against the descriptions,
# combined into a single regex with the alternation operator, '|'
$descriptionRegex = '^STORE TRANFrom ', 'SALES BANKED' -join '|'
# -> '^STORE TRANFrom |SALES BANKED'

$descriptions -match $descriptionRegex

You can also adapt this approach to literal substring matching, namely by escaping the substrings for literal use inside a regex with [regex]::Escape(); e.g.,
$descriptionRegex = ('yes?', '2.0').ForEach({ [regex]::Escape($_) }) -join '|'
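
A short sketch of what the escaping buys you, using made-up sample strings: without [regex]::Escape(), the . in 2.0 would match any character, so 'version 210' would be a false positive.

```powershell
$literals = 'yes?', '2.0'

# Escape each literal so that ? and . lose their special regex meaning.
$descriptionRegex = $literals.ForEach({ [regex]::Escape($_) }) -join '|'
# -> 'yes\?|2\.0'

# Hypothetical sample input; only the strings containing the literal
# substrings 'yes?' or '2.0' are returned.
$matched = 'say yes?', 'version 2.0', 'version 210', 'yes' -match $descriptionRegex
# -> 'say yes?', 'version 2.0'
```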


Otherwise, if you do need wildcard support, you'll have to - inefficiently - nest loops (see shortcut below, if you can make specific assumptions):

# Sample input
$descriptions = 'A SALES BANKED baz', 'bar', 'STORE TRANFrom foo', 'unrelated' 

# The filtering criteria: wildcard patterns to match against the descriptions.
$descriptionWildcards = 'STORE TRANFrom *', '*SALES BANKED*'

foreach ($descr in $descriptions) {
  foreach ($wildcard in $descriptionWildcards) {
    if ($descr -like $wildcard) { $descr; break }
  }
}

Note that I've used foreach statements rather than the pipeline with a ForEach-Object cmdlet call. The former is faster; the latter can keep memory consumption constant if the input is being streamed. With arrays already fully in memory, the foreach statement is the better choice.
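
Applied to the objects from the question, the inner loop can be wrapped in a small function and called from Where-Object. This is a sketch only: Test-AnyWildcard is a hypothetical helper name, and the sample objects stand in for the real $UNREC_S0 data.

```powershell
# Hypothetical helper: returns $true if $Text matches at least one of the
# wildcard patterns, stopping at the first match.
function Test-AnyWildcard {
  param([string] $Text, [string[]] $Pattern)
  foreach ($p in $Pattern) {
    if ($Text -like $p) { return $true }
  }
  return $false
}

# Sample input (stand-in for the real data)
$UNREC_S0 = [pscustomobject] @{ Description = 'A SALES BANKED baz' },
            [pscustomobject] @{ Description = 'bar' },
            [pscustomobject] @{ Description = 'STORE TRANFrom foo' },
            [pscustomobject] @{ Description = 'unrelated' }

$AutoJournalDescriptions = 'STORE TRANFrom *', '*SALES BANKED*'

# Split the input into two sets; the input order is preserved in both.
$auto    = $UNREC_S0 | Where-Object { Test-AnyWildcard $_.Description $AutoJournalDescriptions }
$nonAuto = $UNREC_S0 | Where-Object { -not (Test-AnyWildcard $_.Description $AutoJournalDescriptions) }
```

Either result can then be piped to Export-Csv as in the question, and adding a criterion only requires adding an element to $AutoJournalDescriptions.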


You can take a shortcut, IF you can make two assumptions:

  • No input string matches more than one wildcard pattern (otherwise it is output once for each pattern it matches).

  • The input order needn't be preserved; that is, it is acceptable that the output order of descriptions reflects the order of the entries in the wildcard-pattern array, not the order of the input descriptions.

# Sample input
$descriptions = 'A SALES BANKED baz', 'bar', 'STORE TRANFrom foo', 'unrelated' 

# The filtering criteria: wildcard patterns to match against the descriptions.
$descriptionWildcards = 'STORE TRANFrom *', '*SALES BANKED*'

# Loop over the criteria and match the descriptions against each.
# `foreach` is the built-in alias for the `ForEach-Object` cmdlet.
# The output order will reflect the order of the wildcard patterns.
$descriptionWildcards | foreach { $descriptions -like $_ }

In this case, while the resulting elements are the same, their ordering differs:

STORE TRANFrom foo
A SALES BANKED baz
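
If a description can match more than one pattern, the shortcut emits it once per matching pattern. One possible workaround, sketched here with a made-up description that matches both patterns, is to append Select-Object -Unique; note that this also collapses descriptions that are genuinely duplicated in the input, so it is only suitable when that is acceptable.

```powershell
# Sample input; the first description matches BOTH wildcard patterns.
$descriptions = 'STORE TRANFrom SALES BANKED', 'bar', 'A SALES BANKED baz'

$descriptionWildcards = 'STORE TRANFrom *', '*SALES BANKED*'

# Without de-duplication the first description is emitted twice.
$withDuplicates = $descriptionWildcards | ForEach-Object { $descriptions -like $_ }

# Select-Object -Unique keeps the first occurrence of each distinct string.
$deduplicated = $descriptionWildcards |
  ForEach-Object { $descriptions -like $_ } |
  Select-Object -Unique
```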

[1] With an array of values as input, these operators act like filters: that is, they return the sub-array of matching values; e.g., 1, 2, 3 -eq 2 returns 2 as a single-element array.