Replace multiline text in a file using PowerShell

Posted 2020-03-26 06:09

I have the following PowerShell script:

$oldCode =  @"
            <div id="time_estimate">
                <!-- some large table -->
            </div>
"@

$newCode = @"
            <div id="time_estimate">
                                <!-- nested divs and spans -->
                                <div id="contact-form">

                                        <?php include "contact-form.php"; ?>
                                </div>
                        </div>
"@

ls *.html | foreach { 
        $fileContent = [System.Io.File]::ReadAllText($_.FullName)
        $newFileContent = $fileContent.Replace($oldCode, $newCode)
        [System.Io.File]::WriteAllText($_.FullName, $newFileContent)
        Write-Host  "`r`n"
        Write-Host  "Processed - $($_.Name)...`r`n" }

This doesn't seem to be replacing the text. Is it an issue with the multiline strings, or the limits of the Replace() method? I would prefer to do the replace without bringing in regex.

3 answers
放我归山
#2 · 2020-03-26 06:37

I wouldn't use string replacements for modifying HTML code. Too many things can go wrong in unexpected ways. Try something like this:

$newCode = @"
<!-- nested divs and spans -->
<div id="contact-form">
  <?php include "contact-form.php"; ?>
</div>
"@

Get-ChildItem '*.html' | % {
  $html = New-Object -COM HTMLFile
  $html.write([IO.File]::ReadAllText($_.FullName))
  $html.getElementById('time_estimate').innerHTML = $newCode
  [IO.File]::WriteAllText($_.FullName, $html.documentElement.outerHTML)
}

If needed, you can prettify the HTML with Tidy:

$newCode = @"
<!-- nested divs and spans -->
<div id="contact-form">
  <?php include "contact-form.php"; ?>
</div>
"@

[Reflection.Assembly]::LoadFile('C:\path\to\Tidy.dll') | Out-Null
$tidy = New-Object Tidy.DocumentClass

Get-ChildItem '*.html' | % {
  $html = New-Object -COM HTMLFile
  $html.write([IO.File]::ReadAllText($_.FullName))
  $html.getElementById('time_estimate').innerHTML = $newCode
  $tidy.ParseString($html.documentElement.outerHTML)
  $tidy.SaveFile($_.FullName) | Out-Null
}
chillily
#3 · 2020-03-26 06:52

What version of PowerShell are you using? If you're using v3 or higher, try this:

ls *.html | foreach { 
    $fileContent = Get-Content $_.FullName -Raw
    # -replace treats its pattern as a regex, so escape the literal search text first
    $newFileContent = $fileContent -replace [regex]::Escape($oldCode), $newCode
    Set-Content -Path $_.FullName -Value $newFileContent
    Write-Host  "`r`n"
    Write-Host  "Processed - $($_.Name)...`r`n" 
}
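
If you would rather stay with the literal Replace() and no regex at all, one possible reason it finds nothing is a line-ending mismatch: a here-string takes its line endings from the script file, while the HTML files may use a different convention. The following is only a sketch under that assumption (a difference in indentation, such as tabs versus spaces, would cause the same symptom). The helper name ConvertTo-Lf and the sample strings are made up for illustration.

```powershell
# Hypothetical helper: collapse CRLF and lone CR to LF.
function ConvertTo-Lf([string]$Text) {
    return $Text.Replace("`r`n", "`n").Replace("`r", "`n")
}

# Stand-ins for the here-strings (LF endings) and a file's content (CRLF endings).
$oldCode = "<div id=`"time_estimate`">`n    <!-- some large table -->`n</div>"
$newCode = "<div id=`"time_estimate`">`n    <div id=`"contact-form`"></div>`n</div>"
$fileContent = "<body>`r`n<div id=`"time_estimate`">`r`n    <!-- some large table -->`r`n</div>`r`n</body>"

# Without normalization the literal search text is not found, so nothing changes.
$unchanged = $fileContent.Replace($oldCode, $newCode)

# With both sides normalized, String.Replace() matches literally; no regex is involved.
$result = (ConvertTo-Lf $fileContent).Replace((ConvertTo-Lf $oldCode), (ConvertTo-Lf $newCode))
```

Note that $result ends up with LF line endings throughout; if the files must keep CRLF, convert back with .Replace("`n", "`r`n") before writing.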
Summer. ? 凉城
#4 · 2020-03-26 06:59

For Pete's sake, don't even think about using regex for HTML.

The problem you may run into is that reading a file with Get-Content (without -Raw) gives you an array of strings, one per line. Replace() doesn't work across array elements, so you have to handle it by hand. (Note that [System.IO.File]::ReadAllText, as used in the question, already returns a single string.) You could build one big string with -join, like so:

$fileContent = Get-Content $_.FullName
$theOneString = $fileContent -join ' '
$theOneString.Replace($foo, $bar)

... But this will mess up your line breaks. Then again, you could reformat the string with HTML Tidy.
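
A variation that keeps the line breaks, sketched here on the assumption that the lines come from Get-Content, is to join with a newline instead of a space. The sample data is illustrative, and the search text has to use the same line ending as the join.

```powershell
# Stand-in for the array that Get-Content returns (one element per line).
$lines = @('<div id="a">', '  old', '</div>')

# Join with CRLF so the original line structure survives.
$theOneString = $lines -join "`r`n"

# Multiline search and replacement text built with the same line ending.
$foo = "<div id=`"a`">`r`n  old`r`n</div>"
$bar = "<div id=`"a`">`r`n  new`r`n</div>"
$replaced = $theOneString.Replace($foo, $bar)
```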

The manual way is to iterate over the source array row by row. Until you find the <div>, copy each line into a new destination array. Once you find the part to be replaced, insert the new content into the destination array. Keep reading and discarding source lines until you find the closing </div>, then copy everything after it into the destination array. Finally, save the destination array's contents and you are done.
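
The row-by-row approach could be sketched roughly as follows. The function names Get-TokenCount and Update-DivBlock are hypothetical, and the sketch assumes well-formed, lowercase div tags; it counts opening and closing tags so that nested divs inside the block do not end the skip too early.

```powershell
# Hypothetical helper: count literal occurrences of $Token in $Text without regex.
function Get-TokenCount([string]$Text, [string]$Token) {
    return ($Text.Length - $Text.Replace($Token, '').Length) / $Token.Length
}

# Hypothetical helper: copy lines, swapping the div block that starts at the
# first line containing $StartMarker for $NewLines.
function Update-DivBlock([string[]]$Lines, [string]$StartMarker, [string[]]$NewLines) {
    $out = New-Object 'System.Collections.Generic.List[string]'
    $depth = 0        # greater than 0 while inside the block being discarded
    $done = $false    # becomes true once the replacement has been inserted
    foreach ($line in $Lines) {
        $skipping = $depth -gt 0
        if (-not $skipping -and -not $done -and $line.Contains($StartMarker)) {
            $out.AddRange($NewLines)
            $done = $true
            $skipping = $true
        }
        if ($skipping) {
            # Track nesting so the block ends at its own closing </div>.
            $depth += (Get-TokenCount $line '<div') - (Get-TokenCount $line '</div>')
        } else {
            $out.Add($line)
        }
    }
    return $out.ToArray()
}

# Illustrative data standing in for the lines of one HTML file.
$source = @(
    '<body>',
    '  <div id="time_estimate">',
    '    <div><table></table></div>',
    '  </div>',
    '  <p>footer</p>',
    '</body>'
)
$replacement = @(
    '  <div id="time_estimate">',
    '    <div id="contact-form"></div>',
    '  </div>'
)
$updated = @(Update-DivBlock $source 'id="time_estimate"' $replacement)
```

In a real script, $source would come from Get-Content and $updated would be written back with Set-Content.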
