PowerShell Web Page Automation Works on Internet, Not Intranet

Posted 2019-01-25 00:17

Question:

I'm trying to do some simple automation with PowerShell: pulling link URLs from one of our company's local intranet pages and then doing some work with those URLs. Eventually I'll use the script to open each link and click a button on the page. I'm using Internet Explorer 9 on Windows 7 x64.

Here's an example of a simple working PowerShell script that displays all the links on a page:

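# Start Internet Explorer through COM automation and show the window
$ie = New-Object -ComObject "InternetExplorer.Application"
$ie.Visible = $true
$ie.Navigate("http://www.reddit.com")

# Wait for the page to finish loading
while ($ie.Busy) {
    Start-Sleep -Seconds 1
}

# Print the href of every anchor element on the page
$links = $ie.Document.getElementsByTagName("a")
$links | ForEach-Object {
    Write-Host $_.href
}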

This script works fine until I replace the URL with a local intranet site. The intranet URL follows the normal scheme ( http://internaldomain.com/etc ), but IE recognizes it as an intranet site. As soon as I try to scrape a page in the intranet zone, $ie.Document becomes $null and the script fails.

I'm guessing it's related to some obscure setting for that zone... I'm not sure. I found some suggestions online such as adding it to your trusted sites, but that has not worked. This is my first time using Powershell for web automation, so any help or insight would be appreciated.

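Answer 1:

The solution may be here: http://blogs.msdn.com/b/ieinternals/archive/2011/08/03/internet-explorer-automation-protected-mode-lcie-default-integrity-level-medium.aspx

The post explains the different integrity levels of IE tabs. As I understand it, an automated IE instance starts in a low-integrity (Protected Mode) tab, while intranet pages load at medium integrity, so the script loses its connection to the page. To navigate in the Local Intranet zone, you have to use a "medium" tab.

The best way to keep your IE settings and still use your script is to create a registry key, as explained in the link above: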

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\InternetExplorer.ApplicationMedium]

[HKEY_CLASSES_ROOT\InternetExplorer.ApplicationMedium\CLSID] 
@="{D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E}"

Then, in your script, use this new COM object:

$ie = new-object -Com InternetExplorer.ApplicationMedium
...
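
If importing a .reg file into HKEY_CLASSES_ROOT is not an option, the same key can probably be created per user from PowerShell. The following is a minimal sketch, not part of the original answer. It assumes that a key under HKCU:\Software\Classes is honored for the current (non-elevated) user, since HKEY_CLASSES_ROOT is a merged view of the machine-wide and per-user Classes keys. The variable names are illustrative only.

```powershell
# Assumption: a per-user registration is enough for a non-elevated script.
$clsid = '{D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E}'
$key   = 'HKCU:\Software\Classes\InternetExplorer.ApplicationMedium\CLSID'

# -Force creates the missing parent key; -Value sets the key's default value.
New-Item -Path $key -Value $clsid -Force | Out-Null

# Read the default value back to confirm what was written.
(Get-ItemProperty -Path $key).'(default)'
```

After this, New-Object -ComObject InternetExplorer.ApplicationMedium should resolve as in the answer above, provided the assumption about per-user registration holds on your machine.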


Answer 2:

Use

$ie.Document.documentElement.getElementsByClassName("underline")

and enjoy.



Answer 3:

Due to policy restrictions on my computer, I was not able to access the registry to create the key mentioned in another answer. However, I found a way to do it indirectly in PowerShell, in case this helps anyone else:

# Create the medium-integrity IE object directly from its CLSID (no registry key needed)
$type = [Type]::GetTypeFromCLSID('D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E')
$ie = [System.Activator]::CreateInstance($type)

$ie.Visible = $true

$URL = "http://my.intranet.com"

$ie.Navigate($URL)

Write-Host "`$ie.Busy:" $ie.Busy
Write-Host "`$ie.ReadyState:" $ie.ReadyState

# ReadyState 4 means the document has finished loading (READYSTATE_COMPLETE)
while ($ie.Busy -or ($ie.ReadyState -ne 4)) {
    Start-Sleep -Seconds 1
}

Write-Host "IE is ready"
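
The wait loop above never gives up if the page does not finish loading. As a sketch only, the same Busy/ReadyState check can be wrapped in a helper with a timeout. Wait-IEReady and its parameters are hypothetical names introduced here, not part of the original answer or of any IE API.

```powershell
# Hypothetical helper: polls an IE automation object until it reports ready,
# or returns $false once the timeout has elapsed.
function Wait-IEReady {
    param(
        [Parameter(Mandatory = $true)] $Browser,  # object exposing Busy and ReadyState
        [int] $TimeoutSeconds = 30,
        [int] $PollMilliseconds = 250
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    # ReadyState 4 is READYSTATE_COMPLETE
    while ($Browser.Busy -or ($Browser.ReadyState -ne 4)) {
        if ((Get-Date) -ge $deadline) { return $false }
        Start-Sleep -Milliseconds $PollMilliseconds
    }
    return $true
}
```

With the $ie object created as above, if (Wait-IEReady -Browser $ie) { $ie.Document.getElementsByTagName("a") | ForEach-Object { $_.href } } would list the links as in the question, and a $false result tells the script that the page never became ready.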