I am trying to access the document of an Internet Explorer COM object on Windows Server 2012. The code works fine on Windows Server 2008, but as soon as I run it on Windows Server 2012 (fresh install, tried on more than one server), the same code stops working: $ie.document.documentHtml returns null.
Below is the code:
$ie = new-object -com "InternetExplorer.Application"
$ie.navigate2("http://www.example.com/")
while($ie.busy) {start-sleep 1}
$ie.document.documentHtml.innerhtml
Has the Internet Explorer COM object changed in Windows Server 2012? If so, how do I retrieve the document contents on Windows Server 2012?
Thanks in advance
edit: Added a bounty to sweeten things up. Invoke-WebRequest is nice, but it works only on Windows Server 2012, and I need to use Internet Explorer and have it work on both Windows Server 2008 and Windows Server 2012. I have read somewhere that installing Microsoft Office solves the issue. That is not an option either.
edit2: As I need to invoke the script remotely on multiple Windows servers (both 2008 and 2012), I would prefer not to copy files manually.
The bigger question is how this ever could have worked. The Document property returns a reference to the IHTMLDocument interface, which does not have a "documentHtml" property. It is never that clear what you might get back when you use late binding, as was done in this code. There is an old documentHtml property supported by the DHTML Editing control, which has been firmly put out to pasture. Admittedly, that is rather a wild guess.
Anyhoo, the correct syntax is to use, say, the body property:
If you still have problems (PowerShell treats null references rather undiagnosably), try running this C# code on the machine. It ought to give you a better message:
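For the body property mentioned above, here is a minimal PowerShell sketch with explicit null checks, so that a failure is reported instead of silently yielding nothing. The function name Get-IEBodyHtml is hypothetical, and whether Document is populated on a given Windows Server 2012 machine is exactly what is in question, so treat this as a diagnostic aid rather than a fix:

```powershell
# Hypothetical helper (not from the original answer): returns the body markup
# of an InternetExplorer.Application object, or throws a descriptive error
# instead of quietly returning $null.
function Get-IEBodyHtml {
    param([Parameter(Mandatory = $true)] $Browser)

    $doc = $Browser.Document
    if ($null -eq $doc) { throw "Document is null; the page may not have loaded." }

    $body = $doc.body
    if ($null -eq $body) { throw "Document.body is null; the document may not be HTML." }

    return $body.innerHTML
}

# Usage, following the question's own setup:
#   $ie = New-Object -ComObject "InternetExplorer.Application"
#   $ie.Navigate2("http://www.example.com/")
#   while ($ie.Busy) { Start-Sleep 1 }
#   Get-IEBodyHtml $ie
```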
It's a known bug: http://connect.microsoft.com/PowerShell/feedback/details/764756/powershell-v3-internetexplorer-application-issue
An extract from the workaround:
So, here's a workaround:
1. Copy Microsoft.mshtml.dll from a location where it is installed (e.g. C:\Program Files (x86)\Microsoft.NET\Primary Interop Assemblies) to your script's location (can be a network drive).
2. Use the Load-Assembly.ps1 script (code provided below and at: http://sdrv.ms/U6j7Wn) to load the assembly types in memory, e.g. .\Load-Assembly.ps1 -Path .\microsoft.mshtml.dll
Then proceed as usual to create the IE object etc. Warning: when dealing with the write() and writeln() methods, use the backward compatible methods: IHTMLDocument2_write() and IHTMLDocument2_writeln().
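The Load-Assembly.ps1 script itself is not shown in this thread. As a rough stand-in, and this is an assumption rather than the script from the bug report, the built-in Add-Type cmdlet can load an assembly from a path. The function name Import-MshtmlAssembly and the default path are hypothetical:

```powershell
# Hypothetical stand-in for Load-Assembly.ps1: loads the mshtml interop
# assembly into the current session using the built-in Add-Type cmdlet.
function Import-MshtmlAssembly {
    param([string] $Path = ".\Microsoft.mshtml.dll")

    if (-not (Test-Path -LiteralPath $Path)) {
        throw "Assembly not found at '$Path'."
    }

    # Resolve to a full file system path before handing it to Add-Type.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    Add-Type -Path $fullPath
}

# Usage, run before creating the IE object:
#   Import-MshtmlAssembly -Path .\Microsoft.mshtml.dll
```

Whether this behaves identically to the original script has not been verified here.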
As far as I can tell, on Windows Server 2012 to get the full html of a page:
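A minimal sketch of what that might look like, assuming the standard DOM documentElement property and its outerHTML are reachable through the COM object. The function name Get-IEFullHtml is hypothetical:

```powershell
# Hypothetical helper: returns the full markup of the loaded page,
# including the root <html> element, via documentElement.outerHTML.
function Get-IEFullHtml {
    param([Parameter(Mandatory = $true)] $Browser)

    $root = $Browser.Document.documentElement
    if ($null -eq $root) { throw "documentElement is null; the page may not have loaded." }

    return $root.outerHTML
}

# Usage:
#   $ie = New-Object -ComObject "InternetExplorer.Application"
#   $ie.Navigate2("http://www.example.com/")
#   while ($ie.Busy) { Start-Sleep 1 }
#   Get-IEFullHtml $ie
```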
There is also an innerhtml property on the documentElement, which strips off the root <html> element.
Of course, if all you want to do is get the raw markup, consider using Invoke-WebRequest:
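A minimal sketch, assuming PowerShell 3.0 or later (where Invoke-WebRequest was introduced). The wrapper name Get-RawMarkup is hypothetical; -UseBasicParsing is used so the call does not depend on the Internet Explorer engine for parsing:

```powershell
# Hypothetical wrapper: fetches a page and returns its raw markup as a string.
function Get-RawMarkup {
    param([Parameter(Mandatory = $true)] [string] $Uri)

    # -UseBasicParsing skips the IE-based DOM parsing of the response.
    $response = Invoke-WebRequest -Uri $Uri -UseBasicParsing
    return $response.Content
}

# Usage:
#   Get-RawMarkup -Uri "http://www.example.com/"
```

Note that this returns the markup as served, without running any scripts on the page, so it can differ from what Internet Explorer would render.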