#-*- coding:utf-8 -*-
import win32com.client, pythoncom
import time
ie = win32com.client.DispatchEx('InternetExplorer.Application.1')
ie.Visible = 1
ie.Navigate('http://ieeexplore.ieee.org/xpl/periodicals.jsp')
time.sleep( 5 )
ie.Document.getElementById("browse_keyword").value ="Computer"
ie.Document.getElementsByTagName("input")[24].click()
import win32com.client, pythoncom
import time
ie = win32com.client.DispatchEx('InternetExplorer.Application')
ie.Visible = 1
ie.Navigate('www.baidu.com')
time.sleep(5)
print 'browse_keword'
ie.Document.getElementById("kw").value ="Computer"
ie.Document.getElementById("su").click()
print 'Done!'
When I run the first script, it fails with:
ie.Document.getElementById("browse_keyword").value ="Computer"
TypeError: getElementById() takes exactly 1 argument (2 given)
The second script runs fine. What makes the two behave differently?
The difference between the two cases has nothing to do with the COM name you specify: either InternetExplorer.Application or InternetExplorer.Application.1 resolves to the exact same CLSID, which gives you an IWebBrowser2 interface. The difference in runtime behaviour is purely down to the URL you retrieved.
The difference here may be that the page which works is HTML whereas the other one is XHTML; or it may simply be that errors in the failing page prevent the DOM from initialising properly. Either way, it appears to be a 'feature' of the IE9 parser.
Note that this doesn't happen if you enable compatibility mode (after the second line below I clicked the compatibility mode icon in the address bar):
(Pdb) ie.Document.DocumentMode
9.0
(Pdb) ie.Document.getElementById("browse_keyword").value
*** TypeError: getElementById() takes exactly 1 argument (2 given)
(Pdb) ie.Document.documentMode
7.0
(Pdb) ie.Document.getElementById("browse_keyword").value
u''
Unfortunately I don't know how to toggle compatibility mode from a script (the documentMode property is not settable). Maybe someone else does?
The wrong argument count is, I think, coming from COM: Python passes in the arguments and the COM object rejects the call with a misleading error.
As a method of a COMObject, getElementById is built by win32com dynamically.
On my computer, if the URL is http://ieeexplore.ieee.org/xpl/periodicals.jsp, it is almost equivalent to
def getElementById(self):
    return self._ApplyTypes_(3000795, 1, (12, 0), (), 'getElementById', None,)
If the URL is www.baidu.com, it is almost equivalent to
def getElementById(self, v=pythoncom.Missing):
    ret = self._oleobj_.InvokeTypes(1088, LCID, 1, (9, 0), ((8, 1),), v)
    if ret is not None:
        ret = Dispatch(ret, 'getElementById', '{3050F1FF-98B5-11CF-BB82-00AA00BDCE0B}')
    return ret
Obviously, if you pass an argument to the first version, you'll receive a TypeError. But if you call it with no arguments, namely ie.Document.getElementById(), you won't receive a TypeError but a com_error.
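To make the first case concrete without needing IE at all, here is a minimal sketch in plain Python. BrokenDocument and call_with_id are hypothetical names invented for the illustration; the class only mimics the zero-parameter wrapper shown above and is not what win32com actually generates. It shows that the TypeError is raised by Python itself, before any COM call is attempted.

```python
class BrokenDocument(object):
    """Hypothetical stand-in for a wrapper built from empty type info."""

    def getElementById(self):  # note: no parameter for the element id
        # Never reached when an id is passed: Python rejects the call first.
        raise RuntimeError("the COM call would happen here")


def call_with_id(document, element_id):
    """Return the name of the exception raised by the call, or None."""
    try:
        document.getElementById(element_id)
    except TypeError as exc:
        return type(exc).__name__
    return None


print(call_with_id(BrokenDocument(), "browse_keyword"))  # prints TypeError
```

On Python 2 the message reads "takes exactly 1 argument (2 given)", as in the question; Python 3 words it as "takes 1 positional argument but 2 were given". The "2" counts the implicit instance plus the id string.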
Why did win32com build the wrong code?
Let us look at ie and ie.Document. They are both COMObjects, more precisely, win32com.client.CDispatch instances. CDispatch is just a wrapper class. The core is the attribute _oleobj_, whose type is PyIDispatch.
>>> ie, ie.Document
(<COMObject InternetExplorer.Application>, <COMObject <unknown>>)
>>> ie.__class__, ie.Document.__class__
(<class win32com.client.CDispatch at 0x02CD00A0>,
<class win32com.client.CDispatch at 0x02CD00A0>)
>>> oleobj = ie.Document._oleobj_
>>> oleobj
<PyIDispatch at 0x02B37800 with obj at 0x003287D4>
To build getElementById, win32com needs to get the type information for the getElementById method from _oleobj_. Roughly, win32com uses the following procedure:
typeinfo = oleobj.GetTypeInfo()
typecomp = typeinfo.GetTypeComp()
x, funcdesc = typecomp.Bind('getElementById', pythoncom.INVOKE_FUNC)
......
funcdesc contains almost all the important information, e.g. the number and types of the parameters.
If the URL is http://ieeexplore.ieee.org/xpl/periodicals.jsp, funcdesc.args is (), while the correct funcdesc.args should be ((8, 1, None),).
Long story short, win32com retrieved the wrong type information, and so it built the wrong method.
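For readers who don't have the COM constants memorised, the tuples above can be decoded with a few lines of Python. This is only a sketch: describe_arg and describe_args are hypothetical helper names, and the tables cover just a handful of the standard VARTYPE and PARAMFLAG values (8 is VT_BSTR, 9 is VT_DISPATCH, 12 is VT_VARIANT; flag 1 is PARAMFLAG_FIN).

```python
# A few standard COM VARTYPE codes; anything else is shown as VT_<number>.
VARTYPE_NAMES = {8: "VT_BSTR", 9: "VT_DISPATCH", 12: "VT_VARIANT"}

# A few standard PARAMFLAG bits.
PARAM_FLAGS = [(1, "in"), (2, "out"), (8, "retval"), (16, "optional")]


def describe_arg(arg):
    """Turn one (vartype, flags, ...) tuple into a readable string."""
    vartype, flags = arg[0], arg[1]
    type_name = VARTYPE_NAMES.get(vartype, "VT_%d" % vartype)
    flag_names = [name for bit, name in PARAM_FLAGS if flags & bit]
    return "%s [%s]" % (type_name, ", ".join(flag_names) or "none")


def describe_args(args):
    """Describe every entry of a funcdesc.args-style tuple."""
    return [describe_arg(arg) for arg in args]


print(describe_args(((8, 1, None),)))  # prints ['VT_BSTR [in]']
print(describe_args(()))               # prints []
```

So the correct description is "one input string parameter", and the broken one is "no parameters at all", which matches the two generated methods shown earlier.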
I am not sure who is to blame, PyWin32 or IE. But based on my observation, I found nothing wrong in PyWin32's code. On the other hand, the following script runs perfectly in Windows Script Host.
var ie = new ActiveXObject("InternetExplorer.Application");
ie.Visible = 1;
ie.Navigate("http://ieeexplore.ieee.org/xpl/periodicals.jsp");
WScript.sleep(5000);
ie.Document.getElementById("browse_keyword").value = "Computer";
Duncan has already pointed out that IE's compatibility mode can prevent the problem. Unfortunately, it seems impossible to enable compatibility mode from a script.
But I found a trick that bypasses the problem.
First, visit a good site that serves an HTML page, and retrieve a correct Document object from it.
ie = win32com.client.DispatchEx('InternetExplorer.Application')
ie.Visible = 1
ie.Navigate('http://www.haskell.org/arrows')
time.sleep(5)
document = ie.Document
Then jump to the page which doesn't work:
ie.Navigate('http://ieeexplore.ieee.org/xpl/periodicals.jsp')
time.sleep(5)
Now you can access the DOM of the second page via the old Document object.
document.getElementById('browse_keyword').value = "Computer"
If you use the new Document object, you will get a TypeError again.
>>> ie.Document.getElementById('browse_keyword')
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
TypeError: getElementById() takes exactly 1 argument (2 given)
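The trick above can be wrapped in a small helper. This is a sketch under assumptions, not a tested recipe: document_via_good_page and its parameters are names I made up, the fixed delay is the same crude wait used in the snippets above, and whether the old Document wrapper keeps working may depend on the IE version.

```python
import time


def document_via_good_page(ie, good_url, target_url, delay=5, sleep=time.sleep):
    """Load a page that yields a correct Document wrapper, keep that
    wrapper, then navigate to the page whose own wrapper is broken."""
    ie.Navigate(good_url)
    sleep(delay)
    document = ie.Document  # wrapper built while the good page is loaded
    ie.Navigate(target_url)
    sleep(delay)
    return document         # caller uses this instead of ie.Document


# Usage on Windows with pywin32 installed (not executed here):
#   import win32com.client
#   ie = win32com.client.DispatchEx('InternetExplorer.Application')
#   ie.Visible = 1
#   document = document_via_good_page(
#       ie,
#       'http://www.haskell.org/arrows',
#       'http://ieeexplore.ieee.org/xpl/periodicals.jsp')
#   document.getElementById('browse_keyword').value = "Computer"
```

The sleep function is a parameter only so the helper can be exercised without actually waiting.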
Calling a method on an instance in Python automatically passes the instance as the first argument; that's why you have to write the 'self' parameter explicitly in method definitions.
For example, instance.method(args...) is equivalent to Class.method(instance, args...).
From what I see, the programmer must have forgotten to write the self parameter, which breaks the method. Try looking inside the library code.
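The equivalence mentioned above is easy to check in plain Python. Greeter is a throwaway class made up for this example and has nothing to do with win32com.

```python
class Greeter(object):
    def greet(self, name):
        return "hello, " + name


instance = Greeter()

# Calling through the instance passes `instance` as `self` implicitly.
bound_result = instance.greet("world")

# Calling through the class requires passing the instance explicitly.
explicit_result = Greeter.greet(instance, "world")

print(bound_result == explicit_result)  # prints True
```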
I just got this issue when I upgraded to IE11 from IE8.
I've only tested this on the getElementsByTagName function. You have to call the function from the Body element.
#-*- coding:utf-8 -*-
import win32com.client, pythoncom
import time
ie = win32com.client.DispatchEx('InternetExplorer.Application.1')
ie.Visible = 1
ie.Navigate('http://ieeexplore.ieee.org/xpl/periodicals.jsp')
time.sleep( 5 )
ie.Document.Body.getElementById("browse_keyword").value ="Computer"
ie.Document.Body.getElementsByTagName("input")[24].click()
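If you want one code path that works on both kinds of pages, a fallback along these lines might do. It is a sketch only: get_elements_by_tag_name is a hypothetical helper, and it assumes that the broken wrapper raises TypeError as in the question and that Body exposes getElementsByTagName as described above.

```python
def get_elements_by_tag_name(document, tag_name):
    """Try the Document first; if its wrapper rejects the argument,
    fall back to calling the method on the Body element."""
    try:
        return document.getElementsByTagName(tag_name)
    except TypeError:
        # The wrapper was built without parameters; go through Body instead.
        return document.Body.getElementsByTagName(tag_name)
```

Note that catching TypeError this broadly would also hide an unrelated TypeError raised inside the first call, so treat it as a convenience rather than a robust fix.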