#-*- coding:utf-8 -*-
# First section: fails with the TypeError shown below.
import win32com.client, pythoncom
import time
ie = win32com.client.DispatchEx('InternetExplorer.Application.1')
ie.Visible = 1
ie.Navigate('http://ieeexplore.ieee.org/xpl/periodicals.jsp')
time.sleep(5)  # crude wait for the page to load
ie.Document.getElementById("browse_keyword").value = "Computer"
ie.Document.getElementsByTagName("input")[24].click()

# Second section: runs without error.
import win32com.client, pythoncom
import time
ie = win32com.client.DispatchEx('InternetExplorer.Application')
ie.Visible = 1
ie.Navigate('www.baidu.com')
time.sleep(5)  # crude wait for the page to load
print 'browse_keword'
ie.Document.getElementById("kw").value = "Computer"
ie.Document.getElementById("su").click()
print 'Done!'
When I run the first section of code, it raises:
ie.Document.getElementById("browse_keyword").value ="Computer"
TypeError: getElementById() takes exactly 1 argument (2 given)
The second section of code runs fine. What difference between the two causes the different results?
I just got this issue when I upgraded to IE11 from IE8.
I've only tested this on the getElementsByTagName function. You have to call the function from the Body element.
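A minimal sketch of that workaround, assuming it means going through Document.body rather than Document itself. The helper name inputs_via_body and the stand-in classes are hypothetical and exist only so the snippet runs without IE; with a real browser you would pass ie.Document.

```python
def inputs_via_body(document):
    """Look up <input> elements through document.body, not document itself."""
    return document.body.getElementsByTagName("input")


# Hypothetical stand-ins so the sketch runs without Internet Explorer.
class FakeBody(object):
    def getElementsByTagName(self, tag):
        return ["<%s #%d>" % (tag, i) for i in range(3)]


class FakeDocument(object):
    body = FakeBody()


inputs = inputs_via_body(FakeDocument())
```

Against the page in the question this would presumably be used as inputs_via_body(ie.Document)[24].click(); that call is not tested here.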
The difference between the two cases has nothing to do with the COM name you specify: either InternetExplorer.Application or InternetExplorer.Application.1 results in the exact same CLSID, which gives you an IWebBrowser2 interface. The difference in runtime behaviour is purely down to the URL you retrieved.
The difference here may be that the page which works is HTML whereas the other one is XHTML; or it may simply be that errors in the failing page prevent the DOM from initialising properly. Whichever it is, it appears to be a 'feature' of the IE9 parser.
Note that this doesn't happen if you enable compatibility mode (after the second line below I clicked the compatibility mode icon in the address bar):
Unfortunately I don't know how to toggle compatibility mode from a script (the documentMode property is not settable). Maybe someone else does?
The wrong argument count is, I think, coming from COM: Python passes in the arguments and the COM object rejects the call with a misleading error.
Calling a method on an instance in Python automatically passes the instance as the first argument; that's why you have to explicitly write the 'self' parameter inside methods.
For example, instance.method(args...) is equal to Class.method(instance, args...).
From what I see, the programmer must have forgotten to write the self parameter, which breaks the method. Try looking inside the library code.
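To illustrate why the message counts two arguments when only one was written, here is a small plain-Python sketch. The classes Page and BrokenPage and the helper try_find are made up for illustration and have nothing to do with win32com itself.

```python
class Page(object):
    def find(self, element_id):          # self plus one real parameter
        return "found " + element_id


class BrokenPage(object):
    def find(self):                      # accepts nothing besides self
        return "found nothing"


def try_find(page, element_id):
    """Call page.find(element_id); return the result or the TypeError text."""
    try:
        return page.find(element_id)
    except TypeError as exc:
        return "TypeError: %s" % exc


page = Page()
# Both spellings are the same call: the instance becomes the first argument.
bound_result = page.find("kw")
explicit_result = Page.find(page, "kw")

# The instance still counts as an argument, so one explicit argument is
# reported as two; the exact wording differs between Python 2 and Python 3.
error_text = try_find(BrokenPage(), "kw")
```

Any method that accepts only self will produce this kind of message when called with one argument, whether it was written by hand or generated.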
As a method of a COMObject, getElementById is built by win32com dynamically.
On my computer, if the URL is http://ieeexplore.ieee.org/xpl/periodicals.jsp, it will be almost equivalent to a method that accepts no arguments apart from self.
If the URL is www.baidu.com, it will be almost equivalent to a method that accepts one argument apart from self.
Obviously, if you pass an argument to the first version, you'll receive a TypeError. But if you call it with no argument, namely ie.Document.getElementById(), you won't receive a TypeError, but a com_error.
Why did win32com build the wrong code?
Let us look at ie and ie.Document. They are both COMObjects, more precisely, win32com.client.CDispatch instances. CDispatch is just a wrapper class. The core is the attribute _oleobj_, whose type is PyIDispatch.
To build getElementById, win32com needs to get the type information for the getElementById method from _oleobj_. Roughly, win32com queries _oleobj_ for this type information and obtains a function description, funcdesc.
funcdesc contains almost all the important information, e.g. the number and types of the parameters.
If the URL is http://ieeexplore.ieee.org/xpl/periodicals.jsp, funcdesc.args is (), while the correct funcdesc.args should be ((8, 1, None),).
Long story short, win32com had retrieved the wrong type information, thus it built the wrong method.
I am not sure who is to blame, PyWin32 or IE. But based on my observations, I found nothing wrong in PyWin32's code. On the other hand, a Windows Script Host script doing the same thing runs perfectly.
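To make the effect of the wrong type information concrete, here is a toy model in plain Python. It is not win32com's actual generator and not the Windows Script Host script mentioned above; build_method is a hypothetical helper that simply creates one parameter per entry in args, using the two funcdesc.args values quoted in this answer.

```python
def build_method(name, args_desc):
    """Generate a method with one parameter per entry in args_desc."""
    params = ", ".join("arg%d" % i for i in range(len(args_desc)))
    signature = "self, " + params if params else "self"
    source = "def %s(%s):\n    return [%s]\n" % (name, signature, params)
    namespace = {}
    exec(source, namespace)  # compile the generated source text
    return namespace[name]


WRONG_ARGS = ()                # reported for the failing page
RIGHT_ARGS = ((8, 1, None),)   # one parameter; 8 is VT_BSTR, a COM string

broken = build_method("getElementById", WRONG_ARGS)
working = build_method("getElementById", RIGHT_ARGS)

document = object()            # stand-in for the wrapper instance
result = working(document, "kw")
try:
    broken(document, "kw")
    rejected = False
except TypeError:
    rejected = True            # same failure mode as in the question
```

With an empty args tuple the generated method has no room for the element id, so Python rejects the call before anything reaches COM.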
Duncan has already pointed out IE's compatibility mode can prevent the problem. Unfortunately, it seems it's impossible to enable compatibility mode from a script.
But I found a trick, which can help us bypass the problem.
First, you need to visit a good site, which gives us an HTML page, and retrieve a correct Document object from it.
Then jump to the page which doesn't work.
Now you can access the DOM of the second page via the old Document object.
If you use the new Document object, you will get a TypeError again.
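The steps of this trick can be sketched as below. This is only an outline under assumptions: document_from_good_page is a hypothetical helper, the fixed sleep is the same crude wait used in the question, and FakeIE is a stand-in so the snippet runs without IE. With a real browser you would pass the ie object created by DispatchEx.

```python
import time


def document_from_good_page(ie, good_url, target_url, wait_seconds=5):
    """Grab Document on a page that works, then navigate to the target."""
    ie.Navigate(good_url)
    time.sleep(wait_seconds)      # crude wait, as in the question
    document = ie.Document        # wrapper obtained while things work
    ie.Navigate(target_url)
    time.sleep(wait_seconds)
    return document               # keep using this, not a fresh ie.Document


# Hypothetical stand-in for the IE object so the sketch runs anywhere.
class FakeIE(object):
    def __init__(self):
        self.visited = []

    def Navigate(self, url):
        self.visited.append(url)

    @property
    def Document(self):
        return "Document wrapper created at " + self.visited[-1]


fake = FakeIE()
old_document = document_from_good_page(
    fake, "www.baidu.com",
    "http://ieeexplore.ieee.org/xpl/periodicals.jsp", wait_seconds=0)
```

The returned object is the one to call getElementById on; asking ie.Document again after the second Navigate would hand back the new wrapper.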