Question:
I'm trying to download a file with Python using IE:
from win32com.client import DispatchWithEvents

class EventHandler(object):
    def OnDownloadBegin(self):
        pass

ie = DispatchWithEvents("InternetExplorer.Application", EventHandler)
ie.Visible = 0
ie.Navigate('http://website/file.xml')
After this, a window appears asking the user where to save the file. How can I save this file automatically from Python?
I need to use a browser rather than urllib or mechanize, because I need to interact with some AJAX functionality before downloading the file.
Answer 1:
This works for me as long as the IE dialogs are in the foreground and the downloaded file does not already exist in the "Save As" directory:
import time
import threading
import win32ui, win32gui, win32com, pythoncom, win32con
from win32com.client import Dispatch

class IeThread(threading.Thread):
    def run(self):
        pythoncom.CoInitialize()
        ie = Dispatch("InternetExplorer.Application")
        ie.Visible = 0
        ie.Navigate('http://website/file.xml')

def PushButton(handle, label):
    if win32gui.GetWindowText(handle) == label:
        win32gui.SendMessage(handle, win32con.BM_CLICK, None, None)
        return True

IeThread().start()
time.sleep(3)  # wait until IE is started
wnd = win32ui.GetForegroundWindow()
if wnd.GetWindowText() == "File Download - Security Warning":
    win32gui.EnumChildWindows(wnd.GetSafeHwnd(), PushButton, "&Save")
    time.sleep(1)
    wnd = win32ui.GetForegroundWindow()
    if wnd.GetWindowText() == "Save As":
        win32gui.EnumChildWindows(wnd.GetSafeHwnd(), PushButton, "&Save")
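The fixed time.sleep calls in this approach are a guess at how long IE takes to show each dialog. One possible refinement, offered only as a sketch, is to poll for the dialog instead. The helper name wait_for and its parameters are hypothetical and not part of pywin32, and the commented usage assumes the same dialog title as the code above.

```python
import time


def wait_for(predicate, timeout=10.0, interval=0.2):
    """Poll predicate() until it returns a truthy value or the timeout elapses.

    Returns the truthy value, or None if the timeout is reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Hypothetical usage with the pywin32 calls from the answer (Windows only):
#
# def save_as_dialog():
#     wnd = win32ui.GetForegroundWindow()
#     return wnd if wnd.GetWindowText() == "Save As" else None
#
# wnd = wait_for(save_as_dialog, timeout=5)
```

Polling still depends on the dialog being in the foreground, so it does not remove that limitation. It only avoids clicking too early or waiting longer than necessary.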
Answer 2:
I don't know how to say this nicely, but this sounds like about the most foolhardy software idea in recent memory. Python is much more capable of performing AJAX calls than IE is.
To access the data, yes, you can use urllib and urllib2. If there is JSON data in the response, there's the json library; likewise for XML and HTML, there's BeautifulSoup.
For one project, I had to write a Python program that would simulate a browser and sign into any of 20 different social networks (remember Friendster? Orkut? CyberWorld? I do), and upload images and text into the user's account, even grasping CAPTCHAs and complex JavaScript interactions. Pure Python makes it (comparatively) easy; as you've already seen, trying to use IE makes it impossible.
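As a rough illustration of the pure-Python route described in this answer, the sketch below decodes a response body once it has been fetched. It is written for Python 3 and uses only the standard library, so xml.etree.ElementTree stands in for BeautifulSoup here. The function name parse_body is hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def parse_body(body, content_type):
    """Decode raw response bytes according to the Content-Type header value."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        return json.loads(body.decode("utf-8"))
    if media_type in ("application/xml", "text/xml"):
        # Returns the root Element of the parsed document.
        return ET.fromstring(body)
    # Fall back to plain text for anything else.
    return body.decode("utf-8", errors="replace")
```

For messy real-world HTML, a lenient parser such as BeautifulSoup is likely the better choice, since ElementTree only accepts well-formed XML.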
Answer 3:
PAMIE, perhaps:
P.A.M.I.E. stands for Python Automated Module For I.E.
Pamie's main use is for testing web sites by which you automate the Internet Explorer client using the Pamie scripting language. PAMIE is not a record playback engine!
Pamie allows you to automate I.E. by manipulating I.E.'s Document Object Model via COM. This Free tool is for use by Quality Assurance Engineers and Developers.
Answer 4:
If you can't control Internet Explorer using its COM interface, I suggest using AutoIt's COM interface to control IE's GUI from Python.
Answer 5:
You don't need to use IE. You could use something like:
import urllib2
data = urllib2.urlopen("http://website/file.xml").read()
Update: I see you've updated your question. If you need to use a browser, then clearly this answer isn't appropriate for you.
Further update: When you click the button generated by JavaScript, if only the button (and not the URL it retrieves) is computed by the JavaScript, then you can perhaps retrieve that URL via urllib2. On the other hand, you might also need to pass a session cookie from your authenticated session.
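If the download URL can be requested directly, carrying the session cookie could look roughly like the sketch below. It is written for Python 3, where urllib2 became urllib.request. The names make_session and download are hypothetical, and the sketch assumes the login is done through the same opener, so the server's cookie lands in the jar.

```python
import http.cookiejar
import shutil
import urllib.request


def make_session():
    """Return (opener, jar): an opener that stores and resends cookies."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def download(opener, url, dest_path):
    """Stream the response body for url into dest_path and return the path."""
    with opener.open(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```

A typical flow would be to request the login URL with opener.open first, then call download(opener, 'http://website/file.xml', 'file.xml') with the same opener, so any cookie set at login is sent along with the file request.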
Answer 6:
One option could also be to embed your own browser.
That's possible, for example, with Qt via PyQt (GPL) or PySide (LGPL), where you can embed the WebKit engine. You could then either display the page in a QWebView, let the user navigate to the download, and intercept that event, or use a plain QWebPage, where everything can be automated and nothing has to be shown at all.
And WebKit should be powerful enough to do anything you want.
Very basic example:
import sys
from PySide import QtCore, QtGui, QtWebKit

url = 'http://developer.qt.nokia.com/wiki/PySideDownloads/'

class TestKit(QtCore.QObject):
    def __init__(self, app):
        super(TestKit, self).__init__()  # initialize the QObject base
        self.page = QtWebKit.QWebPage()
        self.page.loadFinished.connect(self.finished)
        self.page.mainFrame().load(QtCore.QUrl(url))
        self.app = app

    def finished(self, evt):
        # inspect DOM -> navigate to next page or download
        print self.page.currentFrame().documentElement().toInnerXml().encode(
            'utf-8')
        # when everything is done
        self.app.quit()

if __name__ == '__main__':
    app = QtGui.QApplication(sys.argv)
    t = TestKit(app)
    sys.exit(app.exec_())
Answer 7:
I have something similar (an awful third-party application with lots of weird .NET 'ajax' controls), and I use the iMacros plugin for Firefox to do some automation. But I'm doing batch inserts, not downloads.
You can try recording, editing, and replaying the inputs sent through a VNC session. Look at something like http://code.google.com/p/python-vnc-viewer/ for inspiration.
Answer 8:
This is definitely, absolutely the last way I would normally do this, but today I did have to resort to banging away to get something working. I have IE 10, so @cgohlke's answer won't work (no window text). All attempts to get a proper version of Client Authentication working were failing, so I had to fall back on this. Maybe it'll help someone else who's equally at the end of their tether.
import IEC
import pywinauto
import win32com.client
# Creates a new IE Window
ie = IEC.IEController(window_num=0)
# Register application as an app for pywinauto
shell = win32com.client.Dispatch("WScript.Shell")
pwa_app = pywinauto.application.Application()
w_handle = pywinauto.findwindows.find_windows(title=u'<Title of the site - find it using SWAPY>', class_name='IEFrame')[0]
window = pwa_app.window_(handle=w_handle)
window.SetFocus()
# Click on the download link
ie.ClickLink(<download link>)
# Get the handle of the Open Save Cancel dialog
ctrl = window['2']
# You may need to adjust the coords here to make sure you hit the button you want
ctrl.ClickInput(button='left', coords=(495, 55), double=False, wheel_dist=0)
But man, is it horrible!