How to combine twill and python into one code that

Posted 2019-02-20 20:21

Question:

I have installed twill on my computer (having previously installed Python 2.5) and have been using it recently.

Python is installed on disk C on my computer: C:\Python25

And the twill folder (“twill-0.9”) is located here: E:\tmp\twill-0.9

Here is the twill script that I’ve been using:

go "some website's sign-in page URL"
formvalue 2 userid "my login"
formvalue 2 pass "my password"
submit
go "URL of some other page from that website"
save_html result.txt

This script logs me in to a website where I have an account, fetches the HTML of another page on that site (one that I can access only after logging in), and saves it to a file named “result.txt”. Before running it, I replace “my login” and “my password” with my real credentials, replace “some website’s sign-in page URL” and “URL of some other page from that website” with real URLs, and replace the number 2 with the number of the sign-in form on that website’s log-in page.

I store this script in a file named “test.twill” in my “twill-0.9” folder: E:\tmp\twill-0.9\test.twill. I run it from the command prompt: python twill-sh test.twill

I have also installed the “Google App Engine SDK” and have been using it for a while.

For example, I’ve been using this code:

import hashlib
m = hashlib.md5()
m.update("Nobody inspects")
m.update(" the spammish repetition ")
print m.hexdigest()

This code computes the MD5 digest of the phrase “Nobody inspects the spammish repetition”.
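As a side note on how the snippet above behaves: successive update() calls are equivalent to hashing the concatenated input once. The sketch below checks this. It uses byte literals, which assumes Python 2.6 or later (on Python 2.5 plain string literals would be used instead). Note also that the second chunk in the question ends with a space, so the digest it prints covers that trailing space as well.

```python
import hashlib

# Incremental hashing: feed the text in two chunks, as in the question
# (without the trailing space).
chunked = hashlib.md5()
chunked.update(b"Nobody inspects")
chunked.update(b" the spammish repetition")

# One-shot hashing of the same text, already concatenated.
one_shot = hashlib.md5(b"Nobody inspects the spammish repetition")

# Both approaches see the same byte stream, so the digests agree.
digests_match = chunked.hexdigest() == one_shot.hexdigest()
```

This matters for the goal in the question: the HTML of a page can be hashed in one call once it is available as a single string.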

Now, how can I put these two pieces of code together into one Python script that I could run on “Google App Engine”?

Let’s say I want my code to log in to a website from “Google App Engine”, go to another page on that website, fetch its HTML (that’s what my twill script does), and then compute the MD5 digest of that HTML (that’s what my second snippet does). How can I combine the two into one Python script?

I guess it should be done by importing twill, but how? Can Python code that is run by “Google App Engine” import twill from somewhere on the internet? Or is twill perhaps already installed on “Google App Engine”?


Update 1:

(this update is my response to Wooble’s answer)

Here is the list of all folders in my “twill-0.9” folder that contain __init__.py files (some folders on this list are located inside other folders that are also on the list):

E:\twill-0.9\build\lib\twill\extensions\match_parse

E:\twill-0.9\build\lib\twill\extensions

E:\twill-0.9\build\lib\twill\other_packages\_mechanize_dist

E:\twill-0.9\build\lib\twill\other_packages

E:\twill-0.9\build\lib\twill

E:\twill-0.9\twill\extensions\match_parse

E:\twill-0.9\twill\extensions

E:\twill-0.9\twill\other_packages\_mechanize_dist

E:\twill-0.9\twill\other_packages

E:\twill-0.9\twill

Answer 1:

To use third-party libraries in App Engine projects, you simply have to include them with your application when you deploy. Copy the twill folder (the one containing __init__.py) into your application's folder and deploy it.

Looking at the twill Google Code project, it appears that twill includes its dependencies (pyparsing, mechanize, etc.) in the package, so you may not need to include anything else.
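Once the twill folder sits next to the application code, the two snippets from the question could be combined roughly as follows. This is only a sketch: the function names and parameters are made up for illustration, the field names "userid" and "pass" are taken from the question's script, and whether twill actually runs inside the App Engine sandbox is not confirmed here. twill is imported inside the function so that the hashing helper can be used and tested without it.

```python
import hashlib


def md5_of_html(html):
    """Return the hex MD5 digest of a page's HTML (hypothetical helper)."""
    # hashlib works on bytes, so encode text first (UTF-8 is an assumption).
    if not isinstance(html, bytes):
        html = html.encode("utf-8")
    return hashlib.md5(html).hexdigest()


def fetch_page_after_login(login_url, page_url, form, userid, password):
    """Mirror the twill script from the question using twill's Python API."""
    # Imported lazily: requires the twill package to be bundled with the app.
    from twill import commands

    commands.go(login_url)
    commands.formvalue(form, "userid", userid)
    commands.formvalue(form, "pass", password)
    commands.submit()
    commands.go(page_url)
    # Instead of save_html, keep the HTML in memory so it can be hashed.
    return commands.get_browser().get_html()
```

A request handler would then call something like md5_of_html(fetch_page_after_login(login_url, page_url, "2", "my login", "my password")), with the placeholders replaced as described in the question.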



Answer 2:

Here is an example of using twill to run a Google search, in case it helps. It shows twill and BeautifulSoup used together to parse web pages:

>>> import twill.commands
>>> import BeautifulSoup
>>> 
>>> class browser:
...    def __init__(self, url="http://www.google.com",log = None):
...       self.a=twill.commands
...       self.a.config("readonly_controls_writeable", 1)
...       self.b = self.a.get_browser()
...       self.b.set_agent_string("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14")
...       self.log = log
...       self.b.clear_cookies()
...       self.url=url
...    def googleQuery(self, query="python code"):
...       self.b.go(self.url)
...       #self.b.showforms()
...       f = self.b.get_form("f")
...       #print "form is %s" % f
...       f["q"] = query
...       self.b.clicked(f, "btnG")
...       self.b.submit()
...       pageContent = self.b.get_html()
...       soup=BeautifulSoup.BeautifulSoup(pageContent)
...       ths = soup.findAll(attrs={"class" : "l"})
...       for a in ths:
...          print a
... 
>>> t=browser()
>>> t.googleQuery("twill queries")
==> at http://www.google.ie/
Note: submit is using submit button: name="btnG", value="Google Search"

<a href="http://pyparsing.wikispaces.com/WhosUsingPyparsing" class="l" onmousedown="return clk(this.href,'','','res','1','','0CBMQFjAA')">pyparsing - WhosUsingPyparsing</a>
<a href="http://www.mail-archive.com/twill@lists.idyll.org/msg00048.html" class="l" onmousedown="return clk(this.href,'','','res','2','','0CBcQFjAB')">Re: [<em>twill</em>] <em>query</em>: docs, and web site.</a>
<a href="http://www.mail-archive.com/twill@lists.idyll.org/msg00050.html" class="l" onmousedown="return clk(this.href,'','','res','3','','0CBkQFjAC')">Re: [<em>twill</em>] <em>query</em>: docs, and web site.</a>
<a href="http://www.genealogytoday.com/surname/finder.mv?Surname=Twill" class="l" onmousedown="return clk(this.href,'','','res','4','','0CB4QFjAD')"><em>Twill</em> Genealogy and Family Tree Resources - Surname Finder</a>
<a href="http://a706cheap-apparel.hobby-site.com/ladies-cotton-faded-twill-le-chameau-breeks-42" class="l" onmousedown="return clk(this.href,'','','res','5','','0CCEQFjAE')">Ladies Cotton Faded <em>Twill</em> Le Chameau Breeks 42</a>
<a href="http://twill.idyll.org/examples.html" class="l" onmousedown="return clk(this.href,'','','res','6','','0CCMQFjAF')"><em>twill</em> Examples</a>
<a href="http://panjiva.com/Sri-Lankan-Manufacturers-Of/twill+capri" class="l" onmousedown="return clk(this.href,'','','res','7','','0CCcQFjAG')">Sri-Lankan <em>Twill</em> Capri Manufacturers | Sri-Lankan Suppliers of <b>...</b></a>
<a href="http://c586cheap-apparel.dyndns.ws/twill-beige-blazer" class="l" onmousedown="return clk(this.href,'','','res','8','','0CCoQFjAH')"><em>Twill</em> beige blazer</a>
<a href="http://stackoverflow.com/questions/2267537/how-do-you-use-relative-paths-for-twill-tests" class="l" onmousedown="return clk(this.href,'','','res','9','','0CCwQFjAI')">How do you use Relative Paths for <em>Twill</em> tests? - Stack Overflow</a>
<a href="http://mytextilenotes.blogspot.com/2010/01/introduction-to-twill-weave.html" class="l" onmousedown="return clk(this.href,'','','res','10','','0CC8QFjAJ')">My Textile Notes: Introduction to <em>Twill</em> Weave</a>
>>>  


Answer 3:

I had no idea what twill does (I had to google it), but App Engine offers a fetch() function that can be used to fetch web pages. It also supports the POST method, e.g. for logins.

(I doubt twill works on App Engine, because App Engine restricts the available Python libraries for security reasons. Just a guess, though.)
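A rough sketch of that approach, using urlfetch.fetch from google.appengine.api, might look like the following. The function names are hypothetical, the form field names come from the question's script, and the cookie handling is a deliberate simplification: it forwards only the first name=value pair of the Set-Cookie header, which will not be enough for sites that set several cookies or use hidden form tokens.

```python
try:
    from urllib import urlencode          # Python 2, as in the question
except ImportError:
    from urllib.parse import urlencode    # Python 3


def build_login_request(userid, password):
    """Build the POST body and headers for the sign-in form (hypothetical)."""
    payload = urlencode([("userid", userid), ("pass", password)])
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return payload, headers


def fetch_after_login(login_url, page_url, userid, password):
    """Log in with a POST, then fetch a second page with the session cookie."""
    # Imported lazily: this module only exists in the App Engine runtime/SDK.
    from google.appengine.api import urlfetch

    payload, headers = build_login_request(userid, password)
    login = urlfetch.fetch(login_url, payload=payload, method=urlfetch.POST,
                           headers=headers, follow_redirects=False)
    # Simplification: keep only the first "name=value" part of Set-Cookie.
    cookie = login.headers.get("set-cookie", "").split(";")[0]
    page = urlfetch.fetch(page_url, headers={"Cookie": cookie})
    return page.content
```

The returned content can then be passed to hashlib.md5 exactly as in the question's second snippet.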



Answer 4:

I believe you're looking for a way to import the twill module into App Engine. You'll have to work out either where the twill Python files are or how to get a source package of them, so that you can bundle them with your website. Importing third-party modules appears to be possible, with a few exceptions (see below).

Try zipimport, following the directions on Google's site.

From Google's third-party libraries page:

App Engine uses a custom version of the zipimport feature instead of the standard implementation. It generally works the usual way: add the Zip archive to sys.path, then import as usual. With these exceptions: zipimport can only import modules stored in the archive as .py source files. It cannot import modules stored as .pyc or .pyo files. zipimport is implemented in pure Python, and does not use native code for decompression (C code).
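The "add the Zip archive to sys.path, then import as usual" step can be illustrated with the standard library alone. The sketch below builds a tiny archive containing one .py source file and imports it. The archive name, module name, and helper function are hypothetical, and it runs against the standard zipimport rather than App Engine's custom version.

```python
import os
import sys
import tempfile
import zipfile


def import_from_zip(zip_path, module_name):
    """Put a zip archive on sys.path and import a module stored in it."""
    if zip_path not in sys.path:
        sys.path.insert(0, zip_path)
    return __import__(module_name)


# Build a throwaway archive holding a single .py source file
# (source files only, in line with the restriction quoted above).
tmp_dir = tempfile.mkdtemp()
zip_path = os.path.join(tmp_dir, "vendored_libs.zip")
archive = zipfile.ZipFile(zip_path, "w")
archive.writestr("hello_vendored.py", "GREETING = 'imported from zip'\n")
archive.close()

module = import_from_zip(zip_path, "hello_vendored")
```

For twill, the archive would contain the twill folder (the one with __init__.py) at its top level, and the application would then run import twill after adding the archive to sys.path.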