Changing the link in Python mechanize

Posted 2019-06-03 16:57

Question:

I am trying to write a Python script that generates the rank list of my batch. To do this I only need to change the roll-number parameter of a link, which I found using the browser's inspect-element feature. The (relative) link looks something like this:

/academic/utility/AcademicRecord.jsp?loginCode=000&loginnumber=000&loginName=name&Home=ascwebsite

I just need to change loginCode to get the grades of my batch-mates. I want to use Python to iterate through all the roll numbers and generate a rank list. I used the mechanize library to open the site from Python. The relevant portion of the code:

import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)  # do not fetch or obey robots.txt
response = br.open('link_to_the_page')

I then authenticate and navigate to the page where the link to view the grades resides.
Then I find the relevant link like this:

for link in br.links(url_regex='/academic/utility/AcademicRecord.jsp?'):

Inside this loop I change the url and attributes of the link appropriately, and then open the link using:

response = br.follow_link(link)
print(response.read())

But it does not work: it opens the same link, i.e. the one with the initial roll number. I even tried changing the url of the link to something completely different, like http://www.google.com:

link.url='http://www.google.com'
link.base_url='http://www.google.com'

It still opens the same page and not Google's page. Any help would be highly appreciated.

Answer 1:

According to the source code, follow_link() and click_link() use the link's absolute_url attribute, which is set when the link is initialized. You are setting only the url and base_url attributes, so the request still goes to the original address.
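To illustrate why reassigning url and base_url has no effect, here is a minimal sketch. LinkSketch is a hypothetical, simplified stand-in written for this example, not mechanize's actual class, and the example.com address is a placeholder. It only mimics the behaviour described above: the absolute URL is resolved once at construction time.

```python
# Python 3 import; on Python 2 use `from urlparse import urljoin` instead.
from urllib.parse import urljoin


class LinkSketch(object):
    """Simplified stand-in for a mechanize link (NOT the library's real class)."""

    def __init__(self, base_url, url):
        self.base_url = base_url
        self.url = url
        # Resolved once, here. Later changes to url/base_url do not touch it.
        self.absolute_url = urljoin(base_url, url)


# Placeholder base address, assumed for illustration only.
link = LinkSketch("http://example.com/academic/",
                  "utility/AcademicRecord.jsp?loginCode=000")

# The same reassignments tried in the question:
link.url = "http://www.google.com"
link.base_url = "http://www.google.com"

# absolute_url still holds the address computed in __init__.
print(link.absolute_url)
```

Anything that reads absolute_url, as follow_link() is said to do above, would therefore keep seeing the original address.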

The solution would be to change the absolute_url of a link in the loop:

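import mechanize

BASE_URL = 'link_to_the_page'
# Escape "." and "?" so the regex matches them literally.
for link in br.links(url_regex=r'/academic/utility/AcademicRecord\.jsp\?'):
    modified_link = ...
    # follow_link() reads absolute_url, so this is the attribute to overwrite.
    link.absolute_url = mechanize.urljoin(BASE_URL, modified_link)
    br.follow_link(link)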
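One possible way to build the modified_link value is to rewrite only the loginCode query parameter with the standard library. This is a sketch under assumptions: with_login_code is a hypothetical helper name, the roll number "123" is a made-up placeholder, and the imports are the Python 3 ones (on Python 2 the same functions live in urlparse and urllib).

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def with_login_code(url, login_code):
    """Return url with its loginCode query parameter replaced (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep every parameter in its original order, swapping only loginCode.
    query = [(key, login_code if key == "loginCode" else value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)]
    return urlunsplit(parts._replace(query=urlencode(query)))


# The relative link from the question; "123" is a placeholder roll number.
original = ("/academic/utility/AcademicRecord.jsp"
            "?loginCode=000&loginnumber=000&loginName=name&Home=ascwebsite")
modified_link = with_login_code(original, "123")
print(modified_link)
```

The result is still a relative URL, so it can be passed to mechanize.urljoin(BASE_URL, modified_link) as in the loop above.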

Hope that helps.