mechanize._mechanize.FormNotFoundError: no form ma

Posted 2019-03-02 20:09

Can anyone help me get this form selection correct?

When I try to crawl Google, I get the error: mechanize._mechanize.FormNotFoundError: no form matching name 'q'

This is unusual, since I have seen several other tutorials use it. P.S. I don't plan to slam Google with requests; I just hope to use an automatic selector to take the effort out of finding academic citation PDFs from time to time. The form dump does list a control named q:

<f GET http://www.google.com.tw/search application/x-www-form-urlencoded
  <HiddenControl(ie=Big5) (readonly)>
  <HiddenControl(hl=zh-TW) (readonly)>
  <HiddenControl(source=hp) (readonly)>
  <TextControl(q=)>
  <SubmitControl(btnG=Google ?j?M) (readonly)>
  <SubmitControl(btnI=?n???) (readonly)>
  <HiddenControl(gbv=1) (readonly)>>
>>> quit()




import os, subprocess
import re
import mechanize
from bs4 import BeautifulSoup
#prepare mechanize
br = mechanize.Browser()
br.set_handle_robots(False)
br.set_handle_equiv(False)
br.addheaders = [('User-agent', 'Mozilla/5.0')] 
br.open('http://www.google.com/')
br.select_form('q')
citation = ' www.stackoverflow.com '.strip() 
#citation = GOOGLE_BASE + Citation
print citation
br.open('http://www.google.com/')
br.select_form('q')
br.form['q'] = citation
br.submit()
data = br.read()
soup = BeautifulSoup(data)
print soup

1 Answer
等我变得足够好
Reply #2 · 2019-03-02 20:26

You are trying to select a form named q, but no such form exists: q is the name of the text control inside the form. Judging by your form dump, the form itself is named f. (However, I was unable to verify that in my browser; even with JavaScript disabled, I only saw a different name.)

A simple Google search can be done like this:

import mechanize
from bs4 import BeautifulSoup

# prepare mechanize
br = mechanize.Browser()
br.set_handle_robots(False)   # do not obey robots.txt
br.set_handle_equiv(False)    # do not treat http-equiv meta tags as HTTP headers
br.addheaders = [('User-agent', 'Mozilla/5.0')]
br.open('http://www.google.com/')

# do the query
br.select_form(name='f')   # Note: select the form named 'f' here
br.form['q'] = 'here goes your query' # query
data = br.submit()

# parse and output
soup = BeautifulSoup(data.read(), 'html.parser')
print(soup)

This should give you the idea.

Update: How to find the right form 'selector'

To print the names of the available forms, you can do:

for form in br.forms():
    print(form.name)

This comes in handy when you use the interactive console.
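
Since the error in the question came from passing a control name (q) where a form name was expected, it can also help to list each form's controls next to its name. The following is only a minimal sketch and not part of the tested code above: describe_forms is a hypothetical helper, and it assumes br is a mechanize.Browser that has already opened a page, using just br.forms(), form.name, form.controls and control.name.

```python
def describe_forms(br):
    """Return (index, form name, control names) for every form on the page.

    `br` is assumed to be a mechanize.Browser that has already opened a page.
    """
    summary = []
    for index, form in enumerate(br.forms()):
        # Each control (text field, hidden field, button, ...) has a name too.
        control_names = [control.name for control in form.controls]
        summary.append((index, form.name, control_names))
    return summary
```

For the form dumped in the question, the result should contain an entry whose form name is 'f' and whose control names include 'q', which makes it plain that f is the form and q is a field inside it.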

You do not have to select a form by name; select_form accepts other criteria as well. For example, on some pages the forms have no name at all. In that case you can still select by the form's zero-based position on the page, e.g. br.select_form(nr=1) for the second form. Please see help(br.select_form) for details. Also, list(br.forms()) will give you a list of all forms, which you can inspect further.
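
If neither the name nor the position of the form is known in advance, one possibility is to pick the first form that contains the field you want to fill in. This is a sketch under assumptions rather than a mechanize feature: select_form_with_control is a hypothetical helper, br is assumed to be a mechanize.Browser that has already opened a page, and the helper only combines br.forms(), form.controls and the nr= selector mentioned above.

```python
def select_form_with_control(br, control_name):
    """Select the first form that has a control named `control_name`.

    `br` is assumed to be a mechanize.Browser that has already opened a page.
    Returns the zero-based index of the selected form.
    """
    for index, form in enumerate(br.forms()):
        if any(control.name == control_name for control in form.controls):
            br.select_form(nr=index)  # select by position, as with nr=1 above
            return index
    raise LookupError("no form contains a control named {!r}".format(control_name))
```

With this, select_form_with_control(br, 'q') followed by br.form['q'] = '...' and br.submit() would avoid hard-coding the form name.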

Another option would be to inspect the page by hand in your usual browser.
