google search with python requests library

(I've tried looking but all of the other answers seem to be using urllib2)

I've just started trying to use requests, but I'm still not very clear on how to send or request something additional from the page. For example, I'll have

import requests

r = requests.get('http://google.com')

but I have no idea how to then, for example, run a Google search using the search bar on that page. I've read the quickstart guide, but I'm not very familiar with HTML POST and the like, so it hasn't been very helpful.

Is there a clean and elegant way to do what I am asking?

Tags: python python-requests google-search google-search-api

4 answers

傲

#2 · 2019-01-12 01:23

import requests

def googlesearch(searchfor):
    # Note: Google has deprecated this AJAX Search API endpoint, so it may no longer respond.
    link = 'http://ajax.googleapis.com/ajax/services/search/web'
    ua = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36'}
    # Pass every query parameter through `params` so requests URL-encodes it.
    payload = {'v': '1.0', 'q': searchfor}
    response = requests.get(link, headers=ua, params=payload)
    print(response.text)

googlesearch('test')

Prints:

{"responseData": {"results":[{"GsearchResultClass":"GwebSearch","unescapedUrl":"http://www.speedtest.net/","url":"http://www.speedtest.net/","visibleUrl":"www.speedtest.net","cacheUrl":"http://www.google.com/search?q\u003dcache:M47_v0xF3m8J:www.speedtest.net","title":"Speedtest.net by Ookla - The Global Broadband Speed \u003cb\u003eTest\u003c/b\u003e","titleNoFormatting":"Speedtest.net by Ookla - The Global Broadband Speed Test","content":"\u003cb\u003eTest\u003c/b\u003e your Internet connection bandwidth to locations around the world with this \ninteractive broadband speed \u003cb\u003etest\u003c/b\u003e from Ookla."},

etc.
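To pull just the titles and links out of a response shaped like the one above, the JSON can be decoded and walked. This is a minimal sketch: the helper name `extract_results` is hypothetical, and the sample string is a shortened copy of the output shown above, used here instead of a live request.

```python
import json

def extract_results(response_text):
    """Return a list of {'title', 'url'} dicts from a search response body."""
    data = json.loads(response_text)
    # 'responseData' can be null when the service reports an error.
    response_data = data.get('responseData') or {}
    results = response_data.get('results') or []
    return [
        {'title': item.get('titleNoFormatting'), 'url': item.get('url')}
        for item in results
    ]

# Shortened copy of the response printed above (assumption: same structure).
sample = (
    '{"responseData": {"results": [{"GsearchResultClass": "GwebSearch", '
    '"url": "http://www.speedtest.net/", '
    '"titleNoFormatting": "Speedtest.net by Ookla - The Global Broadband Speed Test"}]}}'
)

for hit in extract_results(sample):
    print(hit['title'], hit['url'])
```

With a live response, `response.json()` from requests would decode the body in the same way as `json.loads(response.text)`.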


做自己的国王

#3 · 2019-01-12 01:24

Request Overview

The Google search request is a standard HTTP GET command. It includes a collection of parameters relevant to your queries. These parameters are included in the request URL as name=value pairs separated by ampersand (&) characters. Parameters include data like the search query and a unique CSE ID (cx) that identifies the CSE that is making the HTTP request. The WebSearch or Image Search service returns XML results in response to your HTTP requests.

First, you must get your CSE ID (the cx parameter) from the Control Panel of Custom Search Engine.

Then, see the official Google Developers site for Custom Search.

There are many examples like this:

http://www.google.com/search?
  start=0
  &num=10
  &q=red+sox
  &cr=countryCA
  &lr=lang_fr
  &client=google-csbe
  &output=xml_no_dtd
  &cx=00255077836266642015:u-scht7a-8i

The documentation there also explains the list of parameters that you can use.
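As a rough illustration of how those name=value pairs end up in the URL, the example request above could be assembled like this. It is only a sketch: `build_search_url` is a hypothetical helper, the parameter values are copied from the example, and the cx value must be replaced with your own CSE ID.

```python
from urllib.parse import urlencode

SEARCH_URL = 'http://www.google.com/search'  # base URL taken from the example above

def build_search_url(query, cx, start=0, num=10, **extra):
    """Return the request URL for a query; `cx` is your own CSE ID."""
    params = {'start': start, 'num': num, 'q': query}
    params.update(extra)   # optional parameters such as cr, lr, client, output
    params['cx'] = cx
    # urlencode joins the pairs with '&' and escapes spaces and reserved characters.
    return SEARCH_URL + '?' + urlencode(params)

url = build_search_url(
    'red sox',
    '00255077836266642015:u-scht7a-8i',  # example cx from above, not a working ID
    cr='countryCA',
    lr='lang_fr',
    client='google-csbe',
    output='xml_no_dtd',
)
print(url)
```

With requests, the same dictionary can be passed as `params=` to `requests.get`, which performs the encoding itself.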


傲

#4 · 2019-01-12 01:29

import requests
from bs4 import BeautifulSoup

# Browser-like request headers sent with the search request.
headers_Get = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.5',
    'Accept-Encoding': 'gzip, deflate',
    'DNT': '1',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1'
}


def google(q):
    s = requests.Session()
    # Let requests URL-encode the query instead of joining the words with '+' by hand.
    params = {'q': q, 'ie': 'utf-8', 'oe': 'utf-8'}
    r = s.get('https://www.google.com/search', headers=headers_Get, params=params)

    soup = BeautifulSoup(r.text, "html.parser")
    output = []
    # This selector may change in the future based on Google's page structure.
    for searchWrapper in soup.find_all('h3', {'class': 'r'}):
        link = searchWrapper.find('a')
        if link is None or not link.has_attr('href'):
            continue  # skip headings that do not wrap a link
        output.append({'text': link.text.strip(), 'url': link['href']})

    return output

This returns a list of Google results in {'text': text, 'url': url} format. The top result's URL would be google('search query')[0]['url'].
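Depending on the markup Google serves, the href values collected this way may be redirect links of the form /url?q=<target>&... rather than the target address itself. That form is an assumption about the page, not something the answer above states. A small sketch of a hypothetical helper, `unwrap_result_url`, that recovers the target when that form is present and otherwise leaves the link untouched:

```python
from urllib.parse import urlparse, parse_qs

def unwrap_result_url(href):
    """Return the target of a '/url?q=...' redirect link, or href unchanged."""
    parsed = urlparse(href)
    if parsed.path == '/url':
        # parse_qs maps each parameter name to a list of decoded values.
        target = parse_qs(parsed.query).get('q')
        if target:
            return target[0]
    return href

print(unwrap_result_url('/url?q=http://www.speedtest.net/&sa=U'))
print(unwrap_result_url('http://www.speedtest.net/'))
```

It could be applied to each 'url' value in the returned list before the results are used.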


▲ chillily

#5 · 2019-01-12 01:42

input:

import requests

def googleSearch(query):
    # Send the request through the session so the context manager is actually used.
    with requests.Session() as c:
        url = 'https://www.google.co.in'
        params = {'q': query}
        urllink = c.get(url, params=params)
        print(urllink.url)

googleSearch('Linkin Park')

output:

https://www.google.co.in/?q=Linkin+Park
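To see how requests turns the params dictionary into that query string without sending anything over the network, a request can be prepared and inspected. This is a small sketch; `preview_search_url` is a hypothetical helper name, while `requests.Request` and `prepare()` are part of the requests API.

```python
import requests

def preview_search_url(query, base='https://www.google.co.in/'):
    """Return the URL requests would fetch, without performing the request."""
    request = requests.Request('GET', base, params={'q': query})
    # prepare() builds the final URL, including the URL-encoded query string.
    return request.prepare().url

print(preview_search_url('Linkin Park'))
```

Spaces and reserved characters in the query are escaped automatically, which is why the search terms do not need to be joined by hand.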

