When you ask a question or request the definition of a word in a Google search, Google gives you a summary of the answer in the "feedback" box.
For example, when you search for define apple you get this result:
To be clear, I do not need the entire page or the other results; I just need this box:
How can I use the Requests and Beautiful Soup modules to get the contents of this "feedback" box in Python 3?
If that is not possible, can I use the Google Search API to get the contents of the "feedback" box?
I have found a similar question on SO, but the OP did not specify the language, there are no answers, and I fear that the two comments are outdated, as the question was asked nearly 9 months ago.
Thank you for your time & help in advance.
Nice idea for a question. The program can be started with:
python3 defineterm.py apple
#!/usr/bin/env python3
# defineterm.py
import html
import sys

import requests
from bs4 import BeautifulSoup

searchterm = ' '.join(sys.argv[1:])
url = 'https://www.google.com/search'
# Passing the query via params lets requests URL-encode the search term.
res = requests.get(url, params={'q': 'define ' + searchterm})
try:
    res.raise_for_status()
except Exception as exc:
    print('An error occurred while loading the page: ' + str(exc))
    sys.exit(1)

text = html.unescape(res.text)
soup = BeautifulSoup(text, 'lxml')
prettytext = soup.prettify()

# The next lines are for analysis (saving the raw page); you can comment them out.
with open('rawpage.txt', 'w', encoding='utf-8') as frawpage:
    frawpage.write(prettytext)

firsttag = soup.find('h3', class_='r')
if firsttag is not None:
    print(firsttag.get_text())
    print()

# The second tag may change, so check it if the result is not correct.
# The same may apply to all of the tags searched for here.
secondtag = soup.find('div', {'style': 'color:#666;padding:5px 0'})
if secondtag is not None:
    print(secondtag.get_text())
    print()

termtags = soup.find_all('li', {'style': 'list-style-type:decimal'})
for count, tag in enumerate(termtags, start=1):
    print(str(count) + '. ' + tag.get_text())
    print()
Make the script executable (for example, with chmod +x defineterm.py), then add this line to ~/.bashrc:
alias defterm="/data/Scrape/google/defineterm.py "
Replace the path with the location of the script on your system, then run:
source ~/.bashrc
The program can then be started with:
defterm apple (or any other term)
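Since the tags and inline styles the script searches for can change, it may help to keep the parsing in a function that can be tried against a saved page (such as rawpage.txt) without requesting Google each time. A minimal sketch of that idea: extract_terms is a hypothetical helper name, the HTML fragment is made up for illustration, and the built-in html.parser is used only so the sketch does not need lxml.

```python
from bs4 import BeautifulSoup


def extract_terms(page_html):
    """Return the text of each numbered definition item found in page_html."""
    soup = BeautifulSoup(page_html, "html.parser")
    # Same lookup as in the script: list items styled as a decimal list.
    tags = soup.find_all("li", {"style": "list-style-type:decimal"})
    return [tag.get_text(strip=True) for tag in tags]


# Made-up fragment for illustration; a real saved page will be much larger.
sample = (
    '<ol><li style="list-style-type:decimal">a round fruit</li>'
    '<li style="list-style-type:decimal">the tree bearing it</li></ol>'
)
for number, term in enumerate(extract_terms(sample), start=1):
    print(f"{number}. {term}")
```

An empty list from extract_terms is a hint that the style attribute being searched for no longer matches the page.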
This is easily done using requests and bs4: you just need to pull the text from the div with the class lr_dct_ent.
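import requests
from bs4 import BeautifulSoup

# A browser-like User-Agent; without it Google may serve different markup.
h = {"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36"}
r = requests.get("https://www.google.ie/search?q=define+apple", headers=h).text
# Name the parser explicitly to avoid bs4's "no parser specified" warning.
soup = BeautifulSoup(r, "lxml")
# Split the definition box's text on ";" and print one piece per line.
print("\n".join(soup.select_one("div.lr_dct_ent").text.split(";")))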
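If Google changes its markup or serves a different page, select_one returns None and the .text access raises AttributeError. A minimal sketch of a guard, run here against made-up HTML fragments rather than a live response; definition_text is a hypothetical helper name, and html.parser is used only to keep the sketch free of the lxml dependency.

```python
from bs4 import BeautifulSoup


def definition_text(page_html, selector="div.lr_dct_ent"):
    """Return the text of the first element matching selector, or None."""
    soup = BeautifulSoup(page_html, "html.parser")
    box = soup.select_one(selector)
    if box is None:
        # The class name was not found; the page layout may have changed.
        return None
    return box.get_text(" ", strip=True)


# Made-up fragments for illustration; real Google markup will differ.
sample = '<div class="lr_dct_ent"><span>apple</span> <div>noun</div></div>'
print(definition_text(sample))
print(definition_text("<p>no definition box here</p>"))
```

Returning None leaves it to the caller to decide whether a missing box is an error or simply a term with no definition.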
The definitions are in an ordered list, and the part of speech (here, noun) is in the div with the lr_dct_sf_h class:
In [11]: r = requests.get("https://www.google.ie/search?q=define+apple", headers=h).text
In [12]: soup = BeautifulSoup(r,"lxml")
In [13]: div = soup.select_one("div.lr_dct_ent")
In [14]: n_v = div.select_one("div.lr_dct_sf_h").text
In [15]: expl = [li.text for li in div.select("ol.lr_dct_sf_sens li")]
In [16]: print(n_v)
noun
In [17]: print("\n".join(expl))
1. the round fruit of a tree of the rose family, which typically has thin green or red skin and crisp flesh.used in names of unrelated fruits or other plant growths that resemble apples in some way, e.g. custard apple, oak apple.
used in names of unrelated fruits or other plant growths that resemble apples in some way, e.g. custard apple, oak apple.
2. the tree bearing apples, with hard pale timber that is used in carpentry and to smoke food.
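The repeated sentence in the output above is consistent with nested list items: the descendant selector ol.lr_dct_sf_sens li also matches an li that sits inside another li, and the outer item's text already includes the nested text. Assuming the page nests sub-senses this way (the fragment below is made up to mimic that, not copied from Google), the child combinator > limits the match to top-level items. A minimal sketch:

```python
from bs4 import BeautifulSoup

# Made-up fragment that mimics a sense with a nested sub-sense.
page_html = """
<ol class="lr_dct_sf_sens">
  <li>the round fruit of a tree<ol><li>used in names of unrelated fruits</li></ol></li>
  <li>the tree bearing apples</li>
</ol>
"""
soup = BeautifulSoup(page_html, "html.parser")

# Descendant selector: matches top-level items and the nested item.
all_items = [li.get_text(" ", strip=True)
             for li in soup.select("ol.lr_dct_sf_sens li")]
# Child selector: matches only the direct children of the classed list.
top_items = [li.get_text(" ", strip=True)
             for li in soup.select("ol.lr_dct_sf_sens > li")]

print(len(all_items))
print(len(top_items))
```

The top-level item's text still contains the nested sub-sense, so this only removes the separate repeated line.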