I am running Selenium and PhantomJS to input search terms into a website and retrieve the number of hits for each search term. I have to do this 130,000+ times, so the code has been running nicely for a day until suddenly the program broke with the following error:
Traceback (most recent call last):
File "CBBPlyNwsScrape.py", line 82, in <module>
browser = webdriver.PhantomJS()
File "/Library/Python/2.7/site-packages/selenium/webdriver/phantomjs/webdriver.py", line 50, in __init__
self.service.start()
File "/Library/Python/2.7/site-packages/selenium/webdriver/phantomjs/service.py", line 69, in start
raise WebDriverException("Can not connect to GhostDriver")
selenium.common.exceptions.WebDriverException: Message: 'Can not connect to GhostDriver'
I'm running this on Mac OSX and Python 2.7.3. I have the latests versions of Selenium and PhantomJS installed. Can anyone tell me what is going on and why GhostDriver was working fine for so long and suddenly stopped?
In the ghostdriver.log
file, this is all it contains:
PhantomJS is launching GhostDriver...
[ERROR - 2013-12-01T05:14:34.491Z] GhostDriver - Main - Could not start Ghost Driver => {
"message": "Could not start Ghost Driver",
"line": 82,
"sourceId": 4445044288,
"sourceURL": ":/ghostdriver/main.js",
"stack": "Error: Could not start Ghost Driver\n at :/ghostdriver/main.js:82",
"stackArray": [
{
"sourceURL": ":/ghostdriver/main.js",
"line": 82
}
]
}
Thanks
Installing latest phantom js fixed this error, this was happening with default ubuntu 12.04 phantomjs destro
I was having the same problem. I don't know why the program has trouble calling the phantomJS webdriver, but the answer is to write a simple exception WebDriverException. This following code did the trick for me
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import Select
from selenium.common.exceptions import NoSuchElementException, WebDriverException
import unittest, time, re, urllib2
f = open("mother.txt","r") #opens file with name of "test.txt"
l = "1"
m = "2"
n = "3"
aTuple = ( l, m, n ) # create tuple
e = int(0)
for line in f:
e += 1
try:
h = str(e)
j = line
g = open("yes4/" + h + ".txt","w") #opens file with name of "test.txt"
for item in aTuple:
driver = webdriver.PhantomJS('phantomjs')
base_url = j + item
verificationErrors = []
accept_next_alert = True
driver.get(base_url)
elem=driver.find_element_by_id("yelp_main_body")
source_code=elem.get_attribute("outerHTML").encode('utf-8').strip()
g.write(source_code)
driver.quit()
except WebDriverException:
print "e"
h = str(e)
j = line
g = open("yes4/" + h + ".txt","w") #opens file with name of "test.txt"
for item in aTuple:
driver = webdriver.PhantomJS('phantomjs')
base_url = j + item
verificationErrors = []
accept_next_alert = True
driver.get(base_url)
elem=driver.find_element_by_id("yelp_main_body")
source_code=elem.get_attribute("outerHTML").encode('utf-8').strip()
g.write(source_code)
driver.quit()
else:
print h