I am trying to use selenium/phantomjs with scrapy and I'm riddled with errors. For example, take the following code snippet:
    def parse(self, response):
        while True:
            try:
                driver = webdriver.PhantomJS()
                # do some stuff
                driver.quit()
                break
            except (WebDriverException, TimeoutException):
                try:
                    driver.quit()
                except UnboundLocalError:
                    print("Driver failed to instantiate")
                time.sleep(3)
                continue
Much of the time the driver seems to fail to instantiate (so driver is unbound, hence the UnboundLocalError), and I get the following message (along with the print message I put in):
Exception AttributeError: "'Service' object has no attribute 'process'" in <bound method Service.__del__ of <selenium.webdriver.phantomjs.service.Service object at 0x7fbb28dc17d0>> ignored
Googling around, it seems everyone suggests updating phantomjs, which I have (1.9.8
built from source). Would anyone know what else could be causing this problem and a suitable diagnosis?
The reason for this behavior is how the PhantomJS driver's Service class is implemented. It defines a __del__ method that calls self.stop():
    def __del__(self):
        # subprocess.Popen doesn't send signal on __del__;
        # we have to try to stop the launched process.
        self.stop()
self.stop(), in turn, assumes the service instance is still alive when it accesses its attributes:
    def stop(self):
        """
        Cleans up the process
        """
        if self._log:
            self._log.close()
            self._log = None
        #If its dead dont worry
        if self.process is None:
            return
        ...
The same exact problem is perfectly described in this thread:
- Python attributeError on __del__
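To make the failure mode concrete, here is a minimal sketch that does not use Selenium at all. BrokenService and stop_partially_built are hypothetical names invented for this illustration; the only assumption is that the object reaches stop() before its process attribute has ever been assigned, which is what the traceback above suggests.

```python
class BrokenService(object):
    """Hypothetical stand-in for a service whose setup can fail early."""

    def __init__(self, executable):
        if executable is None:
            # Fails before self.process is ever assigned.
            raise RuntimeError("executable not found")
        self.process = object()  # placeholder for a real subprocess handle

    def stop(self):
        # Assumes setup completed; raises AttributeError otherwise.
        if self.process is None:
            return
        self.process = None

    def __del__(self):
        self.stop()


def stop_partially_built():
    """Call stop() on an instance whose __init__ never ran.

    Returns the AttributeError message, or None if nothing was raised.
    """
    half_built = object.__new__(BrokenService)  # skips __init__ on purpose
    try:
        half_built.stop()
    except AttributeError as exc:
        return str(exc)
    finally:
        # Give the object the attribute so its own __del__ stays quiet.
        half_built.process = None
    return None


if __name__ == "__main__":
    print(stop_partially_built())
```

When the same thing happens inside __del__ rather than in an explicit call, Python cannot propagate the exception, so it only reports it as ignored, which matches the blurb in the question.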
What you should do is silently ignore the AttributeError raised while quitting the driver instance:
    try:
        driver.quit()
    except AttributeError:
        pass
The problem was introduced by this revision, which means that downgrading to 2.40.0 would also help.
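The try/except above can be folded into a small helper, and the retry loop from the question can be restructured so that driver is always bound before it is used. This is only a sketch: quit_quietly and run_with_retries are hypothetical names, and make_driver, work, and retry_on are parameters you would supply yourself (for example webdriver.PhantomJS and the two Selenium exception classes from the question).

```python
import time


def quit_quietly(driver):
    """Call driver.quit(), ignoring AttributeError. True if it quit cleanly."""
    if driver is None:
        return False
    try:
        driver.quit()
    except AttributeError:
        # The service was never fully started; nothing left to clean up.
        return False
    return True


def run_with_retries(make_driver, work, attempts=3, delay=0.0,
                     retry_on=(Exception,)):
    """Create a driver, run work(driver), and retry on the given exceptions."""
    for _ in range(attempts):
        driver = None  # bound before the try, so no UnboundLocalError later
        try:
            driver = make_driver()
            return work(driver)
        except retry_on:
            time.sleep(delay)
        finally:
            # Runs on success, on a retryable error, and on any other error.
            quit_quietly(driver)
    raise RuntimeError("all %d attempts failed" % attempts)
```

Binding driver to None first removes the need to catch UnboundLocalError, and the finally clause guarantees a single cleanup path for every outcome.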
I had this problem because phantomjs was not available to the script (it was not on the PATH).
You can check this by running phantomjs in a console.
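The same check can be done from inside the script before the driver is created. A minimal sketch, assuming Python 3.3 or newer (where shutil.which exists); find_on_path is a hypothetical helper name.

```python
import shutil


def find_on_path(name="phantomjs"):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_on_path()
    if location is None:
        print("phantomjs is not on PATH for this process")
    else:
        print("phantomjs found at " + location)
```

If the binary lives somewhere that is not on the PATH, the PhantomJS driver also accepts its location through the executable_path argument, as far as I know.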
Selenium version 2.44.0 on PyPI needs the following patch in Service.__init__ of selenium.webdriver.phantomjs.service:

    self.process = None
I was thinking of submitting a patch, but this already exists in the most recent version on Google Code.
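The idea behind that one-line patch can be shown with a stand-alone class. This is a sketch of the pattern, not Selenium's actual code: PatchedService is a hypothetical name, and the object() placeholder stands in for the real subprocess handle.

```python
class PatchedService(object):
    """Hypothetical service that assigns defaults before anything can fail."""

    def __init__(self, executable):
        # Defaults first, so stop() and __del__ are safe even if the
        # rest of __init__ (or a later start()) raises.
        self.process = None
        self._log = None
        if executable is None:
            raise RuntimeError("executable not found")
        self.executable = executable

    def start(self):
        self.process = object()  # placeholder for a real subprocess handle

    def stop(self):
        """Clean up; return True only if a running process was stopped."""
        if self._log:
            self._log.close()
            self._log = None
        if self.process is None:
            return False
        self.process = None
        return True

    def __del__(self):
        self.stop()
```

Because process exists from the first line of __init__, a failed construction is cleaned up silently instead of triggering the ignored AttributeError.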