I have Scrapy and scrapy-splash set up on an AWS Ubuntu server. It works fine for a while, but after a few hours I start getting error messages like this:
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.5/site-packages/twisted/internet/defer.py", line 1384, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/ubuntu/.local/lib/python3.5/site-packages/twisted/python/failure.py", line 393, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/ubuntu/.local/lib/python3.5/site-packages/scrapy/core/downloader/middleware.py", line 43, in process_request
    defer.returnValue((yield download_func(request=request,spider=spider)))
twisted.internet.error.ConnectionRefusedError: Connection was refused by other side: 111: Connection refused.
When this happens, I find that the Splash process in Docker has either terminated or become unresponsive.
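To tell those two states apart, a small probe along these lines could be used. This is only a sketch under assumptions: it assumes Splash is reachable at http://localhost:8050 and that its /_ping health endpoint is available, and probe_splash is a hypothetical helper name, not part of Scrapy or scrapy-splash.

```python
import socket
import urllib.error
import urllib.request


def probe_splash(base_url="http://localhost:8050", timeout=5.0):
    """Return 'up', 'refused', 'unresponsive', or 'error' for a Splash instance.

    Hypothetical helper: base_url and the /_ping path are assumptions
    about the local setup.
    """
    # Ignore any proxy settings from the environment; Splash is assumed local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(base_url + "/_ping", timeout=timeout) as resp:
            return "up" if resp.status == 200 else "error"
    except urllib.error.HTTPError:
        # The server answered, but with an HTTP error status.
        return "error"
    except urllib.error.URLError as exc:
        if isinstance(exc.reason, ConnectionRefusedError):
            # Nothing is listening on the port: the process is likely gone.
            return "refused"
        if isinstance(exc.reason, socket.timeout):
            return "unresponsive"
        return "error"
    except socket.timeout:
        # The connection was accepted but no response arrived in time.
        return "unresponsive"
    except OSError:
        # Other socket-level failures, e.g. the connection being reset.
        return "error"


if __name__ == "__main__":
    print(probe_splash())
```

A result of 'refused' would correspond to the terminated case (the same condition as the ConnectionRefusedError in the traceback), while 'unresponsive' would mean the port still accepts connections but Splash does not answer within the timeout.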
I've been running the Splash process with:
sudo docker run -p 8050:8050 scrapinghub/splash
as per the scrapy-splash instructions.
I tried starting the process in a tmux session to make sure the SSH connection was not interfering with the Splash process, but that made no difference.
Thoughts?