I'm trying to break a captcha in a form on a website, but the captcha is dynamic: it doesn't have a fixed image URL. Instead, its source looks like this:
src="captcha?accion=image"
What is the best option here? I have read about using middlewares or something similar. I also know it can be done with Selenium, Splash, or another browser driver (by taking a screenshot), but I want to do it with just Scrapy, if that's possible.
Here's a complete solution to bypass the specified captcha using anticaptcha and PIL. Because this captcha is generated dynamically, we need to grab a screenshot of the img element containing the captcha. For that we use save_screenshot() and PIL to crop and save <img name="imagen"... to disk (captcha.png). We then submit captcha.png to anti-captcha, which returns the solution.
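The screenshot-and-crop step described above can be sketched roughly as follows. This is a minimal illustration, not the original code: the helper names crop_box and save_captcha, the intermediate file page.png, and the scale parameter are hypothetical, and it assumes a Selenium 4 style driver plus Pillow. It also assumes the captcha sits inside the unscrolled viewport, so that the element's page coordinates match the screenshot's pixels. The upload to anti-captcha is left out because that part of the API is not shown here.

```python
import math


def crop_box(location, size, scale=1.0):
    """Return a PIL-style (left, upper, right, lower) box for an element.

    `location` and `size` are dicts shaped like Selenium's
    WebElement.location ({'x', 'y'}) and WebElement.size ({'width', 'height'}).
    `scale` is the device pixel ratio (hypothetical knob; 1.0 unless HiDPI).
    """
    left = math.floor(location["x"] * scale)
    upper = math.floor(location["y"] * scale)
    right = math.ceil((location["x"] + size["width"]) * scale)
    lower = math.ceil((location["y"] + size["height"]) * scale)
    return (left, upper, right, lower)


def save_captcha(driver, out_path="captcha.png", shot_path="page.png"):
    """Screenshot the page and crop <img name="imagen"> into out_path."""
    # Imported lazily so crop_box() stays usable without Selenium or Pillow.
    from PIL import Image
    from selenium.webdriver.common.by import By

    element = driver.find_element(By.NAME, "imagen")
    driver.save_screenshot(shot_path)  # screenshot of the current viewport
    box = crop_box(element.location, element.size)
    with Image.open(shot_path) as page:
        page.crop(box).save(out_path)
    return out_path
```

Calling save_captcha(driver) after loading the form page should leave captcha.png on disk, ready to be submitted to the solving service. Keeping the box arithmetic in its own function makes it easy to check the crop region without starting a browser.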
Notes:
anticaptcha is a paid service ($0.50 per 1,000 images).