I want to open many URLs: I open one URL, find all the links on that website, and then open those as well (or download images etc. from those links). First I wanted to check whether each URL is valid, so I used an if statement:
if not urlparse.urlparse(link).netloc:
    return 'broken url'
But I noticed that some values slipped past this check. I came across a website where a link looked like //b.thumbs.redditmedia.com/7pTYj4rOii6CkkEC.jpg, and I got an error:
ValueError: unknown url type: //b.thumbs.redditmedia.com/7pTYj4rOii6CkkEC.jpg
My if statement didn't catch that.
How can I check more precisely whether a URL works?
Pretty simple:
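A link such as //b.thumbs.redditmedia.com/7pTYj4rOii6CkkEC.jpg is protocol-relative: it has a netloc but no scheme, so a netloc-only test lets it through. A minimal sketch that checks both parts, assuming Python 3's `urllib.parse` (the question uses the Python 2 `urlparse` module); `is_absolute_url` is just an illustrative name:

```python
from urllib.parse import urlparse


def is_absolute_url(link):
    """Return True only if the link has both a scheme and a host."""
    parts = urlparse(link)
    # '//host/path' parses with an empty scheme but a non-empty netloc,
    # so both fields have to be checked.
    return bool(parts.scheme and parts.netloc)


print(is_absolute_url("https://www.reddit.com/"))
print(is_absolute_url("//b.thumbs.redditmedia.com/7pTYj4rOii6CkkEC.jpg"))
```

In the code from the question this would replace the condition, i.e. `if not is_absolute_url(link): return 'broken url'`.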
You can also read the whole document by
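A minimal sketch of reading a whole document in one go, assuming Python 3 and `urllib.request.urlopen`. The helper name `read_document`, the 10 second timeout and the UTF-8 fallback are assumptions for illustration. The demo uses a `data:` URL so it runs without network access; an http(s) URL is read the same way.

```python
from urllib.request import urlopen


def read_document(url, timeout=10):
    """Fetch a URL and return its entire body as text."""
    with urlopen(url, timeout=timeout) as response:
        # Use the charset announced by the server, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# A data: URL keeps the example offline.
print(read_document("data:text/plain,hello"))
```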
Generally if you want to download all images from an HTML document, you can do something like this:
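One possible shape for this, as a sketch using only the Python 3 standard library: collect every `<img src=...>` with `html.parser`, resolve each one against the page URL with `urljoin` (which also turns protocol-relative links like `//host/a.jpg` into full URLs), then fetch them. The names `ImageCollector`, `extract_image_urls` and `download_images` are made up for this example, and the file naming is a simplification.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def extract_image_urls(html, base_url):
    """Return absolute URLs for all images found in the HTML text."""
    collector = ImageCollector()
    collector.feed(html)
    return [urljoin(base_url, src) for src in collector.sources]


def download_images(page_url, target_dir="."):
    """Download every image referenced by the page; return saved paths."""
    with urlopen(page_url, timeout=10) as response:
        html = response.read().decode("utf-8", errors="replace")
    saved = []
    for index, image_url in enumerate(extract_image_urls(html, page_url)):
        # Fall back to a generated name if the URL path has no file name.
        name = os.path.basename(urlparse(image_url).path) or "image_%d" % index
        path = os.path.join(target_dir, name)
        with urlopen(image_url, timeout=10) as image, open(path, "wb") as out:
            out.write(image.read())
        saved.append(path)
    return saved
```

Calling `download_images(page_url)` with the address of the page you are crawling performs the actual network requests; `extract_image_urls` can be tried on any HTML string without touching the network.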
If you aren't particular about which library is used, you could do the following:
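One such option, sketched here with the standard library only (Python 3 assumed): instead of inspecting the string, try to open the URL and treat any failure as a broken link. `url_opens` is a hypothetical helper name and the timeout value is arbitrary.

```python
from urllib.request import urlopen


def url_opens(url, timeout=10):
    """Return True if the URL can be opened, False otherwise."""
    try:
        with urlopen(url, timeout=timeout):
            return True
    except (ValueError, OSError):
        # ValueError: malformed URL such as a missing scheme.
        # OSError: covers URLError, HTTPError and timeouts.
        return False
```

Note that for a real http(s) URL this performs a network request, so it is slower than a purely syntactic check but also catches dead hosts and HTTP errors.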
Check your URLs:
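A rough sketch of a syntactic check over a batch of links, assuming Python 3. It sorts each link into one of three buckets; the labels and the function name `classify_link` are invented for this example, and the list of links is sample data.

```python
from urllib.parse import urlparse


def classify_link(link):
    """Label a link as 'ok', 'protocol-relative' or 'broken'."""
    parts = urlparse(link)
    if parts.scheme in ("http", "https") and parts.netloc:
        return "ok"
    if not parts.scheme and parts.netloc:
        # e.g. //host/path: usable once a scheme is put in front of it
        return "protocol-relative"
    return "broken"


links = [
    "https://www.reddit.com/",
    "//b.thumbs.redditmedia.com/7pTYj4rOii6CkkEC.jpg",
    "not a url",
]
report = {link: classify_link(link) for link in links}
for link, label in report.items():
    print(label, link)
```

Links in the protocol-relative bucket are not really broken; they only need the scheme of the page they were found on.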
Check by opening every single link:
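A sketch of that idea with the Python 3 standard library: pull every `<a href=...>` out of a page, resolve it against the page URL, and try to open each result. `LinkCollector`, `extract_links` and `check_links` are illustrative names, not part of any library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_links(html, base_url):
    """Return absolute URLs for all links found in the HTML text."""
    collector = LinkCollector()
    collector.feed(html)
    return [urljoin(base_url, href) for href in collector.hrefs]


def check_links(links, timeout=10):
    """Try to open each link; map link -> True (opened) or False (failed)."""
    results = {}
    for link in links:
        try:
            with urlopen(link, timeout=timeout):
                results[link] = True
        except (ValueError, OSError):
            # Malformed URL, unreachable host, HTTP error or timeout.
            results[link] = False
    return results
```

Typical use would be `check_links(extract_links(html, page_url))`, where `html` is the text of the page already downloaded from `page_url`. Every http(s) link costs one request, so this can be slow on pages with many links.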
Check an invalid URL:
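For instance, the link from the question can be fed straight to `urlopen` to see how it fails. This is a small demonstration assuming Python 3; the URL is rejected while the request is being built, before any connection is attempted, so it runs offline.

```python
from urllib.request import urlopen

bad_link = "//b.thumbs.redditmedia.com/7pTYj4rOii6CkkEC.jpg"

try:
    urlopen(bad_link, timeout=10)
except ValueError as exc:
    # No scheme, so urllib cannot decide which protocol to use.
    error_message = str(exc)
else:
    error_message = None

print(error_message)
```

Catching `ValueError` around the open call is therefore enough to stop links like this from crashing the crawl.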