As it says in the title, I am trying to access a URL through several different proxies sequentially (using a for loop). Right now this is my code:

    import requests
    import json
    with open('proxies.txt') as proxies:
        for line in proxies:
            proxy=json.loads(line)
            with open('urls.txt') as urls:
                for line in urls:
                    url=line.rstrip()
                    data=requests.get(url, proxies={'http':line})
                    data1=data.text
                    print data1

and my urls.txt file:

    http://api.exip.org/?call=ip

and my proxies.txt file:

    {"https": "84.22.41.1:3128"}
    {"http":"194.126.181.47:81"}
    {"http":"218.108.170.170:82"}

which I got from [www.hidemyass.com][1].
For some reason, the output is

    68.6.34.253
    68.6.34.253
    68.6.34.253

as if it is accessing that website through my own router's IP address. In other words, it is not going through the proxies I give it; it is just looping through and using my own address over and over again. What am I doing wrong?
Directly copied from another answer of mine.
Well, actually you can. I've done this with a few lines of code and it works pretty well.
I use it like this:
It's simple, but actually works for me.
According to this thread, you need to specify the `proxies` dictionary as `{"protocol" : "ip:port"}`, so each line of your proxies file should be a dictionary of that form.
EDIT: You're reusing `line` for both URLs and proxies. It's fine to reuse `line` in the inner loop, but you should be using `proxies=proxy`; you've already parsed the JSON and don't need to build another dictionary. Also, as abanert says, you should be doing a check to ensure that the protocol you're requesting matches that of the proxy. The reason the proxies are specified as a dictionary is to allow lookup for the matching protocol.
There are two obvious problems right here, in the call `requests.get(url, proxies={'http':line})`:
First, because you have a `for line in urls:` inside the `for line in proxies:`, `line` is going to be the current URL here, not the current proxy. And besides, even if you weren't reusing `line`, it would be the JSON string representation, not the dict you decoded from JSON.
Then, if you fix that to use `proxy`, instead of something like `{'https': '84.22.41.1:3128'}`, you're passing `{'http': {'https': '84.22.41.1:3128'}}`. And that obviously isn't a valid value.
To fix both of those problems, just do this:
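
A minimal sketch of that fix, assuming Python 3 and the one-JSON-object-per-line `proxies.txt` format from the question. The names `load_proxies` and `fetch_through_proxies` and the `get` parameter are hypothetical, introduced here only so the loop can be tried without touching the network; the essential change is passing the already-parsed `proxy` dict as `proxies=proxy` and not shadowing `line`.

```python
import json


def load_proxies(lines):
    """Parse one JSON object per non-empty line, e.g. {"http": "194.126.181.47:81"}."""
    return [json.loads(line) for line in lines if line.strip()]


def fetch_through_proxies(urls, proxies, get):
    """Request every URL once per proxy and collect the response bodies.

    `get` is assumed to behave like requests.get: it accepts a URL and a
    `proxies` keyword, and returns an object with a `.text` attribute.
    """
    results = []
    for proxy in proxies:      # proxy is already a dict like {"http": "ip:port"}
        for url in urls:       # separate loop names, so nothing is shadowed
            response = get(url, proxies=proxy)  # pass the parsed dict directly
            results.append(response.text)
    return results
```

With `requests` installed, this would presumably be called as `fetch_through_proxies(urls, load_proxies(open('proxies.txt')), requests.get)`, where `urls` is the list of stripped lines from `urls.txt`.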
Meanwhile, what happens when you have an HTTPS URL, but the current proxy is an HTTP proxy? You're not going to use the proxy. So you probably want to add something to skip over them, like this:
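
One way to sketch that check, assuming Python 3: compare the URL's scheme against the keys of the proxy dict and skip the pair when there is no entry for it. The helper names `proxy_matches` and `pairs_to_request` are hypothetical, not part of `requests`.

```python
from urllib.parse import urlparse


def proxy_matches(url, proxy):
    """Return True if the proxy dict has an entry for the URL's scheme."""
    return urlparse(url).scheme in proxy


def pairs_to_request(urls, proxies):
    """Yield (url, proxy) pairs, skipping proxies that would not be used."""
    for proxy in proxies:
        for url in urls:
            if proxy_matches(url, proxy):
                yield url, proxy
```

Each yielded pair can then be passed on as `requests.get(url, proxies=proxy)`. With the files from the question, the `{"https": ...}` proxy would be skipped for the `http://` URL, so only the two HTTP proxies would be tried.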