Connecting to the .onion network with Python

Posted 2019-07-21 02:30

Question:

I want to reach .onion sites from the console with Python. The example below can use Tor from Python, but when I try to connect to a .onion site it fails with an error such as "Name or service not known". How do I fix this?

Sample Code:

import socket
import socks
import httplib

def connectTor():
    socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5,"127.0.0.1",9050,True)
    socket.socket = socks.socksocket
    print "Connected to tor"

def newIdentity():
    HOST = '127.0.0.1'
    socks.setdefaultproxy()
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((HOST,9051))
    s.send("AUTHENTICATE\r\n")
    response = s.recv(128)
    if response.startswith("250"):
        s.send("SIGNAL NEWNYM\r\n")
    s.close()
    connectTor()

def readPage(page):
    conn = httplib.HTTPConnection(page)
    conn.request("GET","/")
    response = conn.getresponse()
    print (response.read())

def main():
    connectTor()
    print "Tor Ip Address :"
    readPage("my-ip.heroku.com")
    print "\n\n"
    readPage("od6j46sy5zg7aqze.onion")
    return 0

if __name__ == '__main__':
    main()

Answer 1:

I think this is your problem, but I may be wrong.

You're relying on monkeypatching socket.socket to force HTTPConnection to use your SOCKS5 proxy to talk to Tor. But HTTPConnection calls socket.create_connection, which in turn calls socket.getaddrinfo to resolve the name before calling socket.socket to create the socket. getaddrinfo doesn't go through socket.socket, so it isn't patched: it never talks to your SOCKS5 proxy and uses your default name resolver instead.

This works fine for proxying connections to normal internet hosts, because Tor is going to return the same DNS result for "my-ip.heroku.com" as your normal name resolver. But it won't work for "od6j46sy5zg7aqze.onion", because there is no .onion TLD in your normal name resolver.
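To see that ordering for yourself without touching the network, here is a minimal sketch (Python 3, standard library only). The names trace_create_connection, RecordingSocket and failing_getaddrinfo, and the host "example.onion", are made up for illustration; the fake resolver simply stands in for a local resolver that knows nothing about .onion.

```python
import socket


def trace_create_connection(host, port):
    """Record which socket-module hooks create_connection touches, in order."""
    calls = []

    class RecordingSocket(socket.socket):
        # Stand-in for socks.socksocket: only notes that it was instantiated.
        def __init__(self, *args, **kwargs):
            calls.append("socket.socket")
            super().__init__(*args, **kwargs)

    def failing_getaddrinfo(name, *args, **kwargs):
        # Stand-in for a local resolver that cannot resolve .onion names.
        calls.append("getaddrinfo(%s)" % name)
        raise socket.gaierror(-2, "Name or service not known")

    saved = socket.socket, socket.getaddrinfo
    socket.socket, socket.getaddrinfo = RecordingSocket, failing_getaddrinfo
    try:
        socket.create_connection((host, port), timeout=1)
    except socket.gaierror:
        calls.append("gaierror")
    finally:
        # Always restore the real functions.
        socket.socket, socket.getaddrinfo = saved
    return calls


print(trace_create_connection("example.onion", 80))
```

The patched socket class never shows up in the trace: name resolution fails first, so the proxy-aware socket is never even created.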

If you're curious, you can see the source to HTTPConnection.connect, socket.create_connection, and getaddrinfo (the last in C, and scattered throughout the module depending on your platform).

So, how do you solve this? Well, looking at two of the SOCKS5 modules that are called socks, one has a function that could be directly monkeypatched in place of create_connection (its API is not identical, but it's close enough for what HTTPConnection needs); the other doesn't, but you could pretty easily write one (just call socks.socksocket and then call its connect method). Or you could modify HTTPConnection to create a socket.socket and call its connect method.
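As a rough sketch of the "write one yourself" option, assuming Python 3 (http.client instead of httplib) and the PySocks flavor of the socks module, something like this might do. tor_create_connection, TorHTTPConnection and the make_socket parameter are names I made up; the proxy address 127.0.0.1:9050 is taken from the question.

```python
import http.client

TOR_SOCKS_HOST = "127.0.0.1"  # from the question; adjust to your Tor setup
TOR_SOCKS_PORT = 9050


def tor_create_connection(address, timeout=None, make_socket=None):
    """Connect through the SOCKS5 proxy without resolving the host locally."""
    if make_socket is None:
        import socks  # PySocks (assumed); imported lazily so it stays optional

        def make_socket():
            sock = socks.socksocket()
            # rdns=True asks the proxy (Tor) to resolve the host name.
            sock.set_proxy(socks.SOCKS5, TOR_SOCKS_HOST, TOR_SOCKS_PORT, rdns=True)
            return sock

    sock = make_socket()
    if timeout is not None:
        sock.settimeout(timeout)
    # The (host, port) pair is passed through as-is; getaddrinfo is never called.
    sock.connect(address)
    return sock


class TorHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection whose connect() bypasses socket.create_connection."""

    def connect(self):
        # http.client uses a sentinel object for "no explicit timeout".
        timeout = self.timeout if isinstance(self.timeout, (int, float)) else None
        self.sock = tor_create_connection((self.host, self.port), timeout)
```

With Tor running, TorHTTPConnection("od6j46sy5zg7aqze.onion", 80) followed by the usual request("GET", "/") and getresponse() should then go through the proxy. The make_socket hook is only there so the function can be exercised with a fake socket.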

Finally, you may be wondering why most of the different socks modules have a setdefaultproxy function with a parameter named remote_dns that specifically claims to cause DNS resolution to be performed remotely, when that doesn't actually work. Well, it does work if you use a socks.socksocket, but it can't possibly work if you use socket.getaddrinfo.
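For what it's worth, "remote DNS" at the protocol level just means the SOCKS5 CONNECT request carries the host name itself (address type 0x03 in RFC 1928) rather than an IP address you resolved locally. A small sketch of how such a request is laid out; build_socks5_connect_request is a name I made up, and this illustrates the wire format only, not how any particular socks module implements it:

```python
import struct


def build_socks5_connect_request(host, port):
    """Build a SOCKS5 CONNECT request that leaves name resolution to the proxy."""
    name = host.encode("ascii")
    if len(name) > 255:
        raise ValueError("host name too long for a SOCKS5 request")
    # VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3 (domain name),
    # then a length-prefixed name and the port in network byte order.
    return b"\x05\x01\x00\x03" + bytes([len(name)]) + name + struct.pack(">H", port)


request = build_socks5_connect_request("example.onion", 80)
print(request)
```

Because the name travels inside the request, the local resolver is never involved, which is why this only helps when the connection is actually made by the SOCKS-aware socket.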

By the way, if you haven't read DnsResolver and TorifyHOWTO, read them before going any further, because just trying to slap together code that works without knowing why it works is almost guaranteed to lead to you (or your users) leaking information when you thought you were being anonymous.



Answer 2:

You can add port 80 to the onion address to avoid the DNS lookup, e.g. readPage("od6j46sy5zg7aqze.onion:80")

With urllib2 you also need to specify the protocol (i.e. http), e.g.

import urllib2

print urllib2.urlopen("http://od6j46sy5zg7aqze.onion:80").read()



Tags: python proxy tor