Send DHT queries to “router.bittorrent.com” respon

Posted 2020-04-23 04:58

Question:

I read the DHT protocol on the bep_0005 page.

But when I send a ping query or a find_node query, the server responds with garbled text (this happens with both router.bittorrent.com:6881 and dht.transmissionbt.com:6881).

Here is the Java source code:

    public String ping(final String id) {
        System.out.println("Start ping:" + id);
        Bencode bencode = new Bencode();
        byte[] encoded = bencode.encode(new HashMap<Object, Object>() {
            private static final long serialVersionUID = 4225164001818744013L;

            {
                put("t", "tr");
                put("y", "q");
                put("q", "ping");
                put("a", new HashMap<Object, Object>() {
                    private static final long serialVersionUID = -6092073963971093460L;

                    {
                        put("id", id);
                    }
                });
            }
        });
        byte[] result = client.send(new String(encoded, bencode.getCharset()));
        Map<String, Object> dict = bencode.decode(result, Type.DICTIONARY);
        System.out.println("Bdecoded Data:" + dict);
        return "";
    }

Sent packet:

ping Query = {"t":"aa", "y":"q", "q":"ping", "a":{"id":"abcdefghij0123456789"}}

bencoded = d1:ad2:id20:abcdefghij0123456789e1:q4:ping1:t2:aa1:y1:qe

According to bep_0005, the response should look like this:

Response = {"t":"aa", "y":"r", "r": {"id":"mnopqrstuvwxyz123456"}}

bencoded = d1:rd2:id20:mnopqrstuvwxyz123456e1:t2:aa1:y1:re

But my response is:

Response = {ip=��P���, r={id=2�NisQ�J�)ͺ����F|�g}, t=tr, y=r}

bencoded = d2:ip6:��P���1:rd2:id20:2�NisQ�J�)ͺ����F|�ge1:t2:tr1:y1:re

The Java code that sends the UDP packet is:

    public byte[] send(String sendData) {
        DatagramSocket client;
        try {
            client = new DatagramSocket();
            client.setSoTimeout(5000);
            byte[] sendBuffer;
            sendBuffer = sendData.getBytes();
            InetAddress addr = InetAddress.getByName("router.bittorrent.com");
            int port = 6881;
            DatagramPacket sendPacket = new DatagramPacket(sendBuffer, sendBuffer.length, addr, port);
            client.send(sendPacket);
            byte[] receiveBuf = new byte[512];
            DatagramPacket receivePacket = new DatagramPacket(receiveBuf, receiveBuf.length);
            client.receive(receivePacket);
            System.out.println("Client Source Data:" + Arrays.toString(receivePacket.getData()));
            String receiveData = new String(receivePacket.getData(), "UTF-8");
            System.out.println("Client String Data:" + receiveData);
            client.close();
            return receivePacket.getData();
        } catch (SocketException e) {
            e.printStackTrace();
        } catch (UnknownHostException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }

I read the response as UTF-8, but decoding it as ISO-8859-1 also gives garbled text.

Can anyone help? Thanks!

Answer 1:

the server response a garbled text

No, the response is bencoded and contains raw binary data.
It CAN NOT be treated as text.

In BEP5, to make the raw binary node_id in the examples printable, it has cleverly been chosen to consist of only alphanumeric characters.
See:
Bittorrent KRPC - Why are node ID's half the size of an info_hash and use every character a-z?

The ip key is an extension explained in: BEP42 - DHT Security extension
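To make the 6-byte ip value from the response readable, it can be unpacked by hand. The sketch below assumes the IPv4 compact form described in BEP42 (4 address bytes followed by a 2-byte big-endian port); the class and method names are hypothetical and not part of any library.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;

class CompactIp {
    // Assumption: 'ip' holds 4 bytes of IPv4 address followed by a 2-byte big-endian port.
    static String describe(byte[] ip) throws UnknownHostException {
        if (ip.length != 6) {
            throw new IllegalArgumentException("expected 6 bytes, got " + ip.length);
        }
        InetAddress addr = InetAddress.getByAddress(Arrays.copyOfRange(ip, 0, 4));
        // Mask with 0xFF because Java bytes are signed.
        int port = ((ip[4] & 0xFF) << 8) | (ip[5] & 0xFF);
        return addr.getHostAddress() + ":" + port;
    }

    public static void main(String[] args) throws UnknownHostException {
        // Made-up sample value using a documentation address, not data from a real response.
        byte[] sample = {(byte) 192, 0, 2, 1, 0x1A, (byte) 0xE1};
        System.out.println(describe(sample));
    }
}
```

The value describes the requester's own external address and port as seen by the responding node, so it is expected to look like random bytes when printed as text.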

The received response is fully valid.

TODO: Working Java code
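One possible shape for such code is sketched below. It is not the answerer's code: the class name DhtPing, the helper names, the 1500-byte receive buffer, and the random node ID are all assumptions made for illustration. The idea is to keep everything as byte[] from end to end. The query is assembled as bytes rather than passed through a String, and the reply is trimmed to the received length instead of returning the whole buffer, two places where the question's send method may alter or pad the data.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;

class DhtPing {
    // Builds the bencoded ping query by hand; keys are already in sorted order (a, q, t, y).
    static byte[] buildPing(byte[] transactionId, byte[] nodeId) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeAscii(out, "d1:ad2:id" + nodeId.length + ":");
        out.write(nodeId, 0, nodeId.length);
        writeAscii(out, "e1:q4:ping1:t" + transactionId.length + ":");
        out.write(transactionId, 0, transactionId.length);
        writeAscii(out, "1:y1:qe");
        return out.toByteArray();
    }

    private static void writeAscii(ByteArrayOutputStream out, String s) {
        byte[] b = s.getBytes(StandardCharsets.US_ASCII);
        out.write(b, 0, b.length);
    }

    // Sends one datagram and returns exactly the bytes that were received.
    static byte[] exchange(byte[] query, String host, int port) throws IOException {
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(5000);
            InetAddress addr = InetAddress.getByName(host);
            socket.send(new DatagramPacket(query, query.length, addr, port));
            byte[] buf = new byte[1500]; // assumed to be large enough for a ping reply
            DatagramPacket reply = new DatagramPacket(buf, buf.length);
            socket.receive(reply);
            // Trim to the real payload; getData() alone returns the whole buffer.
            return Arrays.copyOfRange(reply.getData(), reply.getOffset(),
                    reply.getOffset() + reply.getLength());
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] nodeId = new byte[20];
        new SecureRandom().nextBytes(nodeId); // raw 20-byte ID, not text
        byte[] tid = "aa".getBytes(StandardCharsets.US_ASCII);
        byte[] reply = exchange(buildPing(tid, nodeId), "router.bittorrent.com", 6881);
        System.out.println("Reply bytes: " + Arrays.toString(reply));
    }
}
```

The returned byte[] can then be handed to a bdecoder; the id and ip values it contains are binary and are best displayed as hex or as a byte array rather than as a String.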


Answer 2:

Map<String, Object> dict = bencode.decode(result, Type.DICTIONARY);

This gives you the decoded root dictionary of the message as a Map. Within it you should find the r dictionary as another map, and within that map the id value. The type of the id value depends on the bdecoding library you are using.

If it is a ByteBuffer or byte[], then you should have 20 bytes that you can hex-encode (to 40 characters) if you need it to be human-readable. The DHT protocol deals in raw hashes, not hex values.
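As a minimal sketch of that hex-encoding step (the class and method names are hypothetical), each byte becomes two hex digits, so a 20-byte ID yields 40 characters:

```java
import java.nio.charset.StandardCharsets;

class NodeIdHex {
    // Converts raw bytes to lowercase hex, two digits per byte.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            // Mask with 0xFF so negative bytes are not sign-extended.
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The printable example ID from BEP5; real IDs are arbitrary bytes.
        byte[] id = "mnopqrstuvwxyz123456".getBytes(StandardCharsets.US_ASCII);
        System.out.println(toHex(id));
    }
}
```

If the library returns a ByteBuffer instead, the bytes would first need to be copied out into a byte[] before calling a helper like this.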

If it is a String, then you will have to convert the string back into byte[] before hex-encoding it. That is only possible when the bdecoder used ISO 8859-1 to decode, because that charset is round-trip safe for arbitrary byte sequences while UTF-8 is not.
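The round-trip difference can be checked with a small experiment. This is only an illustrative sketch with hypothetical names: it decodes every possible byte value into a String and encodes it back with the same charset, then compares the result with the input.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CharsetRoundTrip {
    // True if bytes -> String -> bytes gives back the original bytes.
    static boolean survives(byte[] raw, Charset cs) {
        byte[] back = new String(raw, cs).getBytes(cs);
        return Arrays.equals(raw, back);
    }

    public static void main(String[] args) {
        // All 256 byte values, standing in for arbitrary binary data such as a node ID.
        byte[] raw = new byte[256];
        for (int i = 0; i < raw.length; i++) {
            raw[i] = (byte) i;
        }
        System.out.println("ISO-8859-1 intact: " + survives(raw, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8 intact:      " + survives(raw, StandardCharsets.UTF_8));
    }
}
```

The ISO-8859-1 check is expected to pass because every byte maps to exactly one character, while the UTF-8 check is expected to fail because invalid sequences are replaced during decoding and cannot be recovered.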