How to convert a string generated from a SHA-1 byte array back to the original bytes

Posted 2020-04-28 06:07

First of all, sorry for the bad English.

Well, I want to read the piece hashes from a torrent file. Currently I'm using the bencode library https://github.com/hyPiRion/java-bencode to decode the information, but my problem comes when I want to convert the pieces string to a byte array. The torrent file is encoded in UTF-8, but if I do

 byte[] bytepieces = piecestring.getBytes("UTF-8");

it gives, well, nothing really usable.

On the other hand, to compare, I tried going the other way and building the string instead of getting the bytes: I read the first piece of my file and calculated its SHA-1. If I convert the resulting 20-byte array to a string, it does match the first part of the big pieces string. But if I try to turn that generated string back into the 20 bytes that created it, I can't. How can I do this?

Little example:

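// Needs: import java.io.FileInputStream;
//        import java.security.MessageDigest;
FileInputStream fin = new FileInputStream("miFile");
byte[] array = new byte[512 * 1024]; // one piece of 512 KiB
fin.read(array, 0, 512 * 1024); // note: read() may return fewer bytes than requested
MessageDigest md = MessageDigest.getInstance("SHA-1");
byte[] sha1byte = md.digest(array); // the 20-byte SHA-1 hash
String s = new String(sha1byte, "UTF-8"); // the conversion in question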

After doing this, sha1byte.length is 20, which is OK, the correct size of a SHA-1 hash. But if I do s.getBytes("UTF-8").length, in my example I get... 33! Wuuut! I want to get my 20 bytes back from the generated string. How can I do this?

Well thanks :P

2 answers
家丑人穷心不美
#2 · 2020-04-28 06:30

Thanks guys for your answers, but I found the solution using https://github.com/bedeho/bencodej

The lib always loads the bencode data as byte arrays wrapped in custom classes, so it keeps a 1:1 mapping with the byte strings :p Thanks for all.

Evening l夕情丶
#3 · 2020-04-28 06:32

I'm storing binary data as strings, because the bencode format in .torrent files stores that binary data as strings

Bencode "strings" are sequences of bytes, not sequences of Unicode code points. Therefore a language's representation of bytes (byte[] or ByteBuffer in Java) is appropriate, and the data should only be interpreted as a UTF-8 string in the specific cases where it actually contains something meant to be human-readable.
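To illustrate why the round trip through a String loses information, here is a minimal sketch. The class name RoundTripDemo and the helper roundTrip are made up for this example, and the three input bytes are just an arbitrary sample containing values that cannot occur in valid UTF-8. Decoding such bytes as UTF-8 substitutes a replacement character, so encoding the string again does not give the original bytes back. ISO-8859-1 maps every byte value to exactly one char, so it happens to round-trip.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RoundTripDemo {
    // Hypothetical helper: decode the bytes with a charset, then encode them back.
    static byte[] roundTrip(byte[] raw, Charset cs) {
        return new String(raw, cs).getBytes(cs);
    }

    public static void main(String[] args) {
        // 0xFF and 0xFE never appear in well-formed UTF-8; 0x41 is 'A'.
        byte[] raw = {(byte) 0xFF, (byte) 0xFE, 0x41};

        byte[] viaUtf8 = roundTrip(raw, StandardCharsets.UTF_8);
        byte[] viaLatin1 = roundTrip(raw, StandardCharsets.ISO_8859_1);

        // The UTF-8 round trip changes both length and content.
        System.out.println(raw.length + " bytes in, " + viaUtf8.length + " bytes out");
        System.out.println(Arrays.equals(raw, viaUtf8));   // prints false
        System.out.println(Arrays.equals(raw, viaLatin1)); // prints true
    }
}
```

Going through ISO-8859-1 is only a workaround for a library that insists on handing back a String. Keeping the data as byte[] from the start avoids the problem entirely.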

So you should use a bencoding library that supports extraction of the raw bytes.
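As a rough sketch of what working with the raw bytes looks like (this is not the API of any particular bencode library; the class PiecesSketch and all of its methods are hypothetical names for illustration), the code below reads one bencode byte string of the form <length>:<bytes> straight from a byte array, slices out a 20-byte piece hash, and compares it with a freshly computed SHA-1 using Arrays.equals. No String conversion of the hash is involved.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class PiecesSketch {
    static final int SHA1_LEN = 20;

    // Reads one bencode byte string ("<decimal length>:<raw bytes>") starting at offset.
    // Only the length prefix is treated as text; the payload stays raw bytes.
    static byte[] readByteString(byte[] data, int offset) {
        int colon = offset;
        while (data[colon] != ':') colon++;
        String lenText = new String(data, offset, colon - offset, StandardCharsets.US_ASCII);
        int len = Integer.parseInt(lenText);
        return Arrays.copyOfRange(data, colon + 1, colon + 1 + len);
    }

    // Returns the index-th 20-byte hash from the concatenated "pieces" value.
    static byte[] pieceHash(byte[] pieces, int index) {
        return Arrays.copyOfRange(pieces, index * SHA1_LEN, (index + 1) * SHA1_LEN);
    }

    static byte[] sha1(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-1").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-1 is required on every Java platform
        }
    }

    // True if the SHA-1 of pieceData equals the stored hash at the given index.
    static boolean pieceMatches(byte[] pieces, int index, byte[] pieceData) {
        return Arrays.equals(pieceHash(pieces, index), sha1(pieceData));
    }

    public static void main(String[] args) {
        // Stand-in for a real piece of the file.
        byte[] content = "hello".getBytes(StandardCharsets.US_ASCII);

        // Build the bencode value "20:<digest>" by hand, as it would appear in a .torrent.
        byte[] encoded = new byte[3 + SHA1_LEN];
        encoded[0] = '2';
        encoded[1] = '0';
        encoded[2] = ':';
        System.arraycopy(sha1(content), 0, encoded, 3, SHA1_LEN);

        byte[] pieces = readByteString(encoded, 0);
        System.out.println(pieces.length);                     // prints 20
        System.out.println(pieceMatches(pieces, 0, content));  // prints true
    }
}
```

A real .torrent file nests this value inside dictionaries, so in practice the library's parser would locate the pieces entry. The point here is only that the value is handed over as byte[] and compared byte for byte.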
