Who can decode this code?

Posted 2019-03-28 10:54

Question:

Here are a few samples of strange code I see in our access logs. Can anyone decode this?

For example:

\xb3\xe1\xdd=H\t\xd5\xd2\xf0ml\xf1\x10\xee/\xa0$\xeaY\xa5\xe7\x81d \xd5\x1f\xd9 QI\xd9\'\xfb4I\xb8\xf3\x1d0:\xb5i\x18Q\x02\xa5\x10$\xdd\xcf\xfa\xc2\xfa\x15\xd0\xa8\xa5\xfc\xb2\xda\xb9\x9bA_\x89\xc4~\x0e\x0ebg*>\x18\x12\x9aniA\xf6\xfc\x85%]\x1d\xa6\x16\xfe\x96\x13\xe1\xd8\xb2\xf3i~\xde\xec6\xdbgW\xc3c\xac2\x7f\x9f&\xa5\xce\x14B8~8\xbe\xff1\xa8\xe6\x9a\x9d\xf7 \x14\x10\x9d\xce\xda\x06\x93r\xe7\x86\x98\xa1\x85^\xfa\x93\xf1\x94G\x95\xc0\x1b\xc9\x81\xcb<\x04/\x836E\x85\xbd\xae%\x07D\xe9j\x80\x7f=\xccWW\x04.\xbe\x0f\xb6\x8c

Now, if we leave out all the unreadable characters we get:

=H\tml/$Yd  QI'4I0:iQ$A_~bg*>niA%]i~6gWc2&B8~81 r^G</6E%Dj=WW.

Could the "H\tml" part near the beginning suggest that the code above contains some HTML, or is it just a coincidence?

Here are a few more samples:

\xbdl\x1cq\x1e\xf65\xe3@3\xd8E\xa8\xf7\xc0e\x10\xfe\x15\xbfzhap\xff\xe6i\x9cq\xe3bGm\x81DWQ\xf5\x94\xbav~\\\xaa\xd0\xed\xdfl\x028\x1d\xcds\x07H\x02\x04\xf2\x8fU\xe0\xd6x,\x9f\x98)\xe8\x1c \xc7\xdd\xd7\xea\xd0\x12h^\xb4\xd0\x85G\xdb\xe4 \xe6\xabYM\xf36\"<\xb6\x1e\xeak]\x93\xc2D\xfa\xc4\xe9\xa93,b\xf5\x80\x15\x92L5\x02\xc3GY\xa7k\x7f\xa2\xfd}\xa2%+\x14\xf5\xe8\x95\x1f\xe2\xef\xd41

st|]%Y\xbf\xeaj\xe9<z\xbb\xfb\xe76\xbbf>\xe9\x1dU{\xaf\x97\x1b\x9e\xf3&\x9b\x87t{\xf3O0\x8c`TQ\xdc\xbd.\xee\xff\x9cEG\xabU\xc5 \xfc[\xe0\x0f\xa5jK\x85\x92\xb2\x90\x96E\xba\x9c\x9c\xa5\xccA`\v\xa0\xd7>3\t\x89u\x11\x817\xa5\xb2\x83\xfa\x89A\x14\x07\xe1\xc4>\"\xb4\x02m\xe4\x9eZ\x9b>\xb0\xe5\x9c\x15\xa0p\xado:\xb4\x1d\x1a\xb7\xb1\x1c\x0f\xa3\xadz-\xdc\xb5q\xb9\xfc\xb95g\xb8\xa8 \xd2t\xa3\x90\xe7N\xa7e \x15I\xe6\x1b\xdbNB5\xfa3\xed\xfdG\t\x19(\xe1\x9f

wo\x01\xb9\x98\xa6q.\x0c&\xba\x1dnXN\xce\xb7\xd3\x99\xfd\x12>*\xa5\x89\xc9\xb2 lQ\x89\xcc\x9f\x113+\xb5\xc4\x86\xb6g\x97\x15]\x98g\xc1\xa1\xa8\xfeK\x03\xb5w\xe4\xf8&\xc8`1\x8c\x1c\x88\x82\xc2]\x8d&\xbc\x8cU&4\xc5[jS \xb0\xed\xf7m{\x95i

\xbdl\x1cq\x1e\xf65\xe3@3\xd8E\xa8\xf7\xc0e\x10\xfe\x15\xbfzhap\xff\xe6i\x9cq\xe3bGm\x81DWQ\xf5\x94\xbav~\\\xaa\xd0\xed\xdfl\x028\x1d\xcds\x07H\x02\x04\xf2\x8fU\xe0\xd6x,\x9f\x98)\xe8\x1c \xc7\xdd\xd7\xea\xd0\x12h^\xb4\xd0\x85G\xdb\xe4 \xe6\xabYM\xf36\"<\xb6\x1e\xeak]\x93\xc2D\xfa\xc4\xe9\xa93,b\xf5\x80\x15\x92L5\x02\xc3GY\xa7k\x7f\xa2\xfd}\xa2%+\x14\xf5\xe8\x95\x1f\xe2\xef\xd41

We see such codes often in the logs, something like millions of times a day. It makes me very curious about their contents :))

More code is also available at http://pastebin.com/ZcXM5NHs

Answer 1:

This is definitely trying to exploit a supposed buffer overflow vulnerability in your server. I guess it is x86 code. You can decode the strings in PHP, for example:

<?php echo("\xbdl\x1cq\x1e\xf65\xe3@3...");

If you write the output to a file, you can open it in a disassembler and look at the assembly instructions, although I don't think you will get any valuable information from them.

These are sweep attacks; there is little chance that someone is targeting your server specifically.



Answer 2:

This is for decoding back into binary. (Note: the list of backslash escapes could be incomplete. I just typed in the usual suspects)

#include <stdio.h>
#include <string.h>

int main(void)
{
char buff[2000] ;
size_t len, pos;
int ch;
unsigned val;

while (fgets(buff, sizeof buff, stdin)) {
        len = strlen(buff);
        while (len && buff[len-1] == '\n') buff[--len] = 0;
        for(pos=0; pos < len; pos++) {
                ch = buff[pos];
                if (ch != '\\') { putc(ch, stdout); continue; }
                switch ( ch = buff[++pos] ) {
                case '\\':
                case '\'':
                case '"':  putc(ch,stdout); break;
                case 't':  putc('\t',stdout); break;
                case 'n':  putc('\n',stdout); break;
                case 'r':  putc('\r',stdout); break;
                case 'a':  putc('\a',stdout); break;
                case 'v':  putc('\v',stdout); break;
                case 'b':  putc('\b',stdout); break;
                case ' ':  putc(' ',stdout); break;
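                case 'x':
                        /* Stop if fewer than two characters follow the 'x'. */
                        if (len - pos < 3) { pos = len; break; }
                        /* val starts at 0, so a non-hex character counts as 0. */
                        val = 0;
                        ch = buff[++pos];
                        if (ch >= 'a' && ch <= 'f') val = 10 + (ch -'a');
                        else if (ch >= 'A' && ch <= 'F') val = 10 + (ch -'A');
                        else if (ch >= '0' && ch <= '9') val = (ch -'0');
                        val <<= 4;
                        ch = buff[++pos];
                        if (ch >= 'a' && ch <= 'f') val += 10 + (ch -'a');
                        else if (ch >= 'A' && ch <= 'F') val += 10 + (ch -'A');
                        else if (ch >= '0' && ch <= '9') val += (ch -'0');
                        putc(val, stdout);
                        break;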
                default:
                        putc(ch, stdout);
                        break;
                        }
                }
        }

return 0;
}

The bad news is: the supplied strings don't seem to yield valid x86 code. They may have been encrypted, with a decrypt/bootstrap routine at the end, near the overflow part. Disclaimer: I am not an assembly expert.



Answer 3:

Let's have a look at the first part:

\xb3\xe1\xdd=H\t\xd5\xd2\xf0ml\xf1\x10

The escape codes of the form \xb3 are hexadecimal codes for 8-bit integers. In this case, b3 is 11 × 16 + 3 = 179.

The escape code \t is the tab character.

The "H\t" is just an H (= 72) followed by a tab character (= 9). It is not Ht and is not related to HTML.
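As a rough illustration of the values above, here is a minimal C sketch that turns such a fragment into raw bytes. The names hex_digit_value and decode_fragment are made up for this example, and it assumes that only \xNN and \t need handling, which covers the fragment shown but not every escape in the logs.

```c
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return 10 + (c - 'a');
    if (c >= 'A' && c <= 'F') return 10 + (c - 'A');
    return -1;
}

/*
 * Decode a log fragment into raw bytes. Only \xNN and \t are treated as
 * escapes (an assumption for this sketch); every other character,
 * including a malformed escape, is copied through unchanged.
 * Returns the number of bytes written to out (at most cap).
 */
size_t decode_fragment(const char *in, unsigned char *out, size_t cap)
{
    size_t n = 0;

    while (*in != '\0' && n < cap) {
        if (in[0] == '\\' && in[1] == 'x') {
            int hi = hex_digit_value(in[2]);
            /* Only look at the second digit if the first one was valid. */
            int lo = (hi >= 0) ? hex_digit_value(in[3]) : -1;
            if (lo >= 0) {
                out[n++] = (unsigned char)(hi * 16 + lo);
                in += 4;
                continue;
            }
        }
        if (in[0] == '\\' && in[1] == 't') {
            out[n++] = '\t';
            in += 2;
            continue;
        }
        out[n++] = (unsigned char)*in++;
    }
    return n;
}
```

Called on the fragment above, this produces 13 bytes: the first is 179, and the bytes at indexes 4 and 5 are 72 (H) and 9 (tab).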

I suspect that it is someone sending data to your webserver in an attempt to exploit a vulnerability. You should make sure that your webserver is fully updated to prevent the exploit from working.



Answer 4:

My first guess is that \x starts an escape sequence using two hex characters. So try replacing \xAB with the character corresponding to the hex AB.

\t is probably a tab, and \' an escaped '.



Answer 5:

Trying to reverse engineer binary data is a very painful process that is nearly impossible unless you know what the contents should be in the first place. This is because such files often contain headers that tell the program that reads the logs how to decode them. For example: the exact bit where the data starts, what each bit represents, whether the data is float, double, or int, and what endianness the data is stored in.
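To make the endianness point concrete, here is a small C sketch that reads the same four bytes as a 32-bit integer in both byte orders. The function names read_u32_le and read_u32_be are invented for this example, and nothing in the logs confirms that the data contains such a field; it only shows how much the result depends on knowing the format.

```c
#include <stdint.h>

/* Interpret 4 bytes as an unsigned 32-bit integer, least significant byte first. */
uint32_t read_u32_le(const unsigned char b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Interpret the same 4 bytes with the most significant byte first. */
uint32_t read_u32_be(const unsigned char b[4])
{
    return ((uint32_t)b[0] << 24)
         | ((uint32_t)b[1] << 16)
         | ((uint32_t)b[2] << 8)
         | (uint32_t)b[3];
}
```

For the first four bytes of the first sample (b3 e1 dd 3d, where 3d is the "=" character), the little-endian reading is 0x3DDDE1B3 and the big-endian reading is 0xB3E1DD3D.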

You should probably spend your time working out what program wrote the log and use it to convert the log back to ASCII, or hunt through its documentation for the format of the binary logs.