A different question, i.e. Best .NET obfuscation tools/strategy, asks whether obfuscation is easy to implement using tools.
My question though is, is obfuscation effective? In a comment replying to this answer, someone said that "if you're worried about source theft ... obfuscation is almost trivial to a real cracker".
I've looked at the output from the Community Edition of Dotfuscator: and it looks obfuscated to me! I wouldn't want to maintain that!
I understand that simply 'cracking' obfuscated software might be relatively easy: because you only need to find whichever location in the software implements whatever it is you want to crack (typically the license protection), and add a jump to skip that.
If the worry is more than just cracking by an end-user or a 'pirate' though: if the worry is "source theft" i.e. if you're a software vendor, and your worry is another vendor (a potential competitor) reverse-engineering your source, which they could then use in or add to their own product ... to what extent is simple obfuscation an adequate or inadequate protection against that risk?
1st edit:
The code in question is about 20 KLOC which runs on end-user machines (a user control, not a remote service).
If obfuscation really is "almost trivial to a real cracker", I'd like some insight into why it's ineffective (and not just "how much" it's not effective).
2nd edit:
I'm not worried about someone's reversing the algorithm: more worried about their repurposing the actual implementation of the algorithm (i.e. the source code) into their own product.
Figuring that 20 KLOC is several months' work to develop, would it take more or less than this (several months) to deobfuscate it all?
Is it even necessary to deobfuscate something in order to 'steal' it: or might a sane competitor simply incorporate it wholesale into their product while still obfuscated, accept that as-is it's a maintenance nightmare, and hope that it needs little maintenance? If this scenario is a possibility then is obfuscated .Net code any more vulnerable to this than compiled machine code is?
Is the obfuscation "arms race" aimed mostly at preventing people from even 'cracking' something (e.g. finding and deleting the code fragment which implements licensing protection/enforcement), more than at preventing 'source theft'?
I've discussed why I don't think Obfuscation is an effective means of protection against cracking here:
Protect .NET Code from reverse engineering
However, your question is specifically about source theft, which is an interesting topic. In Eldad Eilam's book, "Reversing: Secrets of Reverse Engineering", the author discusses source theft as one reason behind reverse engineering in the first two chapters.
Basically, what it comes down to is that the only chance you have of being targeted for source theft is if you have some very specific, hard-to-engineer algorithm related to your domain that gives you a leg up on your competition. This is just about the only time it would be cost-effective to attempt to reverse engineer a small portion of your application.
So, unless you have some top-secret algorithm you don't want your competition to have, you don't need to worry about source theft. The cost involved with reversing any significant amount of source-code out of your application quickly exceeds the cost of re-writing it from scratch.
Even if you do have some algorithm you don't want them to have, there isn't much you can do to stop determined and skilled individuals from getting it anyway (if the application is executing on their machine).
Some common anti-reversing measures are:
However, packers can be unpacked, and obfuscation doesn't really hinder those who want to see what your application is doing. If the program is run on the user's machine then it is vulnerable.
Eventually its code must be executed as machine code, and it is normally a matter of firing up a debugger, setting a few breakpoints, monitoring the instructions being executed during the relevant action, and spending some time poring over this data.
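To make the debugger point concrete, here is a minimal sketch in Python (chosen only for brevity; the thread itself is about .NET). The function `a` with its meaningless names is a hypothetical stand-in for obfuscated code, and `trace_calls` is an illustrative helper, not part of any real tool. It uses the standard `sys.settrace` hook to record every executed line together with the local variables, which is roughly what a debugger session would show you.

```python
import sys


# Hypothetical "obfuscated" routine: the names carry no meaning,
# but the behaviour is still fully observable at run time.
def a(b, c):
    d = 0
    for e in b:
        d = (d * 31 + ord(e)) % c
    return d == 7


def trace_calls(func, *args):
    """Run func under a tracer; record (line offset, locals) for each executed line."""
    events = []
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # ignore every frame except the function under study
        if event == "line":
            offset = frame.f_lineno - code.co_firstlineno
            events.append((offset, dict(frame.f_locals)))
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)  # always remove the hook again
    return result, events


if __name__ == "__main__":
    result, events = trace_calls(a, "ab", 1000)
    for offset, local_vars in events:
        print(offset, local_vars)
    print("result:", result)
```

Even with the names stripped, the recorded values make it fairly clear that the routine computes a rolling hash and compares it to a constant. Renaming hides intent from a casual reader, not behaviour from someone watching the program run.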
You mentioned that it took you several months to write ~20kLOC for your application. It would take almost an order of magnitude longer to reverse those equivalent 20kLOC from your application into workable source if you took the bare minimum precautions.
This is why it is only cost-effective to reverse small, industry specific algorithms from your application. Anything else and it isn't worth it.
Take the following fictionalized example: Let's say I just developed a brand new competing application for iTunes that had a ton of bells and whistles. Let's say it took several 100k LOC and 2 years to develop. One key feature I have is a new way of serving up music to you based on your music-listening taste.
Apple (being the pirates they are) gets wind of this and decides they really like my music suggestion feature, so they decide to reverse it. They will then home in on only that algorithm, and the reverse engineers will eventually come up with a workable algorithm that serves up the equivalent suggestions given the same data. Then they implement said algorithm in their own application, call it "Genius" and make their next 10 trillion dollars.
That is how source theft goes down.
No one would sit there and reverse all 100k LOC to steal significant chunks of your compiled application. It would simply be too costly and too time-consuming. About 90% of the time they would be reversing boring, non-industry-secretive code that simply handled button presses or user input. Instead, they could hire developers of their own to re-write most of it from scratch for less money, and reverse only the important algorithms that are difficult to engineer and that give you an edge (i.e., the music suggestion feature).
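The cost argument can be written down as a back-of-the-envelope comparison. This is only a sketch: the `Component` type, the `cheapest_route` helper and every number in the usage example are hypothetical, and the factor of 10 is simply my reading of "almost an order of magnitude longer" from above.

```python
from dataclasses import dataclass

# Assumption: reversing takes roughly 10x the original writing effort.
REVERSE_FACTOR = 10.0


@dataclass
class Component:
    name: str
    original_months: float  # effort the original author spent writing it
    reinvent_months: float  # hypothetical effort for a competitor to build it from scratch


def cheapest_route(component, reverse_factor=REVERSE_FACTOR):
    """Return ("reverse" | "rewrite", months) for whichever route costs less."""
    reverse_months = component.original_months * reverse_factor
    if reverse_months < component.reinvent_months:
        return ("reverse", reverse_months)
    return ("rewrite", component.reinvent_months)


if __name__ == "__main__":
    # Made-up figures purely for illustration.
    parts = [
        Component("UI and input handling", original_months=20, reinvent_months=20),
        Component("suggestion algorithm", original_months=3, reinvent_months=48),
    ]
    for part in parts:
        print(part.name, cheapest_route(part))
```

With figures like these, the ordinary code is cheaper to rewrite, and only the small component that is hard to re-invent is worth reversing. Different assumptions will obviously move the break-even point.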
You are worried about people stealing the specific algorithms used in your product. Either you are Fair Isaac or you need to differentiate yourself using more than the way you x++;. If you solved some problem in code that cannot be solved by someone else puzzling over it for a few hours, you should have a PhD in computer science and/or patents to protect your invention. 99% of software products are not successful or special because of the algorithms. They are successful because their authors did the heavy lifting to put together well-known and easily understood concepts into a product that does what their customers need and sell it for cheaper than it would cost to pay others to re-do the same.
Most people tend to write what appears to be obfuscated code and that hasn't stopped the crackers so what's the difference?
EDIT:
Ok, serious time. If you really want to make something that's hard to break, look into polymorphic coding (not to be confused with polymorphism). Make code that is self-mutating, and it is a serious pain to break and will keep them guessing.
http://en.wikipedia.org/wiki/Polymorphic_code
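As a toy illustration of the idea (not a real polymorphic engine, which would rewrite machine code and usually the decryptor as well), here is a Python sketch. The class `MutatingRoutine` and the helper `xor_bytes` are hypothetical names of my own. The routine's source is kept XOR-encoded and is re-encoded under a fresh random key after every call, so the stored bytes keep changing while the behaviour stays the same.

```python
import os


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key; applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


class MutatingRoutine:
    """Keeps a routine's source encoded and re-encodes it after each call."""

    def __init__(self, source: str, entry: str):
        self._entry = entry
        self._key = os.urandom(16)
        self._blob = xor_bytes(source.encode("utf-8"), self._key)

    @property
    def stored_bytes(self) -> bytes:
        return self._blob

    def __call__(self, *args):
        # Decode only for the duration of the call.
        source = xor_bytes(self._blob, self._key).decode("utf-8")
        namespace = {}
        exec(source, namespace)
        result = namespace[self._entry](*args)
        # Mutate: pick a new key so the stored form differs next time.
        self._key = os.urandom(16)
        self._blob = xor_bytes(source.encode("utf-8"), self._key)
        return result


if __name__ == "__main__":
    routine = MutatingRoutine("def check(x):\n    return x * 2 + 1\n", "check")
    before = routine.stored_bytes
    print(routine(3))
    print(before != routine.stored_bytes)
```

Note the weakness, which fits the rest of this thread: the decoded source exists in memory while the routine runs, so anyone attached with a debugger at that moment can still read it.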
In the end, nothing is impossible to reverse engineer.
If you have IP in code which must be protected at all costs, then you should make your software's functionality available as a service, on a secured remote server.
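A minimal sketch of that split, using only the Python standard library. Everything here is an assumption for illustration: `secret_score` stands in for the protected algorithm, the `/score` path and the JSON shape are made up, and there is no authentication or TLS, which a real "secured" server would need.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def secret_score(values):
    """Stand-in for the proprietary algorithm; it only ever runs on the server."""
    return sum(v * v for v in values) % 97


class ScoreHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"score": secret_score(payload["values"])}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_server():
    """Start the service on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), ScoreHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def remote_score(port, values):
    """The client the end user gets: it ships inputs and receives a result."""
    connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        connection.request(
            "POST",
            "/score",
            body=json.dumps({"values": values}),
            headers={"Content-Type": "application/json"},
        )
        return json.loads(connection.getresponse().read())["score"]
    finally:
        connection.close()


if __name__ == "__main__":
    server = start_server()
    try:
        print(remote_score(server.server_address[1], [1, 2, 3]))
    finally:
        server.shutdown()
        server.server_close()
```

The client binary contains nothing but the call, so there is no implementation on the user's machine to decompile. A competitor can still observe inputs and outputs and try to infer the algorithm from them.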
Good obfuscation will protect you up to a point, but it's all about the amount of effort required to break it against the 'reward' of having the code. If you are talking about stopping your average business user, then a commercial obfuscator should be sufficient.
Look at it this way: the WMD editor that you typed your question into was reverse engineered by the SO team in order to fix some bugs and make some enhancements. That code was obfuscated. You are never going to stop intelligent, motivated people from hacking your code; the best you can hope for is to keep the honest people honest and make it somewhat hard to break.
Short answer is yes and no; it depends entirely on what you are trying to prevent. Section twelve of Secure Programming Cookbook has some interesting comments on this on page 653 (which is conveniently unavailable in the Google Books preview). It classifies anti-tampering into four categories: Zero day (slowing down an attacker so it takes them a long time to accomplish what they want), protection of a proprietary algorithm to prevent reverse engineering, "because I can" attacks, and I can't remember the 4th one. You have to ask "what am I trying to prevent?", and if you are really concerned about an individual getting a look at your source code then obfuscation has some value. Used on its own it's usually just an annoyance to someone attempting to mess with your application, and like any good security measure it works best when used in combination with other anti-tampering techniques.