let us assume that there is a big, commercial project (a.k.a Project), which uses Python under the hood to manage plugins for configuring new control surfaces which can be attached and used by Project.
There was a small information leak, some part of the Project's Python API leaked to the public information and people were able to write Python scripts which were called by the underlying Python implementation as a part of Project's plugin loading mechanism.
Further on, using inspect
module and raw __dict__
readings, people were able to find out a major part of Project's underlying Python implementation.
Is there a way to keep the Python secret codes secret?
Quick look at Python's documentation revealed a way to suppres a import of inspect
module this way:
import sys
sys.modules['inspect'] = None
Does it solve the problem completely?
There is no way to keep your application code an absolute secret.
Frankly, if a group of dedicated and determined hackers (in the good sense, not in the pejorative sense) can crack the PlayStation's code signing security model, then your app doesn't stand a chance. Once you put your app into the hands of someone outside your company, it can be reverse-engineered.
Now, if you want to put some effort into making it harder, you can compile your own embedded python executable, strip out unnecessary modules, obfuscate the compiled python bytecode and wrap it up in some malware rootkit that refuses to start your app if a debugger is running.
But you should really think about your business model. If you see the people who are passionate about your product as a threat, if you see those who are willing to put time and effort into customizing your product to personalize their experience as a danger, perhaps you need to re-think your approach to security. Assuming you're not in the DRM business, or have a similar model that involves squeezing money from reluctant consumers, consider developing an approach that involves sharing information with your users, and allowing them to collaboratively improve your product.
You cannot fully prevent reverse engineering of software - if it comes down to it, one can always analyze the assembler instructions your program consists of.
You can, however, significantly complicate the process, for example by messing with Python internals. However, before jumping to how to do it, I'd suggest you evaluate whether to do it. It's usually harder to "steal" your code (one needs to fully understand them to be able to extend them, after all) than code it oneself. A pure, unobfuscated Python plugin interface, however, can be vital in creating a whole ecosystem around your program, far outweighing the possible downsides to having someone peek in your maybe not perfectly designed coding internals.
No, this does not solve the problem. Someone could just rename the inspect module to something else and import it.
What you're trying to do is not possible. The python interpreter must be able to take your bytecode and execute it. Someone will always be able to decompile the bytecode. They will always be able to produce an AST and view the flow of the code with variable and class names.
Note that this process can also be done with compiled language code; the difference there is that you will get assembly. Some tools can infer C structure from the assembly, but I don't have enough experience with that to comment on the details.
What specific piece of information are you trying to hide? Could you keep the algorithm server side and make your software into a client that touches your web service? Keeping the code on a machine you control is the only way to really keep control over the code. You can't hand someone a locked box, the keys to the box, and prevent them from opening the box when they have to open it in order to run it. This is the same reason DRM does not work.
All that being said, it's still possible to make it hard to reverse engineer, but it will never be impossible when the client has the executable.
No there is not.
Python is particularly easy to reverse engineer, but other languages, even compiled ones, are easy enough to reverse.