I was wondering how to modify byte code, then recompile that code so I can use it in python as a function? I've been trying:
from dis import dis

a = """
def fact():
    a = 8
    a = 0
"""
c = compile(a, '<string>', 'exec')
w = c.co_consts[0].co_code  # raw bytecode string of fact()
dis(w)
which disassembles to:
0 LOAD_CONST 1 (1)
3 STORE_FAST 1 (1)
6 LOAD_CONST 2 (2)
9 STORE_FAST 1 (1)
12 LOAD_CONST 0 (0)
15 RETURN_VALUE
Supposing I want to get rid of the instructions at offsets 0 and 3, I call:
x = c.co_consts[0].co_code[6:16]
dis(x)
which results in :
0 LOAD_CONST 2 (2)
3 STORE_FAST 1 (1)
6 LOAD_CONST 0 (0)
9 RETURN_VALUE
My problem is what to do with x. If I try exec x, I get 'expected string without null bytes', and I get the same for exec w. Trying to compile x results in: compile() expected string without null bytes.
I'm not sure what the best way to proceed is. Maybe I need to create some kind of code object, but I'm not sure how. I assume it must be possible, given byteplay, Python assemblers, et al.
I'm using Python 2.7.10, but I'd like it to be future compatible (e.g. Python 3) if possible.
Update: For sundry reasons I have started writing a Cross-Python-version assembler. See https://github.com/rocky/python-xasm It is still in very early beta.
As far as I know there is no currently-maintained Python assembler. PEAK's BytecodeAssembler was developed for Python 2.6, and later modified to support early Python 2.7.
Judging from its documentation, it is pretty cool. But it relies on other PEAK libraries, which might be problematic.
I'll go through the whole example to give you a feel for what you'd have to do. It is not pretty, but then you should expect that.
Basically, after modifying the bytecode you need to create a new types.CodeType object. You need a new one because many attributes of a code object are read-only, and for good reason: the interpreter may have some of their values cached. After creating the code object, you can use it in functions, which take a code object, or pass it to exec or eval. Or you can write it to a bytecode file. Alas, the code object format has changed between Python 2 and Python 3, and so have the optimizations and the bytecodes. In fact, in Python 3.6 they will be word codes, not bytecodes.
So here is what you'd have to do for your example:
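A minimal sketch of that step, assuming Python 3.8 or later, where code objects have a replace() method. It is not the Python 2.7 listing: instead of slicing instructions out, it overwrites them with NOPs so that offsets and the line table stay valid, and it adds a return statement to the function so the effect is observable. The name fact_patched is only illustrative.

```python
import dis
import types


def fact():
    a = 8
    a = 0
    return a


code = fact.__code__
instructions = list(dis.get_instructions(code))

# Locate the second STORE_FAST (the store for `a = 0`) and the
# instruction just before it, which pushes the constant being stored.
store_positions = [i for i, ins in enumerate(instructions)
                   if ins.opname == "STORE_FAST"]
second_store = store_positions[1]
to_blank = [instructions[second_store - 1], instructions[second_store]]

# Since Python 3.6 every instruction is a two-byte unit (opcode, argument).
# Overwrite both units with NOP so that no offsets move.
patched_bytes = bytearray(code.co_code)
for ins in to_blank:
    patched_bytes[ins.offset] = dis.opmap["NOP"]
    patched_bytes[ins.offset + 1] = 0

# Code objects are immutable: build a new one, then wrap it in a function.
new_code = code.replace(co_code=bytes(patched_bytes))
patched = types.FunctionType(new_code, fact.__globals__, "fact_patched")

print(fact(), patched())
dis.dis(patched)
```

With the second assignment blanked out, patched() returns the value of the first assignment while fact() is unchanged. Actually deleting the instructions, as the slice in the question does, would additionally require fixing up any jump targets and the line table.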
When I ran this here is what I got:
Notice that the line numbers haven't changed even though I removed a couple of lines of code. That is because I didn't update fn_code.co_lnotab.
If you now want to write a Python bytecode file from this, here is what you'd do:
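As a rough sketch using only the standard library, and assuming CPython 3.7 or later, where a bytecode file starts with a 16-byte header (magic number, flags, source mtime, source size) followed by the marshalled module code object. The helper write_pyc and the module name fact_demo are hypothetical names chosen for this illustration.

```python
import importlib.machinery
import importlib.util
import marshal
import os
import struct
import tempfile
import time

# A module-level code object to save. A patched code object would be
# written the same way, as long as it is the code of a whole module.
source = "def fact():\n    a = 8\n    return a\n"
module_code = compile(source, "fact_demo.py", "exec")


def write_pyc(code, path, mtime=None, source_size=0):
    """Write `code` as a timestamp-based .pyc file (CPython 3.7+ layout)."""
    if mtime is None:
        mtime = int(time.time())
    header = importlib.util.MAGIC_NUMBER + struct.pack(
        "<III", 0, mtime & 0xFFFFFFFF, source_size & 0xFFFFFFFF)
    with open(path, "wb") as f:
        f.write(header)
        f.write(marshal.dumps(code))


pyc_path = os.path.join(tempfile.mkdtemp(), "fact_demo.pyc")
write_pyc(module_code, pyc_path)

# Load the file back without any source, to confirm it is usable.
loader = importlib.machinery.SourcelessFileLoader("fact_demo", pyc_path)
spec = importlib.util.spec_from_file_location("fact_demo", pyc_path,
                                              loader=loader)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
print(module.fact())
```

The magic number ties the file to the interpreter version that wrote it, so a file produced this way can only be loaded by a matching Python.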
To simplify writing the boilerplate bytecode above, I've added a routine to xdis called write_python_file().
Now to check the results:
An alternate approach for optimization is to optimize at the Abstract Syntax Tree level (AST). I don't know how you'd generate a bytecode file from an AST. So I suppose you write this back out as Python source, if that is possible.
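For what it is worth, the built-in compile() accepts an AST as well as source text, and Python 3.9 added ast.unparse() for turning a tree back into source. The sketch below assumes Python 3.9 or later; the transformer DropDeadConstantStores is a hypothetical example that removes a constant assignment when the very next statement assigns a constant to the same name.

```python
import ast

source = """
def fact():
    a = 8
    a = 0
    return a
"""


def _is_const_store(stmt):
    """True for statements of the form `name = <constant>`."""
    return (isinstance(stmt, ast.Assign)
            and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)
            and isinstance(stmt.value, ast.Constant))


class DropDeadConstantStores(ast.NodeTransformer):
    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        kept = []
        # Pair each statement with its successor (None for the last one).
        for stmt, nxt in zip(node.body, node.body[1:] + [None]):
            if (_is_const_store(stmt) and _is_const_store(nxt)
                    and stmt.targets[0].id == nxt.targets[0].id):
                continue  # immediately overwritten, so the store is dead
            kept.append(stmt)
        node.body = kept
        return node


tree = DropDeadConstantStores().visit(ast.parse(source))
ast.fix_missing_locations(tree)

# Compile the tree directly to a code object and run it ...
namespace = {}
exec(compile(tree, "<ast>", "exec"), namespace)
print(namespace["fact"]())

# ... or write it back out as Python source.
new_source = ast.unparse(tree)
print(new_source)
```

The code object returned by compile() here is an ordinary module code object, so it can be marshalled into a bytecode file like any other.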
Note, however, that some kinds of optimization, like tail-recursion elimination, might leave bytecode in a form that can't be transformed back to source code in a truly faithful way. See my pycon2018 columbia lightning talk for a video I made which eliminates tail recursion in bytecode, to get an idea of what I'm talking about here.