Pony ORM does the nice trick of converting a generator expression into SQL. Example:
>>> select(p for p in Person if p.name.startswith('Paul')).order_by(Person.name)[:2]
SELECT "p"."id", "p"."name", "p"."age"
FROM "Person" "p"
WHERE "p"."name" LIKE "Paul%"
ORDER BY "p"."name"
LIMIT 2
[Person[3], Person[1]]
>>>
I know Python has wonderful introspection and metaprogramming built in, but how is this library able to translate the generator expression without preprocessing? It looks like magic.
[update]
Blender wrote:
Here is the file that you're after. It seems to reconstruct the generator using some introspection wizardry. I'm not sure if it supports 100% of Python's syntax, but this is pretty cool. – Blender
I was thinking they were exploiting some feature of the generator expression protocol, but looking at this file, and seeing the ast module involved... No, they are not inspecting the program source on the fly, are they? Mind-blowing...
@BrenBarn: If I try to call the generator outside the select function call, the result is:
>>> x = (p for p in Person if p.age > 20)
>>> x.next()
Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
File "<interactive input>", line 1, in <genexpr>
File "C:\Python27\lib\site-packages\pony\orm\core.py", line 1822, in next
% self.entity.__name__)
File "C:\Python27\lib\site-packages\pony\utils.py", line 92, in throw
raise exc
TypeError: Use select(...) function or Person.select(...) method for iteration
>>>
Seems like they are doing more arcane incantations, like inspecting the select function call and processing the Python abstract syntax grammar tree on the fly.
I still would like to see someone explain it; the source is way beyond my wizardry level.
Pony ORM author is here.
Pony translates a Python generator into an SQL query in three steps:
The most complex part is the second step, where Pony must understand the "meaning" of Python expressions. Seems you are most interested in the first step, so let me explain how decompiling works.
Let's consider this query:
Which will be translated into the following SQL:
And below is the result of this query which will be printed out:
The select() function accepts a Python generator as an argument, and then analyzes its bytecode. We can get the bytecode instructions of this generator using the standard Python dis module:
Pony ORM has the function decompile() within the module pony.orm.decompiling, which can restore an AST from the bytecode:
Here, we can see the textual representation of the AST nodes:
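As a rough illustration of the bytecode-inspection step, the sketch below lists the instructions of a generator expression with dis. It is written for Python 3 (the sessions in this thread are Python 2.7), the customers list is a hypothetical stand-in rather than a Pony entity, and the exact opcode names and offsets vary between Python versions.

```python
import dis

# Hypothetical stand-in for an entity; only the bytecode matters here,
# the generator is never actually iterated.
customers = []

gen = (c for c in customers if c.country == 'USA')

# gi_code is the code object behind the generator expression.
instructions = list(dis.get_instructions(gen.gi_code))
opnames = [instr.opname for instr in instructions]

# Print one line per instruction: its name and a readable argument.
for instr in instructions:
    print(instr.opname, instr.argrepr)
```

The attribute access and the comparison show up as LOAD_ATTR and COMPARE_OP instructions, which is the raw material a decompiler has to work from.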
Let's now see how the decompile() function works.
The decompile() function creates a Decompiler object, which implements the Visitor pattern. The decompiler instance gets bytecode instructions one by one. For each instruction the decompiler object calls its own method. The name of this method is equal to the name of the current bytecode instruction.
When Python calculates an expression, it uses a stack, which stores the intermediate results of the calculation. The decompiler object also has its own stack, but this stack stores not the result of the expression calculation, but the AST node for the expression.
When the decompiler method for the next bytecode instruction is called, it takes AST nodes from the stack, combines them into a new AST node, and then puts this node on the top of the stack.
For example, let's see how the subexpression c.country == 'USA' is calculated. The corresponding bytecode fragment is:
So, the decompiler object does the following:
1. Calls decompiler.LOAD_FAST('c'). This method puts the Name('c') node on the top of the decompiler stack.
2. Calls decompiler.LOAD_ATTR('country'). This method takes the Name('c') node from the stack, creates the Getattr(Name('c'), 'country') node and puts it on the top of the stack.
3. Calls decompiler.LOAD_CONST('USA'). This method puts the Const('USA') node on the top of the stack.
4. Calls decompiler.COMPARE_OP('=='). This method takes two nodes (Getattr and Const) from the stack, and then puts Compare(Getattr(Name('c'), 'country'), [('==', Const('USA'))]) on the top of the stack.
After all bytecode instructions are processed, the decompiler stack contains a single AST node which corresponds to the whole generator expression.
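The stack mechanics described above can be sketched with a toy class. This is not Pony's Decompiler: ToyDecompiler, its feed() method, the tuple-based node shapes and the hand-written instruction list are all assumptions made for illustration, and only the four instructions from the example are handled.

```python
class ToyDecompiler:
    """Toy stack-based decompiler: one method per bytecode instruction name."""

    def __init__(self):
        self.stack = []  # holds AST nodes, not runtime values

    def feed(self, instructions):
        for opname, arg in instructions:
            # Dispatch on the instruction name: LOAD_FAST -> self.LOAD_FAST, etc.
            method = getattr(self, opname)
            node = method(arg)
            self.stack.append(node)
        return self.stack

    def LOAD_FAST(self, varname):
        return ('Name', varname)

    def LOAD_ATTR(self, attrname):
        obj = self.stack.pop()
        return ('Getattr', obj, attrname)

    def LOAD_CONST(self, value):
        return ('Const', value)

    def COMPARE_OP(self, op):
        # The right operand was pushed last, so it is popped first.
        right = self.stack.pop()
        left = self.stack.pop()
        return ('Compare', left, [(op, right)])


# Hand-written instruction list for c.country == 'USA' (not real dis output).
fragment = [
    ('LOAD_FAST', 'c'),
    ('LOAD_ATTR', 'country'),
    ('LOAD_CONST', 'USA'),
    ('COMPARE_OP', '=='),
]

decompiler = ToyDecompiler()
result = decompiler.feed(fragment)
print(result)
```

After the four calls the stack holds exactly one node, the Compare node wrapping the Getattr and Const nodes, mirroring the steps listed above.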
Since Pony ORM needs to decompile generators and lambdas only, this is not that complex, because the instruction flow for a generator is relatively straightforward - it is just a bunch of nested loops.
Currently Pony ORM covers the whole generator instruction set except for two things:
a if b else c
a < b < c
If Pony encounters such an expression it raises the NotImplementedError exception. But even in this case you can make it work by passing the generator expression as a string. When you pass a generator as a string Pony doesn't use the decompiler module. Instead it gets the AST using the standard Python compiler.parse function.
Hope this answers your question.
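A note on the string-based route: the compiler module exists only in Python 2 and was removed in Python 3. As an analogous sketch (an assumption about how one might do the same thing today, not a description of Pony's code), the standard ast module can parse a generator expression given as a string, including the two constructs listed above. The names a, b, c, x, xs, lo and hi are placeholders and are never evaluated.

```python
import ast

# A generator expression as source text, using both constructs mentioned above.
source = "(a if b else c for x in xs if lo < x < hi)"

# mode='eval' parses a single expression and returns an ast.Expression node.
tree = ast.parse(source, mode='eval')
genexp = tree.body

print(type(genexp).__name__)                   # node type of the whole expression
print(type(genexp.elt).__name__)               # node type of "a if b else c"
print(type(genexp.generators[0].ifs[0]).__name__)  # node type of "lo < x < hi"
```

Because the parser works on source text rather than bytecode, no control-flow reconstruction is needed, which is why this path sidesteps the decompiler's limitations.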