I'm using bulbflow (python) with Neo4j and I'm trying to add an index only on a subset of my keys (for now, simply keys named 'name' for optional index-based lookup).
I don't love the bulbflow Models (too restrictive) and I couldn't figure out how to do selective indexing without changing code since the 'autoindex' is a global setting -- I don't see how to configure it based on the key.
Has anyone done something like this?
-Andrew
You can disable Bulbs auto-indexing by setting g.config.autoindex to False.
See https://github.com/espeed/bulbs/blob/master/bulbs/config.py#L62
>>> from bulbs.neo4jserver import Graph
>>> g = Graph()
>>> g.config.autoindex = False
>>> g.vertices.create(name="James")
In the example above, this will cause the name property not to be indexed automatically.
Setting autoindex to False will switch to using the low-level client's create_vertex() method instead of the create_indexed_vertex() method:
See https://github.com/espeed/bulbs/blob/master/bulbs/neo4jserver/client.py#L422
The create_indexed_vertex() method has a keys arg, which you can use for selective indexing:
See https://github.com/espeed/bulbs/blob/master/bulbs/neo4jserver/client.py#L424
This is the low-level client method used by Bulbs models. You generally don't need to explicitly call the low-level client methods, but if you do, you can selectively index properties by including the property name in the keys arg.
To selectively index properties in a Model, simply override get_index_keys() in your Model definition:
See https://github.com/espeed/bulbs/blob/master/bulbs/model.py#L383
By default, Bulbs models index all properties. If no keys are provided, then all properties are indexed (like in TinkerPop/Blueprints).
See the Model _create() and get_bundle() methods:
- _create(): https://github.com/espeed/bulbs/blob/master/bulbs/model.py#L583
- get_bundle(): https://github.com/espeed/bulbs/blob/master/bulbs/model.py#L363
- get_index_keys(): https://github.com/espeed/bulbs/blob/master/bulbs/model.py#L383
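The override pattern described above can be sketched without a running server. This is only an illustration: StandInModel and indexed_properties() are hypothetical names that mimic the behavior described here ("no keys" means index everything), not the real Bulbs classes. Only the get_index_keys() method name comes from Bulbs itself.

```python
class StandInModel(object):
    """Hypothetical stand-in for a Bulbs Model base class (not bulbs.model.Node)."""

    def __init__(self, **properties):
        self.properties = dict(properties)

    def get_index_keys(self):
        # Returning None means "no explicit keys", i.e. index all properties.
        return None

    def indexed_properties(self):
        # Hypothetical helper: select the properties that would be indexed.
        keys = self.get_index_keys()
        if keys is None:
            return dict(self.properties)
        return {k: v for k, v in self.properties.items() if k in keys}


class Person(StandInModel):
    def get_index_keys(self):
        # Index only "name"; "city" is still stored, just not indexed.
        return ["name"]


everything = StandInModel(name="James", city="Dallas").indexed_properties()
selective = Person(name="James", city="Dallas").indexed_properties()
print(sorted(everything))
print(sorted(selective))
```

In a real Model you would put the same get_index_keys() override on your own Node or Relationship subclass.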
To enable selective indexing for generic vertices and edges, I updated the Bulbs generic vertex/edge methods to include a _keys arg where you can supply a list of property names (keys) to index.
See https://github.com/espeed/bulbs/commit/4fe39d5a76675020286ec9aeaa8e71d58e3a432a
Now, to selectively index properties on generic vertices/edges, you can supply a list of property names to index:
>>> from bulbs.neo4jserver import Graph
>>> g = Graph()
>>> g.config.autoindex = False
>>> james = g.vertices.create(name="James", city="Dallas", _keys=["name"])
>>> julie = g.vertices.create(name="Julie", city="Dallas", _keys=["name"])
>>> g.edges.create(james, "knows", julie, timestamp=12345, someprop="somevalue", _keys=["someprop"])
In the example above, the name property will be indexed for each vertex, and someprop will be indexed for the edge. Note that city and timestamp will not be indexed because those property names were not explicitly included in the list of index keys.
If g.config.autoindex is True and _keys is None (the default), all properties will be indexed (just like before).
If g.config.autoindex is False and _keys is None, no properties will be indexed.
If _keys is explicitly set to a list of property names, only those properties will be indexed, regardless of whether g.config.autoindex is True or False.
See https://github.com/espeed/bulbs/blob/master/bulbs/neo4jserver/client.py#L422
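The three rules above can be written out as a small pure-Python function. This is a sketch of the decision logic only, not the code in client.py: keys_to_index() is a hypothetical name, and it assumes an explicit _keys list is used as given.

```python
def keys_to_index(properties, autoindex=True, _keys=None):
    """Return the property names that would be indexed under the three rules."""
    if _keys is not None:
        # Explicit keys win, whatever autoindex is set to.
        return list(_keys)
    if autoindex:
        # No explicit keys and autoindex on: index every property.
        return sorted(properties)
    # No explicit keys and autoindex off: index nothing.
    return []


props = {"name": "James", "city": "Dallas"}
print(keys_to_index(props, autoindex=True))
print(keys_to_index(props, autoindex=False))
print(keys_to_index(props, autoindex=False, _keys=["name"]))
```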
NOTE: How auto-indexing works differs somewhat depending on whether you're using Neo4j Server, Rexster, or Titan Server, and the indexing architecture of all the graph-database servers has been in flux for the past few months. All of them appear to be moving from manual indexing to auto-indexing.
For graph-database servers that did not have auto-indexing capability until recently (e.g. Neo4j Server), Bulbs enabled auto-indexing via custom Gremlin scripts that used the database's low-level manual indexing methods:
- https://github.com/espeed/bulbs/blob/master/bulbs/neo4jserver/client.py#L1008
- https://github.com/espeed/bulbs/blob/master/bulbs/neo4jserver/gremlin.groovy#L11
However, manual indexing has been deprecated in Neo4j Server, TinkerPop/Rexster, and Titan Server, so the Bulbs 0.4 indexing architecture will change accordingly. Selective indexing will still be possible by declaring your index keys upfront, like you would in an SQL create table statement.
BTW: What did you find restrictive about Models? Bulbs Models (actually the entire library) are designed to be flexible so you can modify them to do whatever you need.
See the Lightbulb example for how to customize Bulbs Models: Is there a equivalent to commit in bulbs framework for neo4j
Let me know if you have any questions.