I have a graph with a couple of indices. They're two composite indices with label constraints (both are exactly the same, just on different properties/labels). One definitely seems to work but the other doesn't. I've run the following profile() to double-check:
One is called KeyOnNode: property uid and label node:
gremlin> g.V().hasLabel("node").has("uid", "xxxxxxxx").profile().cap(...)
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
TitanGraphStep([~label.eq(node), uid.eq(dammit_...                     1           1           2.565    96.84
  optimization                                                                                 1.383
  backend-query                                                         1                      0.231
SideEffectCapStep([~metrics])                                           1           1           0.083     3.16
                                            >TOTAL                     -           -           2.648        -
The above is perfectly acceptable and works well. I'm assuming the magic line is backend-query.
The other is called NameOnSuperNode: property name and label supernode:
gremlin> g.V().hasLabel("supernode").has("name", "xxxxxxxx").profile().cap(...)
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
TitanGraphStep([~label.eq(supernode), name.eq(n...                     1           1        5763.163   100.00
  optimization                                                                                 2.261
  scan                                                                                         0.000
SideEffectCapStep([~metrics])                                           1           1           0.073     0.00
                                            >TOTAL                     -           -        5763.236        -
Here the query takes an outrageous amount of time and we have a scan line. I originally wondered whether the index hadn't been committed through the management system, but alas the following seems to work just fine:
gremlin> m = graphT.openManagement();
==>com.thinkaurelius.titan.graphdb.database.management.ManagementSystem@73c1c105
gremlin> index = m.getGraphIndex("NameOnSuperNode")
==>NameOnSuperNode
gremlin> index.getFieldKeys()
==>name
gremlin> import static com.thinkaurelius.titan.graphdb.types.TypeDefinitionCategory.*
==>null
gremlin> sv = m.getSchemaVertex(index)
==>NameOnSuperNode
gremlin> rel = sv.getRelated(INDEX_SCHEMA_CONSTRAINT, Direction.OUT)
==>com.thinkaurelius.titan.graphdb.types.SchemaSource$Entry@26b2b8e2
gremlin> sse = rel.iterator().next()
==>com.thinkaurelius.titan.graphdb.types.SchemaSource$Entry@2d39a135
gremlin> sse.getSchemaType()
==>supernode
I can't just reset the DB at this point. Any help pinpointing what the issue could be would be amazing; I'm hitting a wall here. Is this a sign that I need to reindex?
INFO: Titan DB 1.1 (TP 3.1.1)
Cheers
UPDATE: I've found that the index in question is not in a REGISTERED state:
gremlin> :> m = graphT.openManagement(); index = m.getGraphIndex("NameOnSuperNode"); pkey = index.getFieldKeys()[0]; index.getIndexStatus(pkey)
==>INSTALLED
How do I get it to register? I've tried m.updateIndex(index, SchemaAction.REGISTER_INDEX).get(); m.commit(); graphT.tx().commit(); but it doesn't seem to do anything.
UPDATE 2: I've tried registering the index in order to reindex with the following:
gremlin> m = graphT.openManagement();
index = m.getGraphIndex("NameOnSuperNode") ;
import static com.thinkaurelius.titan.graphdb.types.TypeDefinitionCategory.*;
import com.thinkaurelius.titan.graphdb.database.management.ManagementSystem;
m.updateIndex(index, SchemaAction.REGISTER_INDEX).get();
ManagementSystem.awaitGraphIndexStatus(graphT, "NameOnSuperNode").status(SchemaStatus.REGISTERED).timeout(20, java.time.temporal.ChronoUnit.MINUTES).call();
m.commit();
graphT.tx().commit()
But this isn't working. I still have my index in the INSTALLED status and I'm still getting a timeout. I've checked that there were no open transactions. Anyone have an idea? FYI the graph is running on a single server and has ~100K vertices and ~130K edges.
So there are a few things that can be happening here:
1. If both of those indices you describe were not created in the same transaction (and the problem index in question was created after the name propertyKey was already defined), then you should issue a reindex, as per the Titan docs; see the first sketch after this list.
2. The index may be timing out in the process of moving from INSTALLED to REGISTERED, in which case you want to use mgmt.awaitGraphIndexStatus(). You can even specify the amount of time you are willing to wait here.
3. Make sure there are no open transactions on your graph, or the index status will indeed not change, as described here; see the second sketch after this list.
4. This is clearly not the case for you, but there is a bug in Titan (fixed in JanusGraph via this PR) such that if you create an index against a newly created propertyKey as well as a previously used propertyKey, the index will get stuck in the REGISTERED state.
5. Indexes will not move to REGISTERED unless every Titan/JanusGraph node in the cluster acknowledges the index creation. If your indexes are getting stuck in the INSTALLED state, there is a chance that the other nodes in the system are not acknowledging the index's existence. This can be due to issues with another server in the cluster, backfill in the messaging queue Titan/JanusGraph uses to talk to each other, or, most unexpectedly, the existence of phantom instances. These can occur any time your server is killed through a non-normal JVM shutdown process, i.e. kill -9 on a server stuck in stop-the-world garbage collection. If you suspect backfill is the problem, the comments in this class offer good insight into customizable configuration options that may help fix it. To check for the existence of phantom instances, use this function, and then this function to kill the phantom instances; see the third sketch after this list.
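For point 1, a minimal sketch of issuing the reindex from the console, reusing the graphT handle from the question (note that REINDEX only succeeds once the index has reached REGISTERED):

m = graphT.openManagement()
i = m.getGraphIndex("NameOnSuperNode")
m.updateIndex(i, SchemaAction.REINDEX).get()   // blocks until the reindex job completes
m.commit()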
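For point 3, a sketch of hunting down lingering transactions, assuming graphT is a StandardTitanGraph so that getOpenTransactions() is available:

graphT.getOpenTransactions().size()                    // should be 0 before retrying the status change
graphT.getOpenTransactions().each { it.rollback() }    // roll back any stragglers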
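For point 5, a sketch of checking for phantom instances, assuming the functions linked above are the management API's getOpenInstances() / forceCloseInstance() pair:

m = graphT.openManagement()
m.getOpenInstances().each { println it }           // the live instance is suffixed with "(current)"
// m.forceCloseInstance("<phantom-instance-id>")   // evict each id that should not be there
m.commit()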
I think you may be missing configuration for your graph. If your backend is Cassandra, you must configure it together with Elasticsearch. If your backend is HBase, you must configure it with caching. Read more at the link below: https://docs.janusgraph.org/0.2.0/configuration.html
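A hedged sketch of what that pairing could look like at graph construction time, using standard Titan configuration keys (the backend name and hostnames are placeholders):

graph = TitanFactory.build().
    set("storage.backend", "cassandrathrift").
    set("storage.hostname", "127.0.0.1").
    set("index.search.backend", "elasticsearch").
    set("index.search.hostname", "127.0.0.1").
    open()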