Creating a database in OrientDB in distributed mode

Posted 2019-08-12 01:30

Question:

Our system creates OrientDB databases programmatically and uses one database per customer. (Before anyone jumps to dismiss this design: the reasons are security, the ability to move a given customer's data between datacenters or regions, and the option to relocate to on-premise.)

This works great with OrientDB in single-server mode. However, when OrientDB is set up in distributed mode (3 servers, on Amazon), the behaviour is, to put it mildly, weird. I know the docs don't say anything about this being supported, but I couldn't find anything that says it isn't, either.

Sometimes the database is created fine, but the client blocks indefinitely (in OAdaptiveLock.lock()). Sometimes the whole cluster needs to be restarted before the database can be used. And sometimes, as is the case at the time of writing, one OrientDB node shuts itself down after it appears to be syncing with the others (Address[1.2.3.4]:2434 is SHUTTING_DOWN [LifecycleService] -> Terminating forcefully... [Node]). The error message is preceded by a stacktrace (see below).

So, to my questions:

  1. Does OrientDB support creating databases online in distributed mode?
  2. If so, what could I be doing wrong?
  3. If not, are there any plans to support this in the future?

Thanks in advance!

./Anders

Stacktrace:

2016-01-28 14:00:01:395 SEVER [infogile02] error on creating cluster 'superclassesedge_infogile02' in class 'superClassesEdge':  [OHazelcastPlugin][infogile02] Error on starting distributed plugin
com.orientechnologies.orient.server.distributed.ODistributedException: com.orientechnologies.orient.server.distributed.ODistributedException: Error on creating cluster 'superclassesedge_infogile02' in class 'superClassesEdge'
    at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.configureDatabase(OHazelcastDistributedDatabase.java:241)
    at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installDatabaseFromNetwork(OHazelcastPlugin.java:1131)
    at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.requestDatabase(OHazelcastPlugin.java:971)
    at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installDatabase(OHazelcastPlugin.java:908)
    at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installNewDatabases(OHazelcastPlugin.java:1468)
    at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.startup(OHazelcastPlugin.java:185)
    at com.orientechnologies.orient.server.OServer.registerPlugins(OServer.java:979)
    at com.orientechnologies.orient.server.OServer.activate(OServer.java:346)
    at com.orientechnologies.orient.server.OServerMain.main(OServerMain.java:41)
Caused by: com.orientechnologies.orient.server.distributed.ODistributedException: Error on creating cluster 'superclassesedge_infogile02' in class 'superClassesEdge'
    at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installLocalClusterPerClass(OHazelcastPlugin.java:1631)
    at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installDbClustersForLocalNode(OHazelcastPlugin.java:1300)
    at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin$2.call(OHazelcastPlugin.java:1134)
    at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin$2.call(OHazelcastPlugin.java:1131)
    at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.configureDatabase(OHazelcastDistributedDatabase.java:239)
    ... 8 more
Caused by: com.orientechnologies.orient.core.exception.ODatabaseException: Error on saving record #0:1
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeSaveRecord(ODatabaseDocumentTx.java:2044)
    at com.orientechnologies.orient.core.tx.OTransactionNoTx.saveRecord(OTransactionNoTx.java:159)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.save(ODatabaseDocumentTx.java:2568)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.save(ODatabaseDocumentTx.java:121)
    at com.orientechnologies.orient.core.record.impl.ODocument.save(ODocument.java:1768)
    at com.orientechnologies.orient.core.record.impl.ODocument.save(ODocument.java:1764)
    at com.orientechnologies.orient.core.metadata.schema.OSchemaShared$1.call(OSchemaShared.java:1213)
    at com.orientechnologies.orient.core.db.OScenarioThreadLocal.executeAsDistributed(OScenarioThreadLocal.java:71)
    at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.saveInternal(OSchemaShared.java:1208)
    at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.releaseSchemaWriteLock(OSchemaShared.java:642)
    at com.orientechnologies.orient.core.metadata.schema.OClassImpl.releaseSchemaWriteLock(OClassImpl.java:1824)
    at com.orientechnologies.orient.core.metadata.schema.OClassImpl.releaseSchemaWriteLock(OClassImpl.java:1819)
    at com.orientechnologies.orient.core.metadata.schema.OClassImpl.addCluster(OClassImpl.java:1088)
    at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installLocalClusterPerClass(OHazelcastPlugin.java:1624)
    ... 12 more
Caused by: java.lang.NullPointerException
    at com.orientechnologies.orient.core.storage.impl.local.paginated.atomicoperations.OAtomicOperationsManager.endAtomicOperation(OAtomicOperationsManager.java:148)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.doUpdateRecord(OAbstractPaginatedStorage.java:2046)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.updateRecord(OAbstractPaginatedStorage.java:971)
    at com.orientechnologies.orient.server.distributed.ODistributedStorage.updateRecord(ODistributedStorage.java:708)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeSaveRecord(ODatabaseDocumentTx.java:2005)
    ... 25 more

2016-01-28 14:00:01:398 INFO  [10.0.9.105]:2434 [orientdb] [3.5.3] Address[10.0.9.105]:2434 is SHUTTING_DOWN [LifecycleService]
2016-01-28 14:00:01:398 WARNI [10.0.9.105]:2434 [orientdb] [3.5.3] Terminating forcefully... [Node]
2016-01-28 14:00:01:399 INFO  [10.0.9.105]:2434 [orientdb] [3.5.3] Shutting down connection manager... [Node]

Answer 1:

Severe case of tl;dr on my part. The docs on distributed architecture in OrientDB clearly state "creation of a database on multiple nodes could cause synchronization problems when clusters are automatically created. Please create the databases before to run in distributed mode" but I didn't read that far.

According to the docs, the suggested solution seems to be "Partitioned Graphs" (described here: http://orientdb.com/docs/2.0/orientdb.wiki/Partitioned-Graphs.html). That solution doesn't really address all our concerns, but is in theory good enough.

In practice, however, that doesn't work for us: it requires a significant rewrite, since transactions need to be managed differently. More on that in another topic...
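
For anyone who still has to create databases at runtime, a bounded wait at least keeps the calling thread from hanging forever, as happened with the client in the question. Below is a minimal sketch using only java.util.concurrent. The names `BoundedCreate`, `runWithTimeout` and `createDatabase` are hypothetical, the 30-second timeout is arbitrary, and the body of `createDatabase` is a placeholder for whatever OrientDB call your application actually makes. This does not fix the synchronization problem itself, and if the underlying call ignores interrupts the worker thread may stay blocked; only the caller is released.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedCreate {

    /**
     * Runs the task on a daemon worker thread and waits at most timeoutMillis.
     * Returns true if the task finished in time, false if the wait timed out.
     * An exception thrown by the task is rethrown as ExecutionException.
     */
    public static boolean runWithTimeout(Callable<Void> task, long timeoutMillis)
            throws ExecutionException, InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "db-create");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<Void> future = executor.submit(task);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Best effort: this only stops the task if it responds to interrupts.
            future.cancel(true);
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical placeholder: replace the body with the real creation call.
        Callable<Void> createDatabase = () -> {
            Thread.sleep(100);
            return null;
        };
        boolean ok = runWithTimeout(createDatabase, 30_000);
        System.out.println(ok ? "created" : "timed out, check the cluster");
    }
}
```

On a timeout the database may still end up half-created on some nodes, so the caller should treat `false` as "state unknown" and check the cluster rather than blindly retrying.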