Bulk load data into Titan DB from Node.js

Posted 2019-09-09 20:48

Question:

My current setup is as follows:

  1. I have a RabbitMQ queue that delivers the details of each order placed.
  2. On the other side I have my Titan DB (Cassandra storage backend, Elasticsearch index backend, and Gremlin Server).
  3. I also have a Node.js application that can interact with Gremlin Server through the HTTP API using https://www.npmjs.com/package/gremlin . I am able to query my graph database from there.

What I am trying to do now is load the data from RabbitMQ into Titan DB.

What I have managed so far is to load data from a Node.js file using the gremlin node module:

    var createClient = require('gremlin').createClient;
    // ES module alternative: import { createClient } from 'gremlin';

    const client = createClient();

    // Add a single "product" vertex inside an explicit transaction.
    client.execute('tx=graph.newTransaction();tx.addVertex(T.label,"product","id",991);tx.commit()', {}, function (err, results) {
      if (err) {
        return console.error(err);
      }
      console.log(results);
    });

How should I proceed so that I can consume the existing RabbitMQ queue of orders and push them into Titan DB?

Due to some constraints I cannot use Java.

Answer 1:

You're most likely looking for something like node-amqp, which is a Node.js client for RabbitMQ. What you want to do is:

  1. Establish a connection to Gremlin Server
  2. Establish a connection to RabbitMQ
  3. Listen to a RabbitMQ queue for messages
  4. Send these messages to Gremlin, creating graph elements
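
The four steps above could be wired together roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the node-amqp package (`amqp` on npm) with its `createConnection` / `queue` / `subscribe` calls, a RabbitMQ broker on localhost, a hypothetical queue named `orders` that already exists, and messages published as JSON objects with an `id` field. The vertex label and property mirror the snippet in the question.

```javascript
// Constant script text; the order value travels separately as a bound parameter.
const ADD_PRODUCT =
  'tx = graph.newTransaction();' +
  'tx.addVertex(T.label, "product", "id", productId);' +
  'tx.commit()';

// Step 4: turn one order message into one parameterized Gremlin request.
// `execute` has the shape of client.execute(script, bindings, callback).
function makeOrderHandler(execute) {
  return function handleOrder(order, callback) {
    // Assumption: each order message carries an `id` field.
    execute(ADD_PRODUCT, { productId: order.id }, callback);
  };
}

function start() {
  // Required lazily so makeOrderHandler can be exercised without the modules.
  const amqp = require('amqp'); // node-amqp
  const createClient = require('gremlin').createClient;

  // Step 1: connect to Gremlin Server (default host and port).
  const client = createClient();
  const handleOrder = makeOrderHandler(client.execute.bind(client));

  // Step 2: connect to RabbitMQ (assumed to run on localhost).
  const connection = amqp.createConnection({ host: 'localhost' });

  connection.on('ready', function () {
    // 'orders' is a hypothetical queue name; `passive` asks the broker to
    // look up the existing queue instead of declaring a new one.
    connection.queue('orders', { passive: true }, function (queue) {
      // Step 3: listen for messages and forward each one to Gremlin Server.
      queue.subscribe(function (message) {
        handleOrder(message, function (err) {
          if (err) {
            console.error(err);
          }
        });
      });
    });
  });
}
```

Calling `start()` begins consuming. Sending one request per message keeps the sketch short, but it is the slow path; the points below about batching and back-pressure still apply.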

Things to watch for, since they will otherwise likely kill your performance:

  1. Send Gremlin queries with bound parameters
  2. Batch messages: create multiple vertices and commit them in the same transaction (that is, in the same Gremlin query, unless you are in session mode, where you call .commit() yourself). Batches of a couple of thousand elements should work.
  3. Watch out for back-pressure, and make sure you don't flood your Titan instances with more messages than they can handle.
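
The three points above can be combined in one small helper. The following is a sketch under stated assumptions rather than a tested loader: `createBatcher` and its arguments are hypothetical names, `execute` is expected to have the shape of the gremlin client's `execute(script, bindings, callback)`, and each order is assumed to be a plain object with an `id` field. The script text never changes, so Gremlin Server can reuse the compiled script, and the whole batch travels as a single bound parameter.

```javascript
// Groovy script run by Gremlin Server; `orders` is a bound list of maps.
const BATCH_SCRIPT =
  'tx = graph.newTransaction();' +
  'orders.each { o -> tx.addVertex(T.label, "product", "id", o.id) };' +
  'tx.commit()';

// Collects orders and writes them in groups of `batchSize`.
function createBatcher(execute, batchSize) {
  let pending = []; // { order, ack } pairs waiting for the next commit

  // Send everything collected so far as one request, then acknowledge it.
  function flush(callback) {
    const done = callback || function () {};
    if (pending.length === 0) {
      return done(null);
    }
    const batch = pending;
    pending = [];
    const orders = batch.map(function (p) { return p.order; });
    execute(BATCH_SCRIPT, { orders: orders }, function (err) {
      if (err) {
        return done(err); // the messages of this batch stay unacknowledged
      }
      batch.forEach(function (p) { p.ack(); }); // ack only after the commit
      done(null);
    });
  }

  // Queue one order; `ack` is called once the order has been committed.
  function add(order, ack) {
    pending.push({ order: order, ack: ack });
    if (pending.length >= batchSize) {
      flush(function (err) {
        if (err) {
          console.error(err);
        }
      });
    }
  }

  return { add: add, flush: flush };
}
```

Back-pressure then comes from the broker side: if the consumer uses manual acknowledgements with a prefetch limit (in node-amqp this should correspond to `subscribe({ ack: true, prefetchCount: n }, ...)`), RabbitMQ stops delivering once `n` messages are unacknowledged, so Titan never has more than `n` orders in flight. The prefetch limit has to be at least `batchSize`, otherwise a batch never fills up, and a final `flush()` is needed for any remainder.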

I'm not familiar with RabbitMQ but hopefully this should get you started.

Note: The Gremlin JavaScript driver interacts with Gremlin Server over a WebSocket connection, which is persistent and bidirectional. The client doesn't support the HTTP Channelizer yet (and HTTP is not the kind of connection you want to establish in this scenario anyway).