Execute MLCP Content Load Command as a scheduled task

Posted 2019-07-20 14:21

Question:

Is there any way to bulk load data using MLCP as a scheduled task in MarkLogic?

Answer 1:

You can't invoke mlcp from a MarkLogic scheduled task, so I recommend trying something like Apache Camel for this.

Camel has a Timer component and a Quartz component, either of which can be used for scheduling.

Here's an example Camel file with a route (commented out, but still operable) that is initiated by a Timer, writes a file to disk, and then ingests it via mlcp: https://github.com/rjrudin/ml-camel-client/blob/master/src/main/resources/META-INF/camel-routes.xml

I've had good success doing all kinds of processing and scheduling in Camel and then ingesting the content via mlcp. I think it's a good fit for your use case, since it lets you leverage what mlcp does best: getting content into MarkLogic as fast as possible.
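
To make the shape of that route concrete, here is a rough sketch of the same pattern (timer fires, a file is written to disk, mlcp ingests it). It is plain Python standing in for Camel, not the linked route itself. The function names, the `batch-N.xml` file naming, the connection values, and the assumption that `mlcp.sh` is on the PATH are all made up for illustration.

```python
import subprocess
import time
from pathlib import Path


def write_batch(staging_dir, tick):
    """Write one file to disk for this tick and return its path."""
    path = Path(staging_dir) / f"batch-{tick}.xml"
    path.write_text(f"<batch><tick>{tick}</tick></batch>", encoding="utf-8")
    return path


def ingest(path, runner=subprocess.run):
    """Hand the file to mlcp. Host, port and credentials are placeholders."""
    cmd = [
        "mlcp.sh", "import",            # assumes mlcp.sh is on the PATH
        "-host", "localhost", "-port", "8000",
        "-username", "load-user", "-password", "change-me",
        "-input_file_path", str(path),
    ]
    # With subprocess.run, check=True raises CalledProcessError on a non-zero exit.
    return runner(cmd, check=True)


def run_timer(staging_dir, period_seconds, ticks,
              runner=subprocess.run, sleep=time.sleep):
    """Fire `ticks` times, `period_seconds` apart: write a file, then ingest it."""
    for tick in range(ticks):
        ingest(write_batch(staging_dir, tick), runner=runner)
        if tick < ticks - 1:
            sleep(period_seconds)
```

Something like `run_timer("/data/staging", 300, 12)` would write and ingest a file every five minutes for an hour. Camel's Timer or Quartz components give you the same trigger with proper routing, error handling, and configuration around it.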



Answer 2:

Scheduled tasks inside MarkLogic can call external services (using HTTP), but they don't have a way to run an external command. You do have some options:

  • schedule the MLCP job externally, using cron on Linux or something along those lines;
  • restructure your load using JavaScript or XQuery; you can retrieve data from a file system, run it through some transforms, and insert it into the database using modules running in MarkLogic;
  • set up a Java app server, have your scheduled task make an HTTP request to that server, and have the Java app server call MLCP.

I think I'd start with the first option, but which one is best depends on your use case.
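
For the first option, a minimal sketch of a wrapper that cron could start might look like the following. The `-host`, `-port`, `-username`, `-password` and `-input_file_path` options are standard mlcp import options. The install path, credentials, input directory, and function names are placeholders that don't come from the question.

```python
import subprocess


def build_mlcp_command(mlcp, host, port, username, password, input_path):
    """Return the argument list for one `mlcp import` run."""
    return [
        mlcp, "import",
        "-host", host,
        "-port", str(port),
        "-username", username,
        "-password", password,
        "-input_file_path", input_path,
    ]


def run_import(cmd, runner=subprocess.run):
    """Run mlcp and return its exit status so the caller can pass it to cron."""
    # check=False: do not raise on failure, just report the exit code.
    return runner(cmd, check=False).returncode


# Placeholder values (assumptions); adjust for your environment.
EXAMPLE_COMMAND = build_mlcp_command(
    "/opt/mlcp/bin/mlcp.sh", "localhost", 8000,
    "load-user", "change-me", "/data/incoming",
)
```

A script that ends with `sys.exit(run_import(EXAMPLE_COMMAND))` could then be scheduled with a crontab entry along these lines (script and log paths are again hypothetical):

    0 2 * * * /usr/bin/python3 /opt/jobs/load_mlcp.py >> /var/log/mlcp-load.log 2>&1

Calling `mlcp.sh` directly from the crontab works just as well. The wrapper only helps if you want to add logging, locking, or file housekeeping around the load.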