How can I do indexing XML files stored on other se

Posted 2019-02-26 09:15

Question:

All my XML files are stored on one server, and I have installed and configured Solr on a different server. How can I index those XML files into Solr? I looked at Nutch, but its main purpose is to crawl HTML pages and index them, and I don't need to crawl. All the files sit under a specific path on the other server; I just need to index them in Solr. I have installed and configured Solr 4.

If anyone has done something like this, please let me know how. Thank you.

Answer 1:

Why not mount the remote drive on your Solr server and run something like:

java -jar post.jar "Z:\home\data\delivery\textarticles.xml"

post.jar is in the exampledocs folder. You could also use it as an example application and build your own application to post those XML files from the other server.
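As a rough illustration of the "build your own application" idea, here is a minimal Python sketch that posts every XML file in a directory to Solr's XML update handler and then sends a commit. It assumes the files are already in Solr's <add><doc> update format (the same format post.jar expects) and that the update handler is reachable at <base URL>/update. The function names, the directory path and the Solr URL in the usage note are hypothetical placeholders, not part of Solr.

```python
import urllib.request
from pathlib import Path


def build_update_url(solr_base_url):
    """Return the URL of the XML update handler under the given Solr base URL."""
    return solr_base_url.rstrip("/") + "/update"


def post_xml(payload, update_url, opener=urllib.request.urlopen):
    """POST raw XML bytes to the update handler and return the HTTP status code."""
    request = urllib.request.Request(
        update_url,
        data=payload,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )
    with opener(request) as response:
        return response.status


def index_directory(directory, solr_base_url, opener=urllib.request.urlopen):
    """Post each *.xml file in `directory`, then commit once at the end.

    Assumption: every file is already in Solr's <add><doc> update format.
    The `opener` parameter exists only so the HTTP call can be swapped out.
    """
    update_url = build_update_url(solr_base_url)
    statuses = {}
    for xml_file in sorted(Path(directory).glob("*.xml")):
        statuses[xml_file.name] = post_xml(xml_file.read_bytes(), update_url, opener)
    # A single commit makes all posted documents searchable.
    post_xml(b"<commit/>", update_url, opener)
    return statuses
```

With the remote share mounted at a hypothetical path such as /mnt/xmlserver/data, calling index_directory("/mnt/xmlserver/data", "http://localhost:8983/solr") would return a mapping of file names to HTTP status codes. urlopen raises an HTTPError for 4xx and 5xx responses, so a failed post stops the run instead of being silently skipped.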



Answer 2:

Take a look at the DataImportHandler. I think you should be able to access a network file if it has the proper permissions set up.



Answer 3:

Based on your comment on Shane Alexander's answer, you will need to use the URLDataSource option of the DataImportHandler to retrieve the file via a URL. Additionally, you will need to incorporate the patch from SOLR-1490 to add authentication support.
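To give a feel for what the URLDataSource setup might look like, the Python sketch below generates a small DataImportHandler data-config.xml for a single remote file. It assumes the file is already in Solr's <add><doc> update format, which is why the entity sets useSolrAddSchema instead of listing XPath field mappings. The function name, the entity name and the example URL in the usage note are hypothetical. Authentication is left out, since that depends on the SOLR-1490 patch mentioned above.

```python
import xml.etree.ElementTree as ET


def build_data_config(file_url):
    """Return a data-config.xml string that pulls one XML file over HTTP.

    Assumption: the remote file uses Solr's <add><doc> update format, so
    useSolrAddSchema="true" is enough and no <field> mappings are listed.
    """
    data_config = ET.Element("dataConfig")
    # URLDataSource fetches the entity's `url` over HTTP.
    ET.SubElement(
        data_config,
        "dataSource",
        {"type": "URLDataSource", "encoding": "UTF-8"},
    )
    document = ET.SubElement(data_config, "document")
    ET.SubElement(
        document,
        "entity",
        {
            "name": "remoteXml",  # hypothetical entity name
            "processor": "XPathEntityProcessor",
            "url": file_url,
            "useSolrAddSchema": "true",
            "stream": "true",  # stream large files instead of loading them whole
        },
    )
    return ET.tostring(data_config, encoding="unicode")
```

Calling build_data_config("http://fileserver.example.com/data/textarticles.xml") returns the XML as a string. The result would be saved as the config file referenced by the DataImportHandler request handler in solrconfig.xml, and the import is then started with the handler's full-import command.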