Can Solr or Lucene be used for searching XML?

Posted 2019-08-07 05:12

Question:

I have a database of information that is tagged using XML. The XML represents a hierarchy that I would like to account for in search and query. For example, if the data is book metadata:

<book>
    <author id="jd112">John Doe</author>
    <title>John's First Publication</title>
    <summary>This is a mundane memoir of John's life that no one else would care to read </summary>
</book>

I'll have tons of such XML documents. I would like searchers to restrict queries to specific fields. I would like to also allow searchers to do logical combinations of those.

Does Lucene/Solr provide such an ability, or should I be looking at some other technology? If Lucene it is, a pointer to how I might go about this would be helpful.

Thanks for your insights.

-Raj

Answer 1:

Yes, and field-based search is what Solr is best suited for, though you may need to reformat your documents first. See:

http://www.xml.com/pub/a/2006/08/09/solr-indexing-xml-with-lucene-andrest.html

Also look up how to configure schema.xml, which is where the searchable fields are declared.
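
As a rough illustration of the reformatting step, the sketch below flattens the `<book>` record from the question into Solr's `<add><doc><field>` XML update message using only the Python standard library. The field names (`author_id`, `author`, `title`, `summary`) are assumptions made for this example; they would have to match whatever schema.xml actually declares, and a schema with a uniqueKey would also need that field supplied.

```python
import xml.etree.ElementTree as ET

# The sample record from the question.
BOOK_XML = """<book>
    <author id="jd112">John Doe</author>
    <title>John's First Publication</title>
    <summary>This is a mundane memoir of John's life that no one else would care to read </summary>
</book>"""


def book_to_solr_add(book_xml: str) -> str:
    """Flatten one <book> record into a Solr <add><doc> update message."""
    book = ET.fromstring(book_xml)
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")

    def field(name: str, value: str) -> None:
        # Each Solr field is a <field name="..."> element holding plain text.
        el = ET.SubElement(doc, "field", name=name)
        el.text = value.strip()

    author = book.find("author")
    # Hypothetical field names; align them with the fields in schema.xml.
    field("author_id", author.get("id"))
    field("author", author.text)
    field("title", book.findtext("title"))
    field("summary", book.findtext("summary"))
    return ET.tostring(add, encoding="unicode")


solr_xml = book_to_solr_add(BOOK_XML)
print(solr_xml)
```

The resulting string is what would be posted to Solr's update handler. The hierarchy is flattened deliberately here, since a Solr document is a flat set of named fields.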



Answer 2:

You can import your XML files without converting them to the Solr XML format yourself: use the DataImportHandler and apply an XSL transformation.



Answer 3:

There are several ways of indexing XML documents.

  1. Use a search engine such as Apache Solr or Elasticsearch; both are built on Lucene for indexing.
  2. Use an XML-oriented NoSQL database such as Lux, which is also built on Lucene.

Hope this helps
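
Whichever Lucene-based option is chosen, the field restrictions and logical combinations asked about in the question can be expressed in Lucene query syntax (`field:"phrase"` joined with `AND` / `OR`). The sketch below builds such a query string and URL-encodes it for Solr's `/select` handler. The host, the core name `books`, and the field names are assumptions for illustration only, and the escaping covers just quotes and backslashes inside a quoted phrase.

```python
from urllib.parse import urlencode


def field_clause(field: str, phrase: str) -> str:
    """Restrict a quoted phrase to one field, e.g. author:"John Doe"."""
    # Escape backslashes and double quotes so the phrase stays one quoted term.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


# Author must match, and either the title or the summary must match.
query = "{} AND ({} OR {})".format(
    field_clause("author", "John Doe"),
    field_clause("title", "publication"),
    field_clause("summary", "memoir"),
)

# Hypothetical host and core name; adjust to the actual Solr installation.
url = "http://localhost:8983/solr/books/select?" + urlencode({"q": query})

print(query)
print(url)
```

The first line printed is the human-readable query; the second is the same query as it would be sent over HTTP.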



Tags: xml solr lucene