How to store and retrieve Apache Solr fields as a

Posted 2019-07-21 23:31

Question:

I need to store and retrieve results as a multidimensional tree instead of flat "key" => "value" pairs. For example, I have products that belong to many categories, and each category has a priority value. Sample structure:

{
  name: "Sample Product"
  categories: [
  {
    category: "Category 1",
    priority: 9
  },
  {
    category: "Category 2",
    priority: 5
  }
  ...
  ]
}

Here is my data-config.xml:

<dataConfig>
  <dataSource name="ds" type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/dbname" user="user" password="pass"/>

  <document name="products">
    <entity name="product" query="SELECT id, name FROM dbname.product">

      <field name="name" column="name" />

      <entity name="categories" query="SELECT category, priority FROM dbname.category WHERE product_id='${product.id}'">
        <field name="category" column="category" />
        <field name="priority" column="priority" />
      </entity>
    </entity>
  </document>
</dataConfig>

schema.xml:

<schema name="Products" version="1.1">
  <fields>
    <field name="name" type="string" indexed="true" stored="true" multiValued="false" required="true" /> 
    <field name="category" type="string" indexed="true" stored="true" multiValued="true" /> 
    <field name="priority" type="sint" indexed="true" stored="true" multiValued="true" /> 
  </fields>

  ...
</schema>

And this is a sample query result:

{
  "responseHeader": {
    "status": 0,
    "QTime": 14
  },
  "response": {
    "numFound": 45,
    "start": 0,
    "docs": [
      {
        "name": "Product Name",
        "category": [
          "Category 1",
          "Category 2"
        ],
        "priority": [
          8,
          6
        ]
      },
      ...

What I want is something like this:

{
  "responseHeader": {
    "status": 0,
    "QTime": 14
  },
  "response": {
    "numFound": 45,
    "start": 0,
    "docs": [
      {
        "name": "Product Name",
        "categories": [
          {
            "name": "Category 1",
            "priority": 8
          },
          {
            "name": "Category 2",
            "priority": 6
          }
          ...
        ]
      },
      ...

That way, when I sort the result by priority, I will not lose the connection between each category and its priority, and I can pick the top 1, 2 or 3 categories for each product in PHP. Otherwise I have to do some custom sorting on the PHP side to pick the top categories, which I don't want. I want to do all searching and sorting on the Solr server.

I am using Apache Solr 4.5.1

Answer 1:

Solr can only maintain a 'flat' representation of the data, so what you are trying to do is not really possible. There are a number of workarounds, such as using dynamic fields and using a Solr join to link multiple data sets.



Answer 2:

Here is one quick way to achieve this: concatenate the priority and the category name into a single field.

So your data would look like:

{
  name: "Sample Product"
  categories: [
  {
    priority_category: "9_Category 1",
  },
  {
    priority_category: "5_Category 2",
  }
  ...
  ]
}

You can then sort natively in Solr on the priority_category field. If you need to output the two values separately, split the string at the PHP level, using explode or something similar.
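The client-side split described above can be sketched as follows. This is a minimal illustration in Python rather than PHP, assuming each document comes back with a multiValued priority_category field holding strings of the form "<priority>_<category>". The function name top_categories and the sample document are hypothetical.

```python
def top_categories(values, n=3):
    """Split "priority_category" strings and return the n highest-priority pairs."""
    pairs = []
    for value in values:
        # Split only at the first underscore so category names may contain "_".
        priority, category = value.split("_", 1)
        pairs.append({"name": category, "priority": int(priority)})
    # Compare priorities as integers, highest first.
    return sorted(pairs, key=lambda p: p["priority"], reverse=True)[:n]


# Hypothetical document shaped like one entry of the Solr "docs" array.
doc = {
    "name": "Sample Product",
    "priority_category": ["5_Category 2", "9_Category 1", "10_Category 3"],
}
top = top_categories(doc["priority_category"], n=2)
```

The sketch parses the priority as an integer because plain string comparison would order "10_Category 3" before "9_Category 1". If the ordering is done on the concatenated strings themselves, the priorities would need to be zero-padded to a fixed width (for example "09_Category 1").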



Answer 3:

Try Solr 4.8 with index-time block join to retrieve results like that. Note, however, that block join is meant for joining parent and child documents. Otherwise, you need to go with the string concatenation solution recommended above.
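As a rough illustration of what a block join query looks like, the sketch below builds request parameters using Solr's {!parent which=...} query parser. It assumes products were indexed as parent documents with their categories as child documents, and that a field named doc_type marks the parents. Both the field name and the helper function are hypothetical and do not come from the question's schema.

```python
def block_join_params(child_query, parent_filter="doc_type:product", rows=10):
    """Build request parameters that return parents whose children match child_query."""
    # "which" must match ALL parent documents, not only the ones of interest.
    q = '{!parent which="%s"}%s' % (parent_filter, child_query)
    return {"q": q, "rows": rows, "wt": "json"}


# Parents (products) that have a child (category) document named "Category 1".
params = block_join_params('category:"Category 1"')
```

The resulting dictionary would be sent as the query string of a request to the core's select handler. Nothing here contacts a Solr server.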



Tags: solr solr4