Solr - how to “group by” and “limit”?

2019-02-17 02:06发布

问题:

Say I indexed the following from my database:

======================================
| Id |  Code | Description           |
======================================
| 1  | A1    | Hello world           |
| 2  | A1    | Hello world 123       |
| 3  | A1    | World hello hi        |
| 4  | B1    | Quick fox jumped      |
| 5  | B1    | Lazy dog              |
...

Further, say the user searches for "hello", which should return records 1, 2, and 3. Is there a way to make Solr "group by" the Code field and apply a limit (say, 10 records)? I'm somewhat looking for a SQL counterpart of GROUP BY and LIMIT.

Also, when it does this "group by", I want it to choose the most relevant document and use that document's Description field as part of the return.

Of course, I could just have Solr return everything to my application and I can manipulate the results to do the GROUP BY and LIMIT. I'd rather not do this if possible.

回答1:

Have a look at field collapsing, available in Solr 4.0. Sorting groups on relevance: group.sort=score desc.



回答2:

http://XXX.XXX.XXX.XXX:8080/solr/autocomplete/select?q=displayterm:new&wt=json&indent=true&q.op=and&fl=displayterm&group=true&group.field=displayterm&rows=3&start=0

Note:

Response: start -> response start your id. rows -> how do you wat number of rows .

Exp 
 1 step 
  &start=0&rows=3
2 step 
  &start=3&rows=3
3 step
  &start=6&rows=3
etc.


{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "fl":"displayterm",
      "indent":"true",
      "start":"0",
      "q":"displayterm:new",
      "q.op":"and",
      "group.field":"displayterm",
      "group":"true",
      "wt":"json",
      "rows":"3"}},
  "grouped":{
    "displayterm":{
      "matches":231,
      "groups":[{
          "groupValue":null,
          "doclist":{"numFound":220,"start":0,"docs":[
              {
                "displayterm":"Professional News"}]
          }},
        {
          "groupValue":"general",
          "doclist":{"numFound":1,"start":0,"docs":[
              {
                "displayterm":"General News"}]
          }},
        {
          "groupValue":"delhi",
          "doclist":{"numFound":2,"start":0,"docs":[
              {
                "displayterm":"New Delhi"}]
          }}]}}}


回答3:

add the following field to your query

  • 'group':'true',
  • 'group.field':'source',
  • 'group.main':'true',
  • 'group.limit':10,