Solr / Lucene: Get all field names sorted by number of occurrences

Published 2019-06-11 00:24

Question:

I want to get the list of all fields (i.e. field names) sorted by the number of times they occur in the Solr index, i.e.: most frequently occurring field, second most frequently occurring field and so on.

Alternatively, getting all fields in the index and the number of times they occur would also be sufficient.

How do I accomplish this, either with a single Solr query or through the Solr/Lucene Java API?

The set of fields is not fixed and ranges in the hundreds. Almost all fields are dynamic, except for id and perhaps a couple more.

Answer 1:

As stated in "Solr: Retrieve field names from a solr index?", you can do this with the LukeRequestHandler.

To do so, enable the request handler in your solrconfig.xml:

<requestHandler name="/admin/luke" class="org.apache.solr.handler.admin.LukeRequestHandler" />

and call it

http://solr:8983/solr/admin/luke?numTerms=0
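If the request is issued from code, the URL can be assembled by a small helper. The following is only a sketch: the class and method names are made up for illustration, and the optional core segment assumes a multi-core setup in which the handler is reached under /solr/<core>/admin/luke. The wt=json parameter asks Solr for a JSON response and can be dropped.

```java
// Hypothetical helper, not part of Solr or SolrJ.
class LukeUrl {

    // Builds the Luke handler URL. Pass null or "" as core for a single-core setup.
    static String build(String baseUrl, String core, int numTerms) {
        // Drop a trailing slash so the path segments join cleanly.
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        String corePart = (core == null || core.isEmpty()) ? "" : "/" + core;
        return base + corePart + "/admin/luke?numTerms=" + numTerms + "&wt=json";
    }

    public static void main(String[] args) {
        // Host name taken from the answer above; "mycore" is a placeholder core name.
        System.out.println(build("http://solr:8983/solr", null, 0));
        System.out.println(build("http://solr:8983/solr/", "mycore", 0));
    }
}
```

The helper does no URL encoding, so it is only suitable for plain core names.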

The Luke handler does not sort the fields for you, so any ordering has to be done on the client side. If you are in a Java environment, I would suggest using SolrJ.

Fetch fields using SolrJ

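import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.LukeRequest;
import org.apache.solr.client.solrj.response.LukeResponse;
import org.apache.solr.client.solrj.response.LukeResponse.FieldInfo;
import org.junit.Test;

@Test
public void lukeRequest() throws SolrServerException, IOException {
  // SolrJ 4.x API; from SolrJ 5 on, use SolrClient / HttpSolrClient instead.
  SolrServer solrServer = new HttpSolrServer("http://solr:8983/solr");

  LukeRequest lukeRequest = new LukeRequest();
  lukeRequest.setNumTerms(1);
  LukeResponse lukeResponse = lukeRequest.process(solrServer);

  // Sort by document count, most frequent field first.
  List<FieldInfo> sorted = new ArrayList<FieldInfo>(lukeResponse.getFieldInfo().values());
  Collections.sort(sorted, new FieldInfoComparator());
  for (FieldInfo infoEntry : sorted) {
    System.out.println("name: " + infoEntry.getName());
    System.out.println("docs: " + infoEntry.getDocs());
  }
}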

The comparator used in the example

import java.util.Comparator;
import org.apache.solr.client.solrj.response.LukeResponse.FieldInfo;

// Orders fields by document count (descending); ties are broken by field name.
public class FieldInfoComparator implements Comparator<FieldInfo> {
  @Override
  public int compare(FieldInfo fieldInfo1, FieldInfo fieldInfo2) {
    int byDocs = Integer.compare(fieldInfo2.getDocs(), fieldInfo1.getDocs());
    if (byDocs != 0) {
      return byDocs;
    }
    return fieldInfo1.getName().compareTo(fieldInfo2.getName());
  }
}
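
The same ordering can be tried out without a running Solr instance. The sketch below assumes the per-field document counts have already been copied into a plain Map (for example from getName() and getDocs() of each FieldInfo); the class name, method name and sample field names are invented for illustration and are not part of SolrJ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sorter working on field name -> document count.
class FieldCountSorter {

    // Returns the field names, most frequent first; ties are ordered by name.
    static List<String> sortByDocsDesc(Map<String, Integer> docsPerField) {
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<>(docsPerField.entrySet());
        entries.sort((a, b) -> {
            int byDocs = Integer.compare(b.getValue(), a.getValue());
            return byDocs != 0 ? byDocs : a.getKey().compareTo(b.getKey());
        });
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            names.add(entry.getKey());
        }
        return names;
    }

    public static void main(String[] args) {
        // Made-up sample counts, not taken from a real index.
        Map<String, Integer> docs = new LinkedHashMap<>();
        docs.put("tag_ss", 7);
        docs.put("title_s", 40);
        docs.put("id", 100);
        docs.put("price_f", 40);
        System.out.println(sortByDocsDesc(docs)); // [id, price_f, title_s, tag_ss]
    }
}
```

Keeping the sort independent of the SolrJ types makes it easy to unit test and to reuse if the counts come from the JSON response instead.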