We have set up an Solr index containing 36 million documents (~1K-2K each) and we try to query a maximum of 100 documents matching a single simple keyword. This works pretty fast as we had hoped for. However, if we now add "&sort=createDate+desc" to the query (thus asking for the top 100 'new' documents matching the query) it runs for a long, very long time and finally results in an OutOfMemoryException. From what I've understood from the manual this is caused by the fact that Lucene needs to load all the distinct values for this field (createDate) into memory (the FieldCache afaik) before it can execute the query. As the createDate field contains date and time the number of distinct values is pretty large. Also important to mention is that we frequently update the index.
Perhaps someone can provide some insights and directions on how we can tune Lucene / Solr or change our approach in such a way that query times become acceptable? Your input will be much appreciated! Thanks.