We're working with Solr 3.6 in an ASP.NET application, using the SolrNet library.
We have a custom program written in ASP.NET that incrementally adds documents to Solr using SolrNet and monitors the progress of the inserts.
The issue is that the application reports the process as completed, but when we query Solr we see only a few of the documents, not all of them. When we check again 15 minutes later, more documents are listed, roughly double the initial count. Note that we didn't run any process to add documents to Solr in the meantime.
Is this normal for Solr? Or can we expect all documents to be listed as soon as they are inserted and committed? What is the reason behind this behavior, and how should we handle it?
Edit 1
After an hour, we're able to query 80-90% of the documents from the application side, but the Solr Admin query page still doesn't list more than 25% of them.
Are you issuing a commit to Solr after your custom ASP.NET program has finished adding documents? New documents are not visible to searchers within Solr until they have been committed to the index.
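var solr = ServiceLocator.Current.GetInstance<ISolrOperations<IndexEntry>>();
// Adding a document does not make it searchable yet
solr.Add(entry);
// An explicit commit makes the added documents visible to searchers
solr.Commit();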
I am guessing that you are seeing documents appear after some time because your Solr instance is configured with some sort of <autoCommit> setting in your solrconfig.xml file. See the Solr documentation on solrconfig.xml for more details.
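For reference, an <autoCommit> block in solrconfig.xml typically looks something like the sketch below. The values here are illustrative assumptions, not the settings from your instance, so check your own file to see what is actually configured.

```xml
<!-- Illustrative values only; your solrconfig.xml may differ -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- commit once this many documents are pending -->
    <maxDocs>10000</maxDocs>
    <!-- or once the oldest pending document is this old, in milliseconds -->
    <maxTime>60000</maxTime>
  </autoCommit>
</updateHandler>
```

With settings like these, documents added without an explicit commit only become searchable when one of the thresholds is reached, which would match documents showing up gradually over time.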
One thing to try is passing the CommitWithin parameter via SolrNet, which tells Solr the maximum time within which to commit the document you have added. (Strictly speaking this is not a "soft commit"; soft commits only arrived in Solr 4.0.) Here is a small snippet that uses the CommitWithin option of AddParameters to tell Solr to commit the document within 5 seconds.
var solr = ServiceLocator.Current.GetInstance<ISolrOperations<IndexEntry>>();
// CommitWithin is in milliseconds: ask Solr to commit this document within 5 seconds
solr.Add(entry, new AddParameters { CommitWithin = 5000 });
I would recommend using the CommitWithin parameter rather than an explicit Commit(), as commits are expensive operations and Solr can better manage them itself.