I am using Zend_Search_Lucene to implement site search. I created separate indices for different data types (e.g. one for users, one for posts etc). The results are similarly divided by data type however there is an 'all' option which should show a combination of the different result types. Is it possible to search across the different indices at once? or do i have to index everything in an all index?
Update: The readme for ZF 1.8 suggests that it's now possible to do in ZF 1.8 but I've been unable to track down where this is at in the documentation.
So after some research you have to use Zend_Search_Lucene_Interface_MultiSearcher. I don't see any mention of it in the documentation as of this writing but if you look at the actual class in ZF 1.8 it's straightforward t use
$index = new Zend_Search_Lucene_Interface_MultiSearcher();
$index->addIndex(Zend_Search_Lucene::open('search/index1'));
$index->addIndex(Zend_Search_Lucene::open('search/index2'));
$index->find('someSearchQuery');
NB it doesn't follow PEAR syntax so won'w work with Zend_Loader::loadClass
That's exactly how I handled search for huddler.com. I used multiple Zend_Search_Lucene indexes, one per datatype. For the "all" option, I simply had another index, which included everything from all indexes -- so when I added docs to the index, I added them twice, once to the appropriate "type" index, and once to the "all" index. Zend Lucene is severely underfeatured compared to other Lucene implementations, so this was the best solution I found. You'll find that Zend's port supports only a subset of the lucene query syntax, and poorly -- even on moderate indexes (10-100 MB), queries as simple as "a*", or quoted phrases fail to perform adequately (if at all).
When we brought a large site onto our platform, we discovered that Zend Lucene doesn't scale. Our index reached roughly 1.0 GB, and simple queries took up to 15 seconds. Some queries took a minute or longer. And building the index from scratch took about 20 hours.
I switched to Solr; Solr not only performs 50x faster during indexing, and 1000x faster for many queries (most queries finish in < 5ms, all finish in < 100ms), it's far more powerful. Also, we were able to rebuild our 100,000+ document index from scratch in 30 minutes (down from 20 hours).
Now, everything's in one Solr index with a "type" field; I run multiple queries against the index for each search, each one with a different "type:" filter query, and one without a "type:" for the "all" option.
If you plan on growing your index to 100+ MB, you receive at least a few search requests per minute, or you want to offer any sort of advanced search functionality, I strongly recommend abandoning Zend_Search_Lucene.
I don't how it integrates with Zend, but in Lucene one would use a MultiSearcher, instead of the usual IndexSearcher.