I'm working on the application at http://demos.zatechcorp.com/codeigniter/
In the current version, which runs on my machine, I load Zend Framework inside CodeIgniter and generate an index like this:
// ... Some code that loads all the markets
foreach ($markets as $market)
{
    $doc = new Zend_Search_Lucene_Document();
    // Id for retrieval
    $doc->addField(Zend_Search_Lucene_Field::UnIndexed('id', $market->id));
    // Store document URL to identify it in search result.
    $doc->addField(Zend_Search_Lucene_Field::Text('url', $market->permalink));
    // Index document content
    $doc->addField(Zend_Search_Lucene_Field::UnStored('contents', $market->description));
    // Title
    $doc->addField(Zend_Search_Lucene_Field::Text('title', $market->title));
    // Phone
    $doc->addField(Zend_Search_Lucene_Field::Keyword('phone', $market->phone));
    // Fax
    $doc->addField(Zend_Search_Lucene_Field::Keyword('fax', $market->fax));
    // Street
    $doc->addField(Zend_Search_Lucene_Field::Keyword('street', $market->street));
    // City
    $doc->addField(Zend_Search_Lucene_Field::Keyword('city', $market->city));
    // State
    $doc->addField(Zend_Search_Lucene_Field::Keyword('state', $market->state));
    // Zip
    $doc->addField(Zend_Search_Lucene_Field::Keyword('zip', $market->zip));
    // Type
    $doc->addField(Zend_Search_Lucene_Field::UnIndexed('type', 'market'));
    // Store Document
    $index->addDocument($doc);
}
In my search, I do this:
$hits = $index->find($q);
This works with simple words, but when I search for a phrase like "Sheba Foods" (quotes included), it returns a single result, and it's the wrong one: it doesn't even contain the word "Sheba".
I moved away from MySQL full-text search because of its obvious problems, and can't make any headway with this.
I've been looking at the Zend_Search_Lucene_Search_QueryParser::parse() method. Does the answer lie in this method?
I have used MySQL full-text search in the past, but it's really CPU intensive.
You could always rely on a SELECT * FROM table WHERE column LIKE '%query%'
:)
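I figured it out. With Lucene, you can add a field named 'id', but reading id from a hit gives you something different. My first guess was that it's the hit's position within the search results; it appears to be the hit's own id property (the document's internal number in the index), which takes precedence over the stored field of the same name.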
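To illustrate why a stored field called 'id' can seem to return the wrong value, here is a minimal sketch. DemoHit and getStoredField() are hypothetical stand-ins written for this post, not Zend classes or methods. The sketch assumes the hit object has a real public id property and forwards every other property name to the stored fields, which is how Zend's hit object behaves as far as I can tell.

```php
<?php
// Hypothetical stand-in for a search hit; NOT the real Zend class.
// It mimics one behaviour: a real public property named "id" exists,
// and any other property name is forwarded to the stored fields.
class DemoHit
{
    public $id;      // internal document number assigned by the index
    public $score;   // relevance score of this hit
    private $fields; // stored field values of the matched document

    public function __construct($id, $score, array $fields)
    {
        $this->id = $id;
        $this->score = $score;
        $this->fields = $fields;
    }

    // PHP calls __get only for names that are not accessible properties,
    // so it is never reached for "id" or "score".
    public function __get($name)
    {
        return $this->fields[$name];
    }

    // Explicit accessor that bypasses the property lookup (hypothetical helper).
    public function getStoredField($name)
    {
        return $this->fields[$name];
    }
}

// Internal document number 0, stored 'id' field 42.
$hit = new DemoHit(0, 0.87, array('id' => 42, 'title' => 'Sheba Foods'));

var_dump($hit->id);                   // the public property, not the stored field
var_dump($hit->title);                // forwarded to the stored fields via __get
var_dump($hit->getStoredField('id')); // the stored field
```

In the sketch, $hit->id yields 0 (the internal number) while the stored value 42 is only reachable through the explicit accessor. With the real library, the equivalent explicit lookup should be $hit->getDocument()->getFieldValue('id'), though renaming the field avoids the clash entirely.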
What I had to do in this case was use a different field name like this: