I haven't used Lucene. The last time I asked (many months ago, maybe a year), people suggested Lucene. If I shouldn't use Lucene, what should I use? As an example, say there are items tagged like this:
- apples carrots
- apples
- carrots
- apples banana
If a user searches for apples, I don't care how 1, 2 and 4 are ordered relative to each other. However, something I have seen on many forums, and HATED, is that when a user searches for apples carrots, 2 and 3 rank high while 1 is hard to find, even though it matches the search more closely.
I would also like to be able to search for carrots -apples, which should return only 3. I am not sure what should happen if I search for carrots banana, but as long as items tagged like 2 and 3 rank lower than 1 when I search for apples carrots, I'll be happy.
Can Lucene do this, and where do I start? When I tried looking it up, I saw a lot of classes and tutorials talking about documents and web pages, but none were clear about what to do when I want to tag something. If not Lucene, what should I use for tagging?
Edit: You can use Lucene. Here's an explanation of how to do this in Lucene.net. Some Lucene basics are:
Please read this blog post about creating and using a Lucene.net index.
I assume you are tagging blog posts. If I am totally wrong, please say so. In order to search for tags, you need to represent them as Lucene entities, namely as tokens inside a "tags" field.
One way of doing so is to assign one Lucene document to each blog post. The document will have at least the following fields:
Indexing: Whenever you add a tag to a post, remove a tag or edit one, you will need to re-index the post. The Analyzer will transform the fields into their token representation.
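To make "token representation" a bit more concrete, here is a minimal plain-Python sketch of what an analyzer conceptually does to a tags field. This is not Lucene code: `analyze_tags` is a hypothetical helper, and splitting on whitespace and lowercasing is only an assumption that approximates a simple analyzer.

```python
from typing import List


def analyze_tags(field_value: str) -> List[str]:
    """Split a space-separated tags field into lowercase tokens (hypothetical helper)."""
    return [token.lower() for token in field_value.split()]


# A post tagged "Apples carrots" ends up as two searchable tokens.
print(analyze_tags("Apples carrots"))  # ['apples', 'carrots']
```

Each token becomes a term that a query on the tags field can match, which is why the post has to be re-indexed whenever its tags change.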
The remaining part is retrieval. For this, you need to create a QueryParser and pass it a query string, like this:
The syntax you need for the query string s will be:
To search for apples or carrots
See the Lucene Query Parser Syntax for details on constructing s.
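As a rough sketch of what such a query string looks like, the plain-Python helper below builds one for a list of tags. `build_tag_query` is a hypothetical name, and the field name `tags` is taken from the description above. It relies on the standard query parser treating space-separated clauses as OR by default.

```python
def build_tag_query(tags, field="tags"):
    """Join tags into a query string; space-separated clauses are OR-ed by default."""
    return " ".join(f"{field}:{tag}" for tag in tags)


# Search for posts tagged apples or carrots.
s = build_tag_query(["apples", "carrots"])
print(s)  # tags:apples tags:carrots
```

The resulting string is what you would hand to the query parser.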
Lucene for .NET seems to be mature. There is no need to use Java or Solr.
The standard query language for Lucene supports equally weighted search terms and negation.
So if your Lucene index had a field "tag", your query would be:
This gives each word equal weight and ranks documents that carry both tags higher.
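For the question's example, a query of that shape would be `tag:apples tag:carrots`. The plain-Python sketch below illustrates the ranking effect on the four items from the question. It is only an illustration under simplifying assumptions: it scores by the number of matching tags, whereas Lucene's real scoring also takes term frequency and field length into account. `ITEMS` and `rank` are hypothetical names.

```python
ITEMS = {
    1: {"apples", "carrots"},
    2: {"apples"},
    3: {"carrots"},
    4: {"apples", "banana"},  # assumption: the question's "apple banana" read as "apples banana"
}


def rank(query_tags):
    """Return (item_id, match_count) pairs, best match first, dropping non-matches."""
    wanted = set(query_tags)
    scored = [(item_id, len(tags & wanted)) for item_id, tags in ITEMS.items()]
    matches = [pair for pair in scored if pair[1] > 0]
    # Sort by match count descending, then by item id for a stable order.
    return sorted(matches, key=lambda pair: (-pair[1], pair[0]))


print(rank(["apples", "carrots"]))  # [(1, 2), (2, 1), (3, 1), (4, 1)]
```

Item 1 comes first because it matches both tags, and the items that match only one tag follow it.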
To negate a tag, use this:
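In the standard query syntax, a leading minus sign on a clause excludes documents that match it, so for the question's example the query string would be along the lines of `tag:carrots -tag:apples`. The plain-Python sketch below mimics that filter on the four items from the question. It is not Lucene code, and `ITEMS` and `search` are hypothetical names.

```python
ITEMS = {
    1: {"apples", "carrots"},
    2: {"apples"},
    3: {"carrots"},
    4: {"apples", "banana"},  # assumption: the question's "apple banana" read as "apples banana"
}


def search(wanted_tags, excluded=()):
    """Return ids of items with at least one wanted tag and none of the excluded tags."""
    wanted, banned = set(wanted_tags), set(excluded)
    return [
        item_id
        for item_id, tags in ITEMS.items()
        if tags & wanted and not tags & banned
    ]


# Same intent as the query string: tag:carrots -tag:apples
print(search(["carrots"], excluded=["apples"]))  # [3]
```

Only item 3 survives: item 1 has carrots but is dropped because it is also tagged apples.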
A simple example showing indexing and querying with Lucene is here.