Suppose I have a string:
string1 = "hello hi goodmorning evening [...]"
and I have some short keyword phrases:
compare1 = "hello evening"
compare2 = "hello hi"
I need a function that returns the affinity between the text and the keywords. For example:
function(string1,compare1); // returns: 4
function(string1,compare2); // returns: 5 (more relevant)
Please note that 5 and 4 are just example values.
You could say "write a function that counts occurrences", but that would not work for this example: both phrases have 2 matching words, yet compare1 is less relevant because "hello evening" is not found exactly in string1 (the words hello and evening are farther apart than hello and hi).
Is there a known algorithm for this?
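To make the idea concrete, here is a rough sketch of the kind of scoring I have in mind. It is only an illustration, not a known or standard algorithm: the name `affinity` and the formula (number of matched keywords plus a bonus based on the shortest span of words that contains them all) are my own assumptions, and the sample text leaves out the elided part of string1.

```python
from collections import Counter


def affinity(text: str, query: str) -> float:
    """Hypothetical score: matched keyword count plus a proximity bonus."""
    tokens = text.lower().split()
    # Only keywords that actually occur in the text can contribute.
    wanted = set(query.lower().split()) & set(tokens)
    if not wanted:
        return 0.0

    # Sliding window: find the shortest run of tokens that
    # contains every matched keyword at least once.
    counts = Counter()
    covered = 0
    best = len(tokens)
    left = 0
    for right, tok in enumerate(tokens):
        if tok in wanted:
            counts[tok] += 1
            if counts[tok] == 1:
                covered += 1
        while covered == len(wanted):
            best = min(best, right - left + 1)
            head = tokens[left]
            if head in wanted:
                counts[head] -= 1
                if counts[head] == 0:
                    covered -= 1
            left += 1

    # The bonus is 1.0 when the keywords are adjacent and shrinks
    # as they spread out (assumed formula, chosen for illustration).
    return len(wanted) + len(wanted) / best


string1 = "hello hi goodmorning evening"
print(affinity(string1, "hello evening"))  # 2.5
print(affinity(string1, "hello hi"))       # 3.0 (more relevant)
```

This ranks compare2 above compare1 even though both match 2 words, which is the behaviour I am after. It ignores word order and punctuation, so it is at best a starting point.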
ADD1:
Algorithms like edit distance would NOT work in this case, because string1 is a complete text (around 300-400 words) and the strings being compared are at most 4-5 words long.
I think there is a pretty good and complete answer to this question here: http://answers.google.com/answers/threadview?id=337832
Sorry, it's on Google Answers!