How to use LINQ to SQL to create ranked search res

Posted 2019-05-22 12:30

I am looking for a way to use LINQ to SQL to return ranked results based on keywords.

I would like to take a keyword and search the table for it using .Contains(). The part I haven't been able to figure out is how to get a count of how many times that keyword appears, and then .OrderByDescending() based on that count.

So if I had something like:

string keyword = "SomeKeyword";

IQueryable<Article> searchResults = from a in GenesisRepository.Article
                                    where a.Body.Contains(keyword)
                                    select a;

What is the best way to order searchResults based on the number of times keyword appears in a.Body?

Thanks for any help.

3 answers
走好不送
#2 · 2019-05-22 12:46

Try inserting orderby a.Body.Split(' ').Count(w => w == keyword) descending. That should let you see that the concept works. However, I STRONGLY recommend that the final version compute the count as part of the select projection, possibly using a key-value pair, and order by that property:

string keyword = "SomeKeyword";

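//EDIT: restructured query to force the ordering to be done on the projection,
//not the source.
//The filter runs in SQL; AsEnumerable() then switches to LINQ to Objects,
//because LINQ to SQL has no SQL translation for Split().
IEnumerable<Article> searchResults = GenesisRepository.Article
    .Where(a => a.Body.Contains(keyword))
    .AsEnumerable()
    .Select(a => new KeyValuePair<int, Article>(
        a.Body.Split(' ').Count(w => w == keyword), a))
    .OrderByDescending(kvp => kvp.Key)   // highest keyword count first
    .Select(kvp => kvp.Value);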

The reason is performance: the Split().Count() method chain is linear in the length of the body, and if it were evaluated for every comparison of two values, the overall sort would be roughly N^2logN complexity (slow). Computing the count once per article in the projection avoids that.

EDIT: Also, understand that a.Body.Contains(keyword) will not search by whole words, and so will return articles that contain "SomeKeywordLongerThanSearch" and "ThisIsSomeKeyword" as well as "SomeKeyword". You can avoid this with a Regex match on the pattern "\bSomeKeyword\b", which will only match instances of SomeKeyword with a word boundary immediately before and after.
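The whole-word match described above could be sketched as follows. KeywordMatcher and CountWholeWord are hypothetical names used only for illustration, and the sketch assumes the keyword starts and ends with a word character, since \b behaves differently next to punctuation.

```csharp
using System;
using System.Text.RegularExpressions;

public static class KeywordMatcher
{
    // Counts occurrences of keyword that have a word boundary on both sides.
    // Regex.Escape keeps characters such as '.' or '+' in the keyword literal.
    public static int CountWholeWord(string body, string keyword)
    {
        if (string.IsNullOrEmpty(body) || string.IsNullOrEmpty(keyword))
            return 0;

        string pattern = @"\b" + Regex.Escape(keyword) + @"\b";
        return Regex.Matches(body, pattern).Count;
    }
}
```

Because Regex has no SQL translation in LINQ to SQL, a count like this would most likely have to be used as a sort key only after the rows have been brought into memory, for example after AsEnumerable().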

倾城 Initia
#3 · 2019-05-22 12:51

This is a little hack I came up with, pretty simple but definitely not a "best practices" one.

// Filter in SQL, then sort in memory: LINQ to SQL has no SQL translation for Split().
IEnumerable<Article> searchResults = GenesisRepository.Article
    .Where(a => a.Body.Contains(keyword))
    .AsEnumerable()
    .OrderByDescending(a => a.Body
        .Split(new string[] { keyword }, StringSplitOptions.RemoveEmptyEntries)
        .Count());
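One caveat with the Split hack: with RemoveEmptyEntries it counts the pieces between matches rather than the matches themselves, so "a KEY b KEY c" gives 3 for two occurrences while "KEY a" gives 1 for one occurrence. If an exact count matters, a small helper along these lines might serve instead. KeywordCounter and CountOccurrences are hypothetical names, and the sketch assumes a case-sensitive, non-overlapping match.

```csharp
using System;

public static class KeywordCounter
{
    // Counts non-overlapping occurrences of keyword in body
    // using an ordinal (case-sensitive) comparison.
    public static int CountOccurrences(string body, string keyword)
    {
        if (string.IsNullOrEmpty(body) || string.IsNullOrEmpty(keyword))
            return 0;

        int count = 0;
        int index = 0;
        while ((index = body.IndexOf(keyword, index, StringComparison.Ordinal)) >= 0)
        {
            count++;
            index += keyword.Length; // resume the search after this match
        }
        return count;
    }
}
```

Like Split, this runs in memory, so it would be applied after the Contains filter has been executed by the database.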
Emotional °昔
#4 · 2019-05-22 12:57

Maybe this will work...

IQueryable<Article> searchResults = from a in GenesisRepository.Article
                                        where a.Body.Contains(keyword)
                                        select a;

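// Regex has no SQL translation, so sort in memory. The lambda parameter is s,
// and the ordered sequence has to be assigned because OrderByDescending returns a new one.
IEnumerable<Article> ranked = searchResults
    .AsEnumerable()
    .OrderByDescending(s => Regex.Matches(s.Body, Regex.Escape(keyword)).Count);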