High performance “contains” search in list of stri

Posted 2019-01-14 03:20

I have a list of approx. 500,000 strings, each approx. 100 characters long. Given a search term, I want to identify all strings in the list that contain the search term. At the moment I am doing this with a plain old dataset using the Select method ("MATCH %term%"). This takes about 600ms on my laptop. I'd like to make it faster, maybe 100-200ms.

What would be a recommended approach?

Performance is critical so I can trade memory footprint for better performance if necessary (within reason). The list of strings will not change once initialised so calculating hashes would also be an option.

Does anyone have a recommendation, and which C# data structures are best suited to the task?

7 answers
何必那么认真
#2 · 2019-01-14 04:19
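using System.Collections.Generic;
public static class ListExtensions // extension methods must live in a static class
{
    public static bool ContainsFast<T>(this IList<T> list, T item)
    {
        return list.IndexOf(item) >= 0; // true when an element equal to item exists
    }
}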

Based on the tests I ran, this variation of Contains was about 33% faster on my machine.
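
Note that IndexOf on a list compares whole elements, whereas the question asks for strings that contain the term as a substring. A minimal sketch of that case follows, assuming ordinal (case-sensitive) matching is acceptable and that spreading the scan across cores with PLINQ is allowed. The names SubstringSearch and FindContaining are hypothetical, and whether this reaches the 100-200ms target would have to be measured on the real data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SubstringSearch
{
    // Hypothetical helper: returns every string in items that contains term.
    // Assumption: ordinal comparison, which skips culture-aware rules.
    public static List<string> FindContaining(IReadOnlyList<string> items, string term)
    {
        return items
            .AsParallel()   // scan partitions of the list on several cores
            .AsOrdered()    // keep results in the original list order
            .Where(s => s.IndexOf(term, StringComparison.Ordinal) >= 0)
            .ToList();
    }
}
```

Usage would be something like var hits = SubstringSearch.FindContaining(allStrings, "term"); where allStrings is the list built once at start-up.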
