search List for string .StartsWith()

2020-04-07 04:28发布

I have a

List<string>

with 1500 strings. I am now using the following code to pull out only string that start with the string prefixText.

foreach(string a in <MYLIST>)
{            
    if(a.StartsWith(prefixText, true, null))
    {
        newlist.Add(a);                   
    }            
}

This is pretty fast, but I'm looking for google fast. Now my question is if I arrange the List in alphabetical order, then compare char by char can I make this faster? Or any other suggestions on making this faster?

10条回答
beautiful°
2楼-- · 2020-04-07 05:28

If you have the list in alpabetical order, you can use a variation of binary search to make it a lot faster.

As a starting point, this will return the index of one of the strings that match the prefix, so then you can look forward and backward in the list to find the rest:

public static int BinarySearchStartsWith(List<string> words, string prefix, int min, int max) {
  while (max >= min) {
    int mid = (min + max) / 2;
    int comp = String.Compare(words[mid].Substring(0, prefix.Length), prefix);
    if (comp < 0) {
      min = mid + 1;
    } else if (comp > 0) {
      max = mid - 1;
    } else {
      return mid;
    }
  }
  return -1;
}

int index = BinarySearchStartsWith(theList, "pre", 0, theList.Count - 1);
if (index == -1) {
  // not found
} else{
  // found
}

Note: If you use a prefix that is longer than any of the strings that are compared, it will break, so you might need to figure out how you want to handle that.

查看更多
仙女界的扛把子
3楼-- · 2020-04-07 05:28

So many approches were analyzed to achive minimum data capacity and high performance. The first place is: all prefixes are stored in dictionary: key - prefix, values - items appropriate for prefix.

Here simple implementation of this algorithm:

public class Trie<TItem>
{
    #region Constructors

    public Trie(
        IEnumerable<TItem> items,
        Func<TItem, string> keySelector,
        IComparer<TItem> comparer)
    {
        this.KeySelector = keySelector;
        this.Comparer = comparer;
        this.Items = (from item in items
                      from i in Enumerable.Range(1, this.KeySelector(item).Length)
                      let key = this.KeySelector(item).Substring(0, i)
                      group item by key)
                     .ToDictionary( group => group.Key, group => group.ToList());
    }

    #endregion

    #region Properties

    protected Dictionary<string, List<TItem>> Items { get; set; }

    protected Func<TItem, string> KeySelector { get; set; }

    protected IComparer<TItem> Comparer { get; set; }

    #endregion

    #region Methods

    public List<TItem> Retrieve(string prefix)
    {
        return  this.Items.ContainsKey(prefix)
            ? this.Items[prefix]
            : new List<TItem>();
    }

    public void Add(TItem item)
    {
        var keys = (from i in Enumerable.Range(1, this.KeySelector(item).Length)
                    let key = this.KeySelector(item).Substring(0, i)
                    select key).ToList();
        keys.ForEach(key =>
        {
            if (!this.Items.ContainsKey(key))
            {
                this.Items.Add(key, new List<TItem> { item });
            }
            else if (this.Items[key].All(x => this.Comparer.Compare(x, item) != 0))
            {
                this.Items[key].Add(item);
            }
        });
    }

    public void Remove(TItem item)
    {
        this.Items.Keys.ToList().ForEach(key =>
        {
            if (this.Items[key].Any(x => this.Comparer.Compare(x, item) == 0))
            {
                this.Items[key].RemoveAll(x => this.Comparer.Compare(x, item) == 0);
                if (this.Items[key].Count == 0)
                {
                    this.Items.Remove(key);
                }
            }
        });
    }

    #endregion
}
查看更多
唯我独甜
4楼-- · 2020-04-07 05:31

Thus 1500 is not really a huge number binary search on sorted list would be enough probably. Nevertheless most efficient algorithms for prefix search are based on the data structure named Trie or Prefix Tree. See: http://en.wikipedia.org/wiki/Trie

Following picture demonstrates the idea very briefly: enter image description here

For c# implementation see for instance .NET DATA STRUCTURES FOR PREFIX STRING SEARCH AND SUBSTRING (INFIX) SEARCH TO IMPLEMENT AUTO-COMPLETION AND INTELLI-SENSE

查看更多
孤傲高冷的网名
5楼-- · 2020-04-07 05:33

Have you tried implementing a Dictionary and comparing the results? Or, if you do put the entries in alphabetical order, try a binary search.

查看更多
登录 后发表回答