C# regex: how to match a user's input against an array

Posted 2019-08-08 06:28

I have an array of words and phrases. The user types in a message, and I'm supposed to check it for matches against the words and phrases in the array. Each match adds 1 to the score, and if the score is more than 5, the message is reported as probably spam.

My score doesn't increase though and I'm not sure why.

string[] spam = new string[] {"-different words and phrases provided by programmer"};

        Console.Write("Key in an email message: ");
        string email = Console.ReadLine();
        int score = 0;

        string pattern = "^\\[a-zA-Z]";
        Regex expression = new Regex(pattern);
        var regexp = new System.Text.RegularExpressions.Regex(pattern);

        if (!regexp.IsMatch(email))
        {
            score += 1;
        }
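
A note on the snippet above: in the string "^\\[a-zA-Z]" the doubled backslash escapes the opening bracket, so the regex engine looks for the literal text "[a-zA-Z]" at the start of the input rather than for a letter. The spam array is never consulted, and the single if can add at most 1 to the score. A small check illustrating this (the sample inputs are made up for the illustration):

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternCheck
{
    public static void Main()
    {
        // Same pattern as in the question; the regex engine sees ^\[a-zA-Z]
        string pattern = "^\\[a-zA-Z]";

        // Ordinary text does not start with the literal "[a-zA-Z]", so no match.
        Console.WriteLine(Regex.IsMatch("win a free prize", pattern));   // False

        // Only input that literally begins with "[a-zA-Z]" matches.
        Console.WriteLine(Regex.IsMatch("[a-zA-Z] hello", pattern));     // True
    }
}
```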

2 answers
不美不萌又怎样
#2 · 2019-08-08 06:54
    using System;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main(string[] args)
        {
            string[] spam = new string[] { "test", "ak", "admin", "againadmin" };
            string email = "It's great to see that admin ak is not performing test.";
            string email1 = "It's great to see that admin ak is not performing test againadmin.";

            Console.WriteLine(SpamChecker(spam, email) ? "email spam" : "email not spam");
            Console.WriteLine(SpamChecker(spam, email1) ? "email1 spam" : "email1 not spam");

            Console.Read();
        }

        // Returns true as soon as the number of spam-term hits exceeds the threshold.
        private static bool SpamChecker(string[] spam, string email)
        {
            int score = 0;
            foreach (var item in spam)
            {
                // Regex.Escape treats the term as literal text, so characters
                // such as '.', '+' or '$' in a term are not read as regex syntax.
                score += Regex.Matches(email, Regex.Escape(item), RegexOptions.IgnoreCase).Count;
                if (score > 3) // change count as per desired count
                {
                    return true;
                }
            }

            return false;
        }
    }
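
One thing to be aware of with the code above: each term is searched for anywhere in the text, so a short entry such as "ak" or "admin" also counts when it sits inside a longer word ("againadmin" is a hit for "admin" as well). If whole-word hits are what you want, a minimal sketch could look like the following. CountHits is a hypothetical helper name, and the lookarounds are one possible way to express "no word character on either side", not the only one:

```csharp
using System;
using System.Text.RegularExpressions;

public static class WholeWordSpamCheck
{
    // Hypothetical helper: counts whole-word, case-insensitive hits of each entry.
    public static int CountHits(string[] spam, string email)
    {
        int score = 0;
        foreach (string item in spam)
        {
            // Regex.Escape makes the entry literal; the lookarounds require that
            // no word character touches the entry on either side.
            string pattern = @"(?<!\w)" + Regex.Escape(item) + @"(?!\w)";
            score += Regex.Matches(email, pattern, RegexOptions.IgnoreCase).Count;
        }
        return score;
    }

    public static void Main()
    {
        string[] spam = { "test", "ak", "admin", "againadmin" };
        string email1 = "It's great to see that admin ak is not performing test againadmin.";

        // "admin" inside "againadmin" is no longer counted, so the total is 4.
        Console.WriteLine(CountHits(spam, email1)); // 4
    }
}
```

Because the entries are escaped, multi-word phrases in the array work the same way as single words.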
看我几分像从前
#3 · 2019-08-08 06:56

You can use LINQ to solve the problem:

  // HashSet<String> is for better performance
  HashSet<String> spamWords = new HashSet<String>(
    "different words and phrases provided by programmer"
      .Split(new Char[] {' '}, StringSplitOptions.RemoveEmptyEntries)
      .Select(word => word.ToUpper()));

  ...

  String eMail = "phrases, not words and letters zzz";

  ... 

  // score == 3: "phrases" + "words" + "and"
  int score = Regex
    .Matches(eMail, @"\w+")
    .OfType<Match>()
    .Select(match => match.Value.ToUpper())
    .Sum(word => spamWords.Contains(word) ? 1 : 0);

In this implementation I'm looking for spam words in a case-insensitive manner (so And, and and AND will all be counted as spam words). To take plurals and -ing forms (e.g. word, wording) into account, you have to use a stemmer.
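
As a variation on the case-insensitive lookup described above, the set itself can be given a case-insensitive comparer, which removes the need for the ToUpper calls. This is only a sketch: Score is a hypothetical helper name, and the "more than 5" threshold is taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordSetSpamCheck
{
    // Hypothetical helper: counts the words of the message that are in the set.
    public static int Score(ISet<string> spamWords, string eMail)
    {
        return Regex.Matches(eMail, @"\w+")
            .Cast<Match>()
            .Count(match => spamWords.Contains(match.Value));
    }

    public static void Main()
    {
        // The comparer makes Contains ignore case, so no ToUpper calls are needed.
        var spamWords = new HashSet<string>(
            "different words and phrases provided by programmer".Split(' '),
            StringComparer.OrdinalIgnoreCase);

        int score = Score(spamWords, "Phrases, not WORDS and letters zzz");
        Console.WriteLine(score);       // 3
        Console.WriteLine(score > 5);   // False
    }
}
```

Like the original, this only matches single words; multi-word phrases in the list would need a different lookup.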
