F#: Count how many times a substring occurs within a string

Posted 2019-09-07 04:37

Question:

How could one count how many times a substring exists within a string?

I mean, if you have the string "one, two, three, one, one, two", how could you count that "one" is present 3 times?

I thought String.Contains would be able to do the job, but it only checks whether the substring is present at all. String.forall works on chars and is therefore not an option either.

So I am really at a complete halt here. Can someone enlighten me?

Answer 1:

You can use Regex.Escape to turn the string you're searching for into a regex, then use regex functions:

open System.Text.RegularExpressions

let countMatches wordToMatch (input : string) =
    Regex.Matches(input, Regex.Escape wordToMatch).Count

Test:

countMatches "one" "one, two, three, one, one, two"
// Output: 3
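
One thing worth knowing: Regex.Matches returns non-overlapping matches. If overlapping occurrences should count as well, a zero-width lookahead is one way to get them. This is only a sketch; countNonOverlapping and countOverlapping are names made up for the comparison, and treating an empty word as zero matches is my own assumption.

```fsharp
open System.Text.RegularExpressions

// Non-overlapping count, the same idea as countMatches above.
let countNonOverlapping (word : string) (input : string) =
    Regex.Matches(input, Regex.Escape word).Count

// Hypothetical variant: the lookahead (?=...) consumes no characters, so it
// matches at every position where the word starts, overlaps included.
let countOverlapping (word : string) (input : string) =
    if word.Length = 0 then 0 // assumption: an empty word counts as no matches
    else Regex.Matches(input, "(?=" + Regex.Escape word + ")").Count

printfn "%d" (countNonOverlapping "aa" "aaaa") // 2
printfn "%d" (countOverlapping "aa" "aaaa")    // 3
```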


Answer 2:

Here's a simple implementation that walks through the string, using String.IndexOf to skip through to the next occurrence of the substring, and counts up how many times it succeeds.

let substringCount (needle : string) (haystack : string) =
    let rec loop count (index : int) =
        if index >= String.length haystack then count
        else
            match haystack.IndexOf(needle, index) with
            | -1 -> count
            | idx -> loop (count + 1) (idx + 1)
    if String.length needle = 0 then 0 else loop 0 0

Bear in mind, this counts overlapping occurrences, e.g., substringCount "aa" "aaaa" = 3. If you want non-overlapping, simply replace idx + 1 with idx + String.length needle.
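
As a rough sketch of that switch, both behaviours can live in one function. The name substringCountWith and the allowOverlap flag are hypothetical, not part of the code above. I also pass StringComparison.Ordinal, which is my own choice: IndexOf(string, int) without it does a culture-sensitive search.

```fsharp
open System

// Hypothetical helper: 'allowOverlap' picks how far to move after each hit.
let substringCountWith (allowOverlap : bool) (needle : string) (haystack : string) =
    // Move by one character to allow overlaps, or past the whole match otherwise.
    let step = if allowOverlap then 1 else needle.Length
    let rec loop count (index : int) =
        // Stop once there is no room left for the needle.
        if index > haystack.Length - needle.Length then count
        else
            match haystack.IndexOf(needle, index, StringComparison.Ordinal) with
            | -1 -> count
            | idx -> loop (count + 1) (idx + step)
    if needle.Length = 0 then 0 else loop 0 0

printfn "%d" (substringCountWith true "aa" "aaaa")  // 3
printfn "%d" (substringCountWith false "aa" "aaaa") // 2
```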



Answer 3:

Create a sequence of the tails of the string to search in, that is, all the substrings anchored at its end. Then use Seq.forall2 to test the needle against the beginning of each tail, and count the tails that match. It's just golfier than (fun s -> s.StartsWith needle).

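let count (needle : string) (haystack : string) =
    // Only generate tails at least as long as the needle: Seq.forall2 stops at
    // the end of the shorter sequence, so a shorter tail could match by mistake.
    [ for i in 0..String.length haystack - String.length needle -> haystack.[i..] ]
    |> Seq.filter (Seq.forall2 (=) needle)
    |> Seq.length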

count "aba" "abacababac"
// val it : int = 3
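
For comparison, here is roughly what the less golfy StartsWith version could look like. It is a sketch only: countStartsWith is a made-up name, and the ordinal comparison and the zero result for an empty needle are my assumptions.

```fsharp
open System

// Hypothetical counterpart of 'count' that uses StartsWith on each tail.
let countStartsWith (needle : string) (haystack : string) =
    if needle.Length = 0 then 0 // assumption: an empty needle counts as no matches
    else
        // Tails shorter than the needle cannot match, so they are never built.
        seq { for i in 0 .. haystack.Length - needle.Length -> haystack.Substring i }
        |> Seq.filter (fun (tail : string) -> tail.StartsWith(needle, StringComparison.Ordinal))
        |> Seq.length

printfn "%d" (countStartsWith "aba" "abacababac") // 3
```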


Answer 4:

A fellow student of mine came up with the simplest solution I have seen so far.

let countNeedle (haystack :string) (needle : string) =
      match needle with
      | "" -> 0
      | _ -> (haystack.Length - haystack.Replace(needle, "").Length) / needle.Length
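
A quick check of the arithmetic, with the function repeated so the snippet runs on its own. Note that Replace removes non-overlapping occurrences, so this approach does not count overlaps.

```fsharp
// Same function as above, repeated so the snippet is self-contained.
let countNeedle (haystack : string) (needle : string) =
    match needle with
    | "" -> 0
    | _ -> (haystack.Length - haystack.Replace(needle, "").Length) / needle.Length

// The sample string has 30 characters; removing "one" three times
// leaves 21, and (30 - 21) / 3 = 3.
printfn "%d" (countNeedle "one, two, three, one, one, two" "one") // 3

// Non-overlapping: "aaaa" contains "aa" twice this way, not three times.
printfn "%d" (countNeedle "aaaa" "aa") // 2
```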


Answer 5:

// This approach assumes the data is comma-delimited.
let data = "one, two, three, one, one, two"
let dataArray = data.Split([|','|]) |> Array.map (fun x -> x.Trim())
let countSubstrings searchTerm = dataArray |> Array.filter (fun x -> x = searchTerm) |> Array.length
let countOnes = countSubstrings "one"

let data' = "onetwothreeoneonetwoababa"

// This recursive approach makes no assumptions about a delimiter,
// and it will count overlapping occurrences (e.g., "aba" twice in "ababa").
// This is similar to Jake Lishman's answer.
let rec countSubstringFromI s i what = 
    let len = String.length what
    if i + len - 1 >= String.length s then 0
    else (if s.Substring(i, len) = what then 1 else 0) + countSubstringFromI s (i + 1) what

let countSubStrings' = countSubstringFromI data' 0 "one"
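
countSubstringFromI is not tail-recursive, because the addition happens after the recursive call returns, so a very long input could exhaust the stack. A possible accumulator-based sketch is below. countSubstringAcc is a hypothetical name, and returning 0 for an empty search string is my assumption rather than the behaviour of the code above.

```fsharp
open System

// Hypothetical tail-recursive variant; counts overlapping occurrences.
let countSubstringAcc (s : string) (what : string) =
    let len = what.Length
    let rec loop i acc =
        // Stop when the search string is empty or no longer fits at index i.
        if len = 0 || i + len > s.Length then acc
        else
            // Compare 'len' characters of s starting at i against 'what'.
            let hit = String.CompareOrdinal(s, i, what, 0, len) = 0
            loop (i + 1) (if hit then acc + 1 else acc)
    loop 0 0

printfn "%d" (countSubstringAcc "onetwothreeoneonetwoababa" "one") // 3
printfn "%d" (countSubstringAcc "onetwothreeoneonetwoababa" "aba") // 2
```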