functional programming algorithm for finding repea

2019-05-22 21:25发布

问题:

I am converting the "zxcvbn" password strength algorithm from JavaScript to Scala. I am looking for a pure functional algorithm for finding sequences of repeating characters in a string.

I know that I can translate the imperative version from JavaScript, but I would like to keep this as side-effect free as possible, for all the reasons usually given for functional programming.

The algorithm can be in Scala, Clojure, Haskell, F#, or even pseudocode.

Thanks.

回答1:

Using Haskell's standard higher-order functions:

  • Data.List.group finds runs of equal elements in a list:

    > group "abccccdefffg"
    ["a","b","cccc","d","e","fff","g"]
    
  • We care about the length of these runs, not the elements themselves:

    > let ls = map length $ group "abccccdefffg"
    > ls
    [1,1,4,1,1,3,1]
    
  • Next, we need the starting positions of each group. This is just the partial sums of the group lengths, which we can compute using scanl:

    > scanl (+) 0 ls
    [0,1,2,6,7,8,11,12]
    
  • Zipping these two lists gives us all pairs of starting positions and corresponding lengths:

    > zip (scanl (+) 0 ls) ls
    [(0,1),(1,1),(2,4),(6,1),(7,1),(8,3),(11,1)]
    
  • Finally, we remove the groups of length less than 3.

    > filter ((>= 3) . snd) $ zip (scanl (+) 0 ls) ls
    [(2,4),(8,3)]
    

Putting this together:

import Data.List (group)

findRuns :: Eq a => [a] -> [(Int, Int)]
findRuns xs = filter ((>= 3) . snd) $ zip (scanl (+) 0 ls) ls 
  where ls = map length $ group xs


回答2:

Here is another Scala solution

def findRuns(password: String): List[(Int, Int)] = {
  val zipped = password.zipWithIndex
  val grouped = zipped groupBy { case (c, i) => c }
  val filtered = grouped.toList.filter{ case (c, xs) => xs.length >= 3 }
  val mapped = filtered map { case (c, xs) => xs }
  val result = mapped map (xs => (xs.head._2, xs.length))
  result.sorted
}


回答3:

Clojure's version of hammar's algorithm

 (defn find-runs [s]
    (let [ls (map count (partition-by identity s))]
        (filter #(>= (% 1) 3) 
                (map vector (reductions + 0 ls) ls))))


回答4:

Here is a Scala solution that I came up with:

def repeatMatch(password: String): List[(Int, Int)] = {
  val length = password.length

  @tailrec
  def loop(offset: Int,
           result: List[(Int, Int)]): List[(Int, Int)] = {
    if (offset >= length) result.reverse
    else {
      val candidate = password.charAt(offset)
      val run = password.substring(offset).zip(Array.fill(length)(candidate)).
        takeWhile{case (first, second) => first == second}.length
      if (run > 2) loop(offset + run, (offset, run) :: result)
      else loop(offset + 1, result)
    }
  }

  loop(0, List.empty[(Int, Int)])
}

For the test case repeatMatch("abccccdefffg"), the result is List((2,4), (8,3))

Maybe the calculation of run could be improved.