I am converting the "zxcvbn" password strength algorithm from JavaScript to Scala. I am looking for a pure functional algorithm for finding sequences of repeating characters in a string.
I know that I can translate the imperative version from JavaScript, but I would like to keep this as side-effect free as possible, for all the reasons usually given for functional programming.
The algorithm can be in Scala, Clojure, Haskell, F#, or even pseudocode.
Thanks.
Using Haskell's standard higher-order functions:
Data.List.group
finds runs of equal elements in a list:
> group "abccccdefffg"
["a","b","cccc","d","e","fff","g"]
We care about the length of these runs, not the elements themselves:
> let ls = map length $ group "abccccdefffg"
> ls
[1,1,4,1,1,3,1]
Next, we need the starting positions of each group. This is just the partial sums of the group lengths, which we can compute using scanl
:
> scanl (+) 0 ls
[0,1,2,6,7,8,11,12]
Zipping these two lists gives us all pairs of starting positions and corresponding lengths:
> zip (scanl (+) 0 ls) ls
[(0,1),(1,1),(2,4),(6,1),(7,1),(8,3),(11,1)]
Finally, we remove the groups of length less than 3.
> filter ((>= 3) . snd) $ zip (scanl (+) 0 ls) ls
[(2,4),(8,3)]
Putting this together:
import Data.List (group)
findRuns :: Eq a => [a] -> [(Int, Int)]
findRuns xs = filter ((>= 3) . snd) $ zip (scanl (+) 0 ls) ls
where ls = map length $ group xs
Here is another Scala solution
def findRuns(password: String): List[(Int, Int)] = {
val zipped = password.zipWithIndex
val grouped = zipped groupBy { case (c, i) => c }
val filtered = grouped.toList.filter{ case (c, xs) => xs.length >= 3 }
val mapped = filtered map { case (c, xs) => xs }
val result = mapped map (xs => (xs.head._2, xs.length))
result.sorted
}
Clojure's version of hammar's algorithm
(defn find-runs [s]
(let [ls (map count (partition-by identity s))]
(filter #(>= (% 1) 3)
(map vector (reductions + 0 ls) ls))))
Here is a Scala solution that I came up with:
def repeatMatch(password: String): List[(Int, Int)] = {
val length = password.length
@tailrec
def loop(offset: Int,
result: List[(Int, Int)]): List[(Int, Int)] = {
if (offset >= length) result.reverse
else {
val candidate = password.charAt(offset)
val run = password.substring(offset).zip(Array.fill(length)(candidate)).
takeWhile{case (first, second) => first == second}.length
if (run > 2) loop(offset + run, (offset, run) :: result)
else loop(offset + 1, result)
}
}
loop(0, List.empty[(Int, Int)])
}
For the test case repeatMatch("abccccdefffg")
, the result is List((2,4), (8,3))
Maybe the calculation of run
could be improved.