I wanted to create the function genAllSize ::[a] -> [[a]]
, that receives a list l
and generates all the lists sorted by size that can be built with the elements of the list l
; i.e.
> genAllSize [2,4,8]
[[],[2],[4],[8],[2,2],[4,2],[8,2],[2,4],[4,4],[8,4],[2,8],[4,8],[8,8],[2,2,2],[4,2,2],[8,2,2], ...
How would you do it? I came up with a solution using permutations from Data.List
but I do not want to use it.
- Given an input list
xs
, select a prefix of that in a non deterministic way
- For each element in the prefix, replace it with any element of
xs
, in a non deterministic way
Result:
> xs = [2,4,8]
> inits xs >>= mapM (const xs)
[[],[2],[4],[8],[2,2],[2,4],[2,8],[4,2],[4,4],[4,8],[8,2],[8,4],
[8,8],[2,2,2],[2,2,4],[2,2,8],[2,4,2],[2,4,4],[2,4,8],[2,8,2],
[2,8,4],[2,8,8],[4,2,2],[4,2,4],[4,2,8],[4,4,2],[4,4,4],[4,4,8],
[4,8,2],[4,8,4],[4,8,8],[8,2,2],[8,2,4],[8,2,8],[8,4,2],[8,4,4],
[8,4,8],[8,8,2],[8,8,4],[8,8,8]]
Hmm I guess you a need a lazy infinite list of cycling subsequences. One naive way could be like
Prelude> take 100 $ nub . subsequences . cycle $ [2,4,8]
[[],[2],[4],[2,4],[8],[2,8],[4,8],[2,4,8],[2,2],[4,2],[2,4,2],[8,2],[2,8,2],[4,8,2],[2,4,8,2],[4,4],[2,4,4],[8,4],[2,8,4],[4,8,4],[2,4,8,4],[2,2,4],[4,2,4],[2,4,2,4],[8,2,4],[2,8,2,4],[4,8,2,4],[2,4,8,2,4],[8,8],[2,8,8],[4,8,8],[2,4,8,8],[2,2,8],[4,2,8],[2,4,2,8],[8,2,8],[2,8,2,8],[4,8,2,8],[2,4,8,2,8],[4,4,8],[2,4,4,8],[8,4,8],[2,8,4,8],[4,8,4,8],[2,4,8,4,8],[2,2,4,8],[4,2,4,8],[2,4,2,4,8],[8,2,4,8],[2,8,2,4,8],[4,8,2,4,8],[2,4,8,2,4,8],[2,2,2],[4,2,2],[2,4,2,2],[8,2,2],[2,8,2,2],[4,8,2,2],[2,4,8,2,2],[4,4,2],[2,4,4,2],[8,4,2],[2,8,4,2],[4,8,4,2],[2,4,8,4,2],[2,2,4,2],[4,2,4,2],[2,4,2,4,2],[8,2,4,2],[2,8,2,4,2],[4,8,2,4,2],[2,4,8,2,4,2]]
A simple and highly efficient option:
genAllSize [] = [[]]
genAllSize [a] = iterate (a:) []
genAllSize xs =
[] : [x:q|q<-genAllSize xs,x<-xs]
(Thanks to Will Ness for a small but very nice simplification.)
This solution takes advantage of the fact that a valid solution list is either empty or an element of the argument list consed onto a shorter valid solution list. Unlike Daniel Wagner's solution, this one doesn't resort to counting. My tests suggest that it performs extremely well under typical conditions.
Why do we need a special case for a one-element list? The general case performs extremely badly for that, because it maps over the same list over and over with no logarithmic slowdown.
But what's the deal with that call to genAllSizes
with the very same argument? Wouldn't it be better to save the result to increase sharing?
genAllSize [] = [[]]
genAllSize xs = p
where
p = [] : [x:q|q<-p,x<-xs]
Indeed, on a theoretical machine with unlimited constant-time memory, this is optimal: walking the list takes worst-case O(1) time for each cons. In practice, it's only a good idea if a great many entries will be realized and retained. Otherwise, there's a problem: most of the list entries will be retained indefinitely, dramatically increasing memory residency and the amount of work the garbage collector needs to do. The non-bold sharing version above still offers amortized O(1) time per cons, but it needs very little memory (logarithmic rather than linear).
Examples
genAllSize "ab" =
["","a","b","aa","ba"
,"ab","bb","aaa","baa"
,"aba","bba","aab","bab"
,"abb","bbb","aaaa",...]
genAllSize "abc" =
["","a","b","c","aa","ba"
,"ca","ab","bb","cb","ac"
,"bc","cc","aaa","baa"
,"caa","aba","bba","cba"
,"aca",...]
An explicit option
You can also use two accumulators:
genAllSize [] = [[]]
genAllSize [a] = iterate (a:) []
genAllSize (x:xs) = go ([], []) where
go (curr, remain) = curr : go (step curr remain)
step [] [] = ([x], [xs])
step (_:ls) ((r:rs):rss) =
(r:ls, rs:rss)
step (_:ls) ([] : rs) =
(x : ls', xs : rs')
where
!(ls', rs') = step ls rs
This version keeps track of the current "word" and also the remaining available "letters" in each position. The performance seems comparable in general, but a bit better with regard to memory residency. It's also much harder to understand!