I have a long, lazy sequence that I want to reduce and test lazily. As soon as two sequential elements are not `=` to each other (or fail some other predicate), I want to stop consuming the list, which is expensive to produce. Yes, this sounds like `take-while`, but read further.
I wanted to write something simple and elegant like this (pretending for a minute that `every?` works like `reduce`):
(every? = (range 100000000))
But that does not work lazily and so it hangs on infinite seqs. I discovered that this works almost as I wanted:
(apply = (range 100000000))
However, I noticed that sequence chunking was resulting in extra, unnecessary elements being created and tested. At least, that is what I think is happening in the following bit of code:
;; Displays chunking behavior in groups of four on my system and prints 1 2 3 4
(apply = (map #(do (println %) %) (iterate inc 1)))
;; This prints 0 to 31
(apply = (map #(do (println %) %) (range)))
I found a workaround using `take-while` and `count` to check the number of elements taken, but that is rather cumbersome.
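One shape such a workaround can take (a sketch with a made-up name, not the original code): compare the sequence against its own `rest` and count the run of equal pairs.

```clojure
(defn equal-prefix-count
  "Counts consecutive leading pairs of xs that are =.
  Consumes xs only until the first mismatching pair."
  [xs]
  (count (take-while true? (map = xs (rest xs)))))

;; Stops consuming at the first mismatch, even on an infinite seq:
(equal-prefix-count (concat [4 4 4 4] (range)))
;; => 3
```

The caller then has to compare that count against an expected length, which is what makes the approach cumbersome.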
Should I politely suggest to Rich Hickey that he make some combination of `reduce` and `every?` short-circuit properly, or am I missing some obvious way that already exists?
EDIT: Two kind people posted solutions for avoiding chunking on the lazy sequences, but how do I avoid chunking when doing the `apply`, which seems to be consuming in chunked groups of four?
EDIT #2: As Stuart Sierra notes, and as I discovered independently, this isn't actually chunking. It's just `apply` acting normally, so I'll call this closed and give him the answer. I included a small function in a separate answer to do the `reduce`-ing part of the problem, for those who are interested.
CORRECTED TWICE: A simpler way to un-chunk a lazy sequence:
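A version consistent with the corrections described just below, using `seq` inside a `when`; treat the exact body as a sketch rather than the original:

```clojure
(defn unchunk
  "Returns a fully lazy, one-element-at-a-time view of s,
  defeating the 32-element chunking of seqs like range."
  [s]
  (when (seq s)
    (lazy-seq
      (cons (first s)
            (unchunk (next s))))))
```

Mapping over `(unchunk (range))` then realizes one element per step instead of 32 at a time.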
The first version omitted the `(when ...)`, so it returned an infinite seq of nils after the input sequence ended. The second version used `first` instead of `seq`, so it stopped on nil.

RE: your other question, "how do I avoid chunking when doing the apply, which seems to be consuming in chunked groups of four":
This is due to the definition of `=`, which, when given a sequence of arguments, forces the first 4: its variadic `[x y & more]` arity binds two arguments up front and then looks ahead with `(next more)`, realizing a couple more.

Looking in clojure.core at the definition of `apply` makes it obvious why it was consuming in groups of four when `apply` is used with an infinite sequence. `reduce` doesn't short circuit either, so I'm left writing my own solution. Then, using Stuart's `unchunk` (with an extra `and`), I get the lazy, short-circuiting behavior I wanted.
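A short-circuiting, predicate-flavored reduce along these lines might look like the following (a sketch: the name `reducep` is the one used later in this thread, but the body here is a reconstruction, not the original):

```clojure
(defn reducep
  "Like reduce, but pred is applied to successive (previous, next)
  pairs. Short-circuits with false on the first failing pair;
  returns true if the seq runs out without a failure."
  ([pred coll]
   (if-let [s (seq coll)]
     (reducep pred (first s) (rest s))
     true))
  ([pred val coll]
   (if-let [s (seq coll)]
     (if (pred val (first s))
       (recur pred (first s) (rest s))
       false)
     true)))

;; Stops at the first unequal pair, even on an infinite seq:
(reducep = (concat [1 1 1 2] (range)))
;; => false
```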
If that works for you too, mod this up.
EDIT: Stuart's modified version of unchunk after his edit is probably preferable to the one in this earlier post.
I found this post while hitting a time limit on a 4clojure problem, and found another way to avoid the 32-chunks: the higher-arity forms of `map` don't appear to use chunked sequences (Clojure 1.5).
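A minimal demonstration of that (illustrative; the second sequence exists only to select map's multi-collection arity):

```clojure
;; chunked: realizing the first element runs f for the whole chunk
(first (map #(do (println %) %) (range)))
;; prints 0 through 31

;; multi-collection arity: not chunked, one element per step
(first (map (fn [x _] (println x) x) (range) (range)))
;; prints only 0
```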
You have to do something with the second parameter, so being explicit about that might be better.
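For example, making the throwaway second argument explicit (a sketch; `unchunked-map` is a made-up name):

```clojure
(defn unchunked-map
  "Maps f over coll one element at a time by routing through map's
  multi-collection arity; (repeat nil) just feeds an ignored arg."
  [f coll]
  (map (fn [x _] (f x)) coll (repeat nil)))

(first (unchunked-map #(do (println %) %) (range)))
;; prints only 0
```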
This isn't as neat as the other solutions but might be good for quick and dirty uses, like testing "is this slow because of chunking?".
Regarding `apply`, you could use `partition` to get pairs from the sequence and `=` them. Although `reducep` looks useful too.

PS. I don't want to give the impression that the chunked sequence is slower; it's not. My issue is that the 4clojure test case calls `first` on my seq-generating function over a range of values, so chunking means I do 32 times the work. (PPS. My code is still too slow.)
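The `partition` approach mentioned above might be sketched like this (illustrative, not the original snippet):

```clojure
;; Slide a window of 2 with step 1 and check each pair;
;; every? stops at the first pair that isn't =.
(defn pairwise= [coll]
  (every? #(apply = %) (partition 2 1 coll)))

(pairwise= [1 1 1 1])
;; => true
(pairwise= (concat [1 1 2] (range)))
;; => false (terminates despite the infinite tail)
```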