I'm reading http://learnyousomeerlang.com/ which includes a tail recursive sublist function that reverses the list to preserve order. I have written an alternative that does not require the reverse call. Is mine more efficient (it is certainly more verbose, but I'm not concerned with that), or have I overlooked something?
-module(sublist).
-export([sublist/2,sublistR/2]).
-include_lib("eunit/include/eunit.hrl").
sublist(_,0) ->
[];
sublist([],_) ->
[];
sublist(List,Length) ->
sublist([hd(List)], tl(List), Length-1).
sublist(Acc,[],_) ->
Acc;
sublist(Acc,_,0) ->
Acc;
sublist(Acc,Tail,Length) ->
sublist(Acc++[hd(Tail)], tl(Tail), Length-1).
sublistR(L, N) -> lists:reverse(sublistR(L, N, [])).
sublistR(_, 0, SubList) -> SubList;
sublistR([], _, SubList) -> SubList;
sublistR([H|T], N, SubList) when N > 0 ->
sublistR(T, N-1, [H|SubList]).
sublist_test() ->
sublisttest(fun sublist:sublist/2),
sublisttest(fun sublist:sublistR/2).
sublisttest(SublistFunc) ->
[] = SublistFunc([],10),
[] = SublistFunc([1,2,3], 0),
[1,2,3] = SublistFunc([1,2,3],3),
[1,2,3] = SublistFunc([1,2,3],4),
[1,2] = SublistFunc([1,2,3],2).
No. But don't feel bad, this is a common thing everyone has to come to terms with early on.
Its all about this statement:
Let's say
Acc = [1,2,3]
is true. ThenAcc ++ [hd(Tail)]
is directly equivalent to[1,2,3 | [Head]]
. What does this mean?In this particular case it means it is exactly the same as writing out every element in
Acc
as the car and using the result of[hd(Tail)]
as the cdr in a new cons. This means to join a single element to the end of an existing list, the entire list must be traversed (for deconstruction, to expose the final cdr). On the other hand, adding a single element to the beginning of a list is a simple operation.The magic in the use of
lists:reverse/1
(and the entire reason this is idiomatic in languages that use cons style lists) is that it is an array operation implemented in C, not an expansion in the object language you're writing right then.This issue of access to the last element being expensive VS access to the first element is the reason
hd/1
(orhead()
in other languages) returns the first element of a list, buttl/1
(ortail()
in other languages) returns everything but the first element of a list. Uses of a function guaranteed to access the last element of a list are usually surrounded by warnings that "this is slow!".This is also why you will very rarely see anyone use a
foldr
in almost any language. If they really need to fold from the right (say, because of a non-commutative operation) they will usually dolists:reverse/1
first, then callfoldl
.foldl
can be tail-recursive, but a truefoldr
cannot.Actually, the Erlang documentation has a mention of this: http://www.erlang.org/doc/man/lists.html#foldl-3
The cost of folding from the right increases as the length of the list increases. That is why this happens:
13μs vs 31291μs. Ouch.
Also see: