Erlang sublist function performance

I'm reading http://learnyousomeerlang.com/ which includes a tail recursive sublist function that reverses the list to preserve order. I have written an alternative that does not require the reverse call. Is mine more efficient (it is certainly more verbose, but I'm not concerned with that), or have I overlooked something?

-module(sublist).

-export([sublist/2,sublistR/2]).

-include_lib("eunit/include/eunit.hrl").

sublist(_,0) ->
    [];
sublist([],_) ->
    [];
sublist(List,Length) ->
    sublist([hd(List)], tl(List), Length-1).

sublist(Acc,[],_) ->
    Acc;
sublist(Acc,_,0) ->
    Acc;
sublist(Acc,Tail,Length) ->
    sublist(Acc++[hd(Tail)], tl(Tail), Length-1).

sublistR(L, N) -> lists:reverse(sublistR(L, N, [])).

sublistR(_, 0, SubList) -> SubList;
sublistR([], _, SubList) -> SubList;
sublistR([H|T], N, SubList) when N > 0 ->
    sublistR(T, N-1, [H|SubList]).

sublist_test() ->
    sublisttest(fun sublist:sublist/2),
    sublisttest(fun sublist:sublistR/2).

sublisttest(SublistFunc) ->
    [] = SublistFunc([],10),
    [] = SublistFunc([1,2,3], 0),
    [1,2,3] = SublistFunc([1,2,3],3),
    [1,2,3] = SublistFunc([1,2,3],4),
    [1,2] = SublistFunc([1,2,3],2).

标签： erlang

1条回答

Root（大扎）

2楼-- · 2019-05-25 07:22

No. But don't feel bad, this is a common thing everyone has to come to terms with early on.

Its all about this statement:

Acc ++ [hd(Tail)]

Let's say Acc = [1,2,3] is true. Then Acc ++ [hd(Tail)] is directly equivalent to [1,2,3 | [Head]]. What does this mean?

In this particular case it means it is exactly the same as writing out every element in Acc as the car and using the result of [hd(Tail)] as the cdr in a new cons. This means to join a single element to the end of an existing list, the entire list must be traversed (for deconstruction, to expose the final cdr). On the other hand, adding a single element to the beginning of a list is a simple operation.

The magic in the use of lists:reverse/1 (and the entire reason this is idiomatic in languages that use cons style lists) is that it is an array operation implemented in C, not an expansion in the object language you're writing right then.

This issue of access to the last element being expensive VS access to the first element is the reason hd/1 (or head() in other languages) returns the first element of a list, but tl/1 (or tail() in other languages) returns everything but the first element of a list. Uses of a function guaranteed to access the last element of a list are usually surrounded by warnings that "this is slow!".

This is also why you will very rarely see anyone use a foldr in almost any language. If they really need to fold from the right (say, because of a non-commutative operation) they will usually do lists:reverse/1 first, then call foldl. foldl can be tail-recursive, but a true foldr cannot.

Actually, the Erlang documentation has a mention of this: http://www.erlang.org/doc/man/lists.html#foldl-3

The cost of folding from the right increases as the length of the list increases. That is why this happens:

1> AMillionThings = lists:seq(1, 1000000).
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
 23,24,25,26,27,28,29|...]
2> OneThing = [0].
[0]
3> timer:tc(fun() -> OneThing ++ AMillionThings end).
{13,
 [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
  23,24,25,26|...]}
4> timer:tc(fun() -> AMillionThings ++ OneThing end).
{31291,
 [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
  23,24,25,26,27|...]}

13μs vs 31291μs. Ouch.

Also see:

0人赞添加讨论(0) 举报

Erlang sublist function performance

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间