List of tuples [{id, []}, {id2, []} ]

Published 2019-08-31 09:07


叼着烟拽天下


Question:

The title is a bit confusing, so I will illustrate what I want to achieve:

I have:

[{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077790705827">>},
 {<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538078530667847">>},
 {<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077778390908">>},
 {<<"5b71d7e458c37fa04a7ce768">>,<<"5bad45b1e990057961313822">>,<<"1538082492283531">>}
]

I want to convert it to a list like this:

[ 
{<<"5b3f77502dfe0deeb8912b42">>,
   [{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077790705827">>},
    {<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538078530667847">>},
    {<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077778390908">>}
   ]},

{<<"5bad45b1e990057961313822">>,
   [{<<"5b71d7e458c37fa04a7ce768">>,<<"5bad45b1e990057961313822">>,<<"1538082492283531">>}
   ]}
]

That is, a list of tuples [{id, [<List>]}, {id2, [<List>]}], where each id is the second element of the tuples in the original list.

For example, the id of this tuple is <<"5b3f77502dfe0deeb8912b42">>:

<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077790705827">>

Answer 1:

Erlang newbie here. I created a dict with the second members of the tuples as keys and lists of corresponding tuples as values, then used dict:fold to transform it into the expected output format.

-module(transform).  % module name is an assumption; it must match the file name
-export([test/0, transform/1]).

transform([H|T]) ->
    transform([H|T], dict:new()).

transform([], D) ->
    lists:reverse(
      dict:fold(fun (Key, Tuples, Acc) ->
                        lists:append(Acc,[{Key,Tuples}])
                end,
                [],
                D));
transform([Tuple={_S1,S2,_S3}|T], D) ->
    transform(T, dict:append_list(S2, [Tuple], D)).

test() ->
    Input=[{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077790705827">>},
           {<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538078530667847">>},
           {<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077778390908">>},
           {<<"5b71d7e458c37fa04a7ce768">>,<<"5bad45b1e990057961313822">>,<<"1538082492283531">>}
          ],
    Output=transform(Input),
    case Output of
        [ 
          {<<"5b3f77502dfe0deeb8912b42">>,
           [{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077790705827">>},
            {<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538078530667847">>},
            {<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077778390908">>}
           ]},
          {<<"5bad45b1e990057961313822">>,
           [{<<"5b71d7e458c37fa04a7ce768">>,<<"5bad45b1e990057961313822">>,<<"1538082492283531">>}
           ]}
        ]     -> ok;
        _Else -> error
    end.

Answer 2:

I think I see what you're after... Please correct me if I'm wrong.

There are a number of ways to do this. It really just depends on what sort of data structure you want to use to check for the presence of like keys. I'll show you two fundamentally different ways to do this, and a third hybrid method that has recently become available:

1. Indexed data types (in this case a map)
2. List operations with matching
3. Hybrid matching over map keys

Since you're new I'll use the first case to demonstrate two ways of writing it: explicit recursion and using an actual list function from the lists module.

Indexy Data Types

The first way we'll do this is to use a hash table (aka "dict", "map", "hash", "K/V", etc.) and explicitly recurse through the elements. For each element we check whether its key is already present: if it is missing we add it, and if it is present we add the element to the list of values the key points to (the code below prepends, so each group ends up in reverse input order). We'll use an Erlang map for this. At the end of the function we'll convert the utility map back to a list:

explicit_convert(List) ->
    Map = explicit_convert(List, maps:new()),
    maps:to_list(Map).

explicit_convert([H | T], A) ->
    K = element(2, H),
    NewA =
        case maps:is_key(K, A) of
            true ->
                V = maps:get(K, A),
                maps:put(K, [H | V], A);
            false ->
                maps:put(K, [H], A)
        end,
    explicit_convert(T, NewA);
explicit_convert([], A) ->
    A.

There is nothing wrong with explicit recursion (it is particularly good if you're new, because every part of it is left in the open to be examined), but this is a "left fold" and we already have a library function that abstracts a little bit of the plumbing out. So we really only need to write a function that checks for the presence of an element, and adds the key or appends the value:

fun_convert(List) ->
    Map = lists:foldl(fun convert/2, maps:new(), List),
    maps:to_list(Map).

convert(H, A) ->
    K = element(2, H),
    case maps:is_key(K, A) of
        true ->
            V = maps:get(K, A),
            maps:put(K, [H | V], A);
        false ->
            maps:put(K, [H], A)
    end.
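
As a more compact variant of the same fold, the check-then-put logic can be handed to maps:update_with/4, which takes an update fun for an existing key and an initial value for a missing one. This is only a sketch: the module name group_update and the function name group_by_second are made up for illustration and do not come from the answer.

```erlang
-module(group_update).  % hypothetical module name
-export([group_by_second/1]).

%% Group 3-tuples by their second element using a left fold over a map.
group_by_second(List) ->
    Step = fun(T, Acc) ->
               K = element(2, T),
               %% Key present: prepend T to its list. Key missing: start with [T].
               maps:update_with(K, fun(Ts) -> [T | Ts] end, [T], Acc)
           end,
    maps:to_list(lists:foldl(Step, #{}, List)).
```

As with the versions above, tuples are prepended, so each group comes out in reverse input order; applying lists:reverse/1 to each group would restore the original order if that matters.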

Listy Conversion

The other major way we could have done this is with listy matching. To do that you need to first guarantee that your elements are sorted on the element you want to use as a key so that you can use it as a sort of "working element" and match on it. The code should be pretty easy to understand once you stare at it for a bit (maybe write out how it will step through your list by hand on paper once if you're totally perplexed):

listy_convert(List) ->
    [T = {_, K, _} | Rest] = lists:keysort(2, List),
    listy_convert(Rest, {K, [T]}, []).

listy_convert([T = {_, K, _} | Rest], {K, Ts}, Acc) ->
    listy_convert(Rest, {K, [T | Ts]}, Acc);
listy_convert([T = {_, K, _} | Rest], Done, Acc) ->
    listy_convert(Rest, {K, [T]}, [Done | Acc]);
listy_convert([], Done, Acc) ->
    [Done | Acc].

Note that we split the list immediately after sorting it. The reason is that we have to "prime the pump", so to speak, on the first call we make to listy_convert/3. This also means that this function will crash if you pass it an empty list. You can solve that by adding a clause to listy_convert/1 that matches on the empty list [].
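
A minimal sketch of that fix, assuming an empty input should simply produce an empty result. The module name listy_safe is made up for illustration; the listy_convert/3 clauses are unchanged from the code above.

```erlang
-module(listy_safe).  % hypothetical module name
-export([listy_convert/1]).

%% Empty input: nothing to group, so return an empty list.
listy_convert([]) ->
    [];
%% Non-empty input: sort on the key, then seed the working element
%% with the first tuple before walking the rest.
listy_convert(List) ->
    [T = {_, K, _} | Rest] = lists:keysort(2, List),
    listy_convert(Rest, {K, [T]}, []).

%% Same key as the working element: add the tuple to the current group.
listy_convert([T = {_, K, _} | Rest], {K, Ts}, Acc) ->
    listy_convert(Rest, {K, [T | Ts]}, Acc);
%% Different key: close the current group and start a new one.
listy_convert([T = {_, K, _} | Rest], Done, Acc) ->
    listy_convert(Rest, {K, [T]}, [Done | Acc]);
listy_convert([], Done, Acc) ->
    [Done | Acc].
```

The empty-list clause has to come first, because the second clause of listy_convert/1 matches any argument and would otherwise fail on the [T | Rest] match.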

A Final Bit of Magic

With those firmly in mind... consider that we also have a bit of a hybrid option available in newer versions of Erlang due to the magical syntax available to maps. We can match (most values) on map keys inside of a case clause (though we can't unify on a key value provided by other arguments within a function head):

map_convert(List) ->
    maps:to_list(map_convert(List, #{})).

map_convert([T = {_, K, _} | Rest], Acc) ->
    case Acc of
        #{K := Ts} -> map_convert(Rest, Acc#{K := [T | Ts]});
        _          -> map_convert(Rest, Acc#{K => [T]})
    end;
map_convert([], Acc) ->
    Acc.

Answer 3:

Here is a one-liner that would produce your expected result:

[{K, [E || {_, K2, _} = E <- List, K =:= K2]}  || {_, K, _} <- lists:ukeysort(2, List)].

What’s going on here? Let’s do it step by step…

This is your original list

List = […],

lists:ukeysort/2 leaves just one element per key in the list

OnePerKey = lists:ukeysort(2, List),

We then extract the keys with the first list comprehension

Keys = [K || {_, K, _} <- OnePerKey],

With the second list comprehension, we find the elements with the key…

Filter = fun(K, L) ->
  [E || {_, K2, _} = E <- L, K =:= K2]
end,

Keep in mind that we can’t just pattern-match with K in the generator (i.e. [E || {_, K, _} = E <- List]), because a variable in a generator pattern is always a fresh binding: it shadows the outer K instead of being compared against it, so every tuple would match.

Finally, putting all together…

[{K, Filter(K, List)} || K <- Keys]
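
For reference, the steps above can be assembled into one function, roughly as follows. The module name group_lc and the function name group are assumptions made for this sketch; the body only rearranges the expressions already shown.

```erlang
-module(group_lc).  % hypothetical module name
-export([group/1]).

%% Group 3-tuples by their second element using list comprehensions only.
group(List) ->
    %% One representative tuple per distinct key, sorted by key.
    OnePerKey = lists:ukeysort(2, List),
    Keys = [K || {_, K, _} <- OnePerKey],
    %% Select every tuple in L whose second element equals K.
    Filter = fun(K, L) -> [E || {_, K2, _} = E <- L, K =:= K2] end,
    [{K, Filter(K, List)} || K <- Keys].
```

This version keeps the tuples of each group in input order, but it scans the whole list once per distinct key, so it is better suited to inputs with few keys.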

Answer 4:

It really depends on your dataset. For larger data sets, using maps is a bit more efficient.

-module(test).
-export([test/3, v1/2, v2/2, v3/2, transform/1, do/2]).


test(N, Keys, Size) ->
    %% Build Size tuples whose key is a random integer in 1..Keys.
    List = [{<<"5b71d7e458c37fa04a7ce768">>,rand:uniform(Keys),<<"1538077790705827">>} || _ <- lists:seq(1,Size)],

    V1 = timer:tc(test, v1, [N, List]),
    V2 = timer:tc(test, v2, [N, List]),
    V3 = timer:tc(test, v3, [N, List]),
    io:format("V1 took: ~p, V2 took: ~p V3 took: ~p ~n", [V1, V2, V3]).


v1(N, List) when N > 0 ->
  [{K, [E || {_, K2, _} = E <- List, K =:= K2]}  || {_, K, _} <- lists:ukeysort(2, List)],
  v1(N-1, List);
v1(_,_) -> ok.

v2(N, List) when N > 0 ->
  do(List,maps:new()),
  v2(N-1, List);
v2(_,_) -> ok.

v3(N, List) when N > 0 ->
  transform(List),
  v3(N-1, List);
v3(_,_) -> ok.

do([], R) -> maps:to_list(R);

do([H={_,K,_}|T], R) ->
  case maps:get(K,R,null) of
    null -> NewR = maps:put(K, [H], R);
    V -> NewR = maps:update(K, [H|V], R)
  end,
  do(T, NewR).



transform([H|T]) ->
  transform([H|T], dict:new()).

transform([], D) ->
  lists:reverse(
    dict:fold(fun (Key, Tuples, Acc) ->
                    lists:append(Acc,[{Key,Tuples}])
            end,
            [],
            D));
transform([Tuple={_S1,S2,_S3}|T], D) ->
  transform(T, dict:append_list(S2, [Tuple], D)).

Running all three versions once with 100 unique keys and 100,000 records (timer:tc reports the time in microseconds), I get:

> test:test(1,100,100000).
V1 took: {75566,ok}, V2 took: {32087,ok} V3 took: {887362,ok} 
ok

Tags: erlang
