Erlang: check duplicate inserted elements

Posted 2019-07-25 19:40

Question:

I want to know if inserted elements are duplicated.

Here is a simple example of what I'm looking for:

On the first run it should return false.

check_duplicate("user", "hi").

But on the second run it should return true.

check_duplicate("user", "hi").

Answer 1:

One of the best features of functional programming is pure functions. There are even functional languages, like Haskell, where you can't write an impure function. A pure function always returns the same value for the same argument. An impure function has side effects and can return different results for the same argument, which means it has to change some state that is not visible as an argument to the function. That is exactly what you are asking for, and Erlang allows it.

There are several ways to do it. The cleanest is to send a message to another process and receive its reply. (This is still impure, but idiomatic in Erlang. The following code is very simple and not ready for production use; for that you should use OTP behaviours and design principles.)

has_dupes(Jid, Text) ->
    Ref = make_ref(),
    seen ! {Ref, self(), {Jid, Text}},
    receive {Ref, Result} -> Result end.

start_seen() ->
    Pid = spawn(fun() -> loop_seen([]) end),
    %% Register in the caller so the name exists before start_seen/0 returns.
    register(seen, Pid),
    Pid.

loop_seen(Seen) ->
    receive {Ref, From, Term} ->
        case lists:member(Term, Seen) of
            true  ->
                From ! {Ref, true},
                loop_seen(Seen);
            false ->
                From ! {Ref, false},
                loop_seen([Term|Seen])
        end
    end.
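The same idea can be expressed with an OTP behaviour, as recommended above. The following is only a minimal sketch, assuming a gen_server registered locally under a hypothetical module name seen_server; the module name and the {has_dupes, Term} request shape are illustrative choices, not part of the original answer, and supervision is left out.

```erlang
-module(seen_server).
-behaviour(gen_server).

-export([start_link/0, has_dupes/2]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Start the server and register it locally under the module name.
%% start_link/0 returns only after init/1 has finished, so the name
%% is registered before any caller can use has_dupes/2.
start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Synchronous call: true if {Jid, Text} was seen before, false otherwise.
has_dupes(Jid, Text) ->
    gen_server:call(?MODULE, {has_dupes, {Jid, Text}}).

%% The state is the list of terms seen so far, as in loop_seen/1.
init([]) ->
    {ok, []}.

handle_call({has_dupes, Term}, _From, Seen) ->
    case lists:member(Term, Seen) of
        true  -> {reply, true, Seen};
        false -> {reply, false, [Term | Seen]}
    end.

%% No casts are expected in this sketch; ignore them.
handle_cast(_Msg, Seen) ->
    {noreply, Seen}.
```

After seen_server:start_link(), the first seen_server:has_dupes("user", "hi") returns false and the second returns true. In a real system the server would normally be started by a supervisor.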

Another option is to store the terms in ETS (Erlang Term Storage) and read them from there.

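has_dupes(Jid, Text) ->
    %% Create the table on first use; ets:new/2 fails with badarg if it already exists.
    (catch ets:new(seen, [set, named_table])),
    %% ets:insert_new/2 returns false when the key is already present.
    not ets:insert_new(seen, {{Jid, Text}}).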

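But there is a catch: the table is owned by the process that created it and is deleted when that process dies, its name is global to the node, and with the default protected access only the owner process can write to it. Another option, and a much dirtier one, is to store and read the value in the process dictionary.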

has_dupes(Jid, Text) ->
    case get({Jid, Text}) of
        undefined ->
            put({Jid, Text}, seen),
            false;
        seen ->
            true
    end.

But this is nasty and you should almost never write code like this. In most cases you should use explicit state:

new_seen() -> [].

has_dupes(Jid, Text, Seen) ->
    Term = {Jid, Text},
    case lists:member(Term, Seen) of
        true  -> {true, Seen};
        false -> {false, [Term|Seen]}
    end.

This is usually the best solution because it is a pure function. When you need to track a larger number of terms, you can get better performance from a more suitable data structure such as sets or maps.
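As a rough illustration of the explicit-state approach with a map instead of a list, here is a minimal sketch. The module name seen_map and the helper flag_dupes/1 are hypothetical names introduced for this example; the map is used as a set keyed by {Jid, Text}.

```erlang
-module(seen_map).
-export([new_seen/0, has_dupes/3, flag_dupes/1]).

%% Empty state: a map used as a set, keyed by {Jid, Text}.
new_seen() -> #{}.

%% Returns {IsDuplicate, NewSeen}. The lookup is by key instead of
%% scanning a list with lists:member/2.
has_dupes(Jid, Text, Seen) ->
    Term = {Jid, Text},
    case maps:is_key(Term, Seen) of
        true  -> {true, Seen};
        false -> {false, Seen#{Term => true}}
    end.

%% Hypothetical helper: thread the state through a list of {Jid, Text}
%% pairs and return one boolean per pair, in input order.
flag_dupes(Pairs) ->
    {Flags, _Seen} =
        lists:mapfoldl(
          fun({Jid, Text}, Seen) -> has_dupes(Jid, Text, Seen) end,
          new_seen(),
          Pairs),
    Flags.
```

For example, seen_map:flag_dupes([{"user", "hi"}, {"user", "hi"}, {"other", "hi"}]) returns [false, true, false]. The caller owns the state and passes it along explicitly, so has_dupes/3 stays a pure function.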



Tags: erlang erl