How can I split a binary in Erlang?

Posted 2019-02-06 18:11

Question:

What I want is, I think, relatively simple:

> Bin = <<"Hello.world.howdy?">>.
> split(Bin, ".").
[<<"Hello">>, <<"world">>, <<"howdy?">>]

Any pointers?

Answer 1:

There is currently no OTP function that works on a binary string the way string:tokens/2 works on a list string. Until EEP-9 is made public, you might write a binary split function like this, where Chars is a list of separator characters such as ".":

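%% Split Binary at every byte that appears in Chars (a list of bytes, e.g. ".").
split(Binary, Chars) ->
    split(Binary, Chars, 0, 0, []).

%% Idx is the byte being examined; LastSplit is where the current piece starts.
split(Bin, Chars, Idx, LastSplit, Acc)
  when is_integer(Idx), is_integer(LastSplit) ->
    Len = (Idx - LastSplit),
    case Bin of
        <<_:LastSplit/binary,
         This:Len/binary,
         Char,
         _/binary>> ->
            case lists:member(Char, Chars) of
                false ->
                    split(Bin, Chars, Idx+1, LastSplit, Acc);
                true ->
                    %% Separator found: keep the piece and start a new one after it.
                    split(Bin, Chars, Idx+1, Idx+1, [This | Acc])
            end;
        <<_:LastSplit/binary,
         This:Len/binary>> ->
            %% End of input: whatever remains is the last piece.
            lists:reverse([This | Acc]);
        _ ->
            lists:reverse(Acc)
    end.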


Answer 2:

binary:split(Bin,<<".">>).
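Note that binary:split/2 without options splits only at the first occurrence of the separator. A minimal sketch of the difference, assuming the binary module is available in your release (the module name split_demo and its function names are made up for illustration):

```erlang
-module(split_demo).
-export([first/1, all/1]).

%% Without options, binary:split/2 splits at the first match only.
first(Bin) ->
    binary:split(Bin, <<".">>).

%% With the [global] option, binary:split/3 splits at every match.
all(Bin) ->
    binary:split(Bin, <<".">>, [global]).
```

With the binary from the question, split_demo:first/1 should give [<<"Hello">>,<<"world.howdy?">>], while split_demo:all/1 gives the three parts the question asks for.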


Answer 3:

The binary module from EEP 31 (and EEP 9) was added in ERTS 5.8 (see OTP-8217):

1> Bin = <<"Hello.world.howdy?">>.
<<"Hello.world.howdy?">>
2> binary:split(Bin, <<".">>, [global]).
[<<"Hello">>,<<"world">>,<<"howdy?">>]


Answer 4:

Here is a version of the binary split that is about 15% faster and works in R12B:

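%% Split Bin at every byte that appears in Chars (a list of bytes, e.g. ".").
split2(Bin, Chars) ->
    split2(Chars, Bin, 0, []).

%% Idx counts the bytes scanned so far in the current piece.
split2(Chars, Bin, Idx, Acc) ->
    case Bin of
        <<This:Idx/binary, Char, Tail/binary>> ->
            case lists:member(Char, Chars) of
                false ->
                    split2(Chars, Bin, Idx+1, Acc);
                true ->
                    %% Separator found: continue with the tail, restarting at 0.
                    split2(Chars, Tail, 0, [This|Acc])
            end;
        <<This:Idx/binary>> ->
            %% End of input: append the last piece to the reversed accumulator.
            lists:reverse(Acc, [This])
    end.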

If you are using R11B or older, use archaelus's version (the split/2 function in answer 1) instead.

The code above is faster only on standard BEAM bytecode; under HiPE, both versions perform almost the same.

EDIT: Note that this code has been obsoleted by the binary module, available since R14B. Use binary:split(Bin, <<".">>, [global]) instead.
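The hand-written versions above take a list of separator characters, and binary:split/3 can cover that case too, because its pattern argument may be a list of binaries. A rough sketch, assuming "." and "?" as the separators (the module name split_chars_demo and its function names are hypothetical):

```erlang
-module(split_chars_demo).
-export([split_any/1, split_any_trim/1]).

%% Split at every "." or "?"; a separator at the very end of the
%% input leaves an empty binary as the last element.
split_any(Bin) ->
    binary:split(Bin, [<<".">>, <<"?">>], [global]).

%% The trim option drops empty parts at the end of the result.
split_any_trim(Bin) ->
    binary:split(Bin, [<<".">>, <<"?">>], [global, trim]).
```

For <<"Hello.world.howdy?">>, split_any/1 should return [<<"Hello">>,<<"world">>,<<"howdy">>,<<>>] and split_any_trim/1 the same list without the trailing <<>>.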



Answer 5:

Here's one way:

re:split(<<"Hello.world.howdy?">>, "\\.").
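The separator here is a regular expression, so the dot has to be escaped; an unescaped "." would match any character. The shape of the result can be chosen with the return option. A small sketch (the module name re_split_demo and its function names are only for illustration):

```erlang
-module(re_split_demo).
-export([as_binaries/1, as_lists/1]).

%% "\\." is the Erlang string for the regex \. (a literal dot).
as_binaries(Bin) ->
    re:split(Bin, "\\.", [{return, binary}]).

%% Same split, but each part is returned as a list (string).
as_lists(Bin) ->
    re:split(Bin, "\\.", [{return, list}]).
```

Applied to <<"Hello.world.howdy?">>, as_binaries/1 should return [<<"Hello">>,<<"world">>,<<"howdy?">>] and as_lists/1 should return ["Hello","world","howdy?"].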


Tags: erlang binary