Erlang: split binary on each char

I wrote a function that splits a binary into its individual characters, but I have a feeling there is an easier way to do this:

%% Slightly modified version of the example at
%% http://erlang.org/doc/efficiency_guide/binaryhandling.html
my_binary_to_list(<<H,T/binary>>) ->
    [list_to_binary([H])|my_binary_to_list(T)];
my_binary_to_list(<<>>) -> [].

> my_binary_to_list(<<"ABC">>).
[<<"A">>,<<"B">>,<<"C">>]

      

I think this is probably messy because of list_to_binary([H]), since H should already be a binary. I tried using the linked function directly, but got "AA", which is not what I wanted. Then I tried plain [H] and got ["A","B","C"], which is also not what I wanted.



2 answers


You can build a single-byte binary directly, without creating a list and calling list_to_binary, like this:

my_binary_to_list(<<H,T/binary>>) ->
    [<<H>>|my_binary_to_list(T)];
my_binary_to_list(<<>>) -> [].

      


You can also use a binary comprehension to express the same logic on a single line:

1> [<<X>> || <<X>> <= <<"ABC">>].
[<<"A">>,<<"B">>,<<"C">>]

      



You can also directly extract binaries of size 1 (this is probably not faster than the above):

2> [X || <<X:1/binary>> <= <<"ABC">>].
[<<"A">>,<<"B">>,<<"C">>]

      

Edit: A quick benchmark using timer:tc/1 runs the second comprehension (the <<X:1/binary>> one) in about half the time of the first, but you should benchmark it yourself before choosing one for performance reasons. Maybe the second one creates sub-binaries that share the large binary instead of creating new binaries?

1> Bin = binary:copy(<<"?">>, 1000000).
<<"????????????????????????????????????????????????????????????????????????????????????????????????????????????????????"...>>
2> timer:tc(fun() -> [<<X>> || <<X>> <= Bin] end).
{14345634,
 [<<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,
  <<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,
  <<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,
  <<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,<<...>>|...]}
3> timer:tc(fun() -> [X || <<X:1/binary>> <= Bin] end).
{7374003,
 [<<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,
  <<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,
  <<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,
  <<"?">>,<<"?">>,<<"?">>,<<"?">>,<<"?">>,<<...>>|...]}
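Funs entered in the shell are evaluated by the interpreter rather than run as compiled code, so timings taken there may not reflect how the same code behaves inside a module. A minimal sketch of the same comparison in a compiled module follows; the module name split_bench and the function names by_byte, by_subbin and run are made up for illustration and do not come from the answer.

```erlang
-module(split_bench).
-export([by_byte/1, by_subbin/1, run/1]).

%% Matches each byte as an integer and rebuilds it as a one-byte binary.
by_byte(Bin) ->
    [<<X>> || <<X>> <= Bin].

%% Matches each byte directly as a one-byte binary segment.
by_subbin(Bin) ->
    [X || <<X:1/binary>> <= Bin].

%% Times both versions on N copies of "?" and returns the elapsed
%% times in microseconds, as reported by timer:tc/1.
run(N) ->
    Bin = binary:copy(<<"?">>, N),
    {T1, R1} = timer:tc(fun() -> by_byte(Bin) end),
    {T2, R2} = timer:tc(fun() -> by_subbin(Bin) end),
    true = (R1 =:= R2),  % sanity check: both versions agree
    [{by_byte, T1}, {by_subbin, T2}].
```

After c(split_bench). in the shell, split_bench:run(1000000). repeats the measurement above. Run it several times; a single timer:tc/1 call is a rough measurement, not a rigorous benchmark.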

      



You can use a list comprehension with a bit string generator (<= consumes binaries, as opposed to <-, which consumes lists):

> [<<A>> || <<A>> <= <<"foo">>].
[<<"f">>,<<"o">>,<<"o">>]

      




In your version, list_to_binary([H]) can be replaced with <<H>>: both produce a binary containing one byte. Whether a list comprehension is "easier" than a recursive function may be a matter of taste.
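To see why the attempts in the question behaved as they did, it may help to put the variants side by side. In the pattern <<H,T/binary>>, H is matched as an integer byte, not as a binary, so it has to be wrapped in <<H>> to get a one-byte binary. The sketch below uses a hypothetical module split_demo with made-up function names.

```erlang
-module(split_demo).
-export([bytes/1, wrapped/1, split/1]).

%% H is an integer in 0..255, so this returns a list of integers.
%% A list of printable character codes is what Erlang calls a string.
bytes(<<H, T/binary>>) -> [H | bytes(T)];
bytes(<<>>) -> [].

%% Wrapping H in a list gives a list of one-character strings.
wrapped(<<H, T/binary>>) -> [[H] | wrapped(T)];
wrapped(<<>>) -> [].

%% <<H>> builds a one-byte binary from the integer H.
split(<<H, T/binary>>) -> [<<H>> | split(T)];
split(<<>>) -> [].
```

Called with <<"ABC">>, bytes/1 returns "ABC", wrapped/1 returns ["A","B","C"] as seen in the question, and split/1 returns [<<"A">>,<<"B">>,<<"C">>].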
