How do you URL encode parameters in Erlang?

2020-02-26 04:38发布

问题:

I'm using httpc:request to post some data to a remote service. I have the post working but the data in the body() of the post comes through as is, without any URL-encoding which causes the post to fail when parsed by the remote service.

Is there a function in Erlang that is similar to CGI.escape in Ruby for this purpose?

回答1:

You can find here the YAWS url_encode and url_decode routines

They are fairly straightforward, although comments indicate the encode is not 100% complete for all punctuation characters.



回答2:

I encountered the lack of this feature in the HTTP modules as well.

It turns out that this functionality is actually available in the erlang distribution, you just gotta look hard enough.

> edoc_lib:escape_uri("luca+more@here.com").
"luca%2bmore%40here.com"

This behaves like CGI.escape in Ruby, there is also URI.escape which behaves slightly differently:

> CGI.escape("luca+more@here.com")
 => "luca%2Bmore%40here.com" 
> URI.escape("luca+more@here.com")
 => "luca+more@here.com" 

edoc_lib



回答3:

At least in R15 there is http_uri:encode/1 which does the job. I would also not recommend using edoc_lib:escape_uri as its translating an '=' to a %3d instead of a %3D which caused me some trouble.



回答4:

Here's a simple function that does the job. It's designed to work directly with inets httpc.

%% @doc A function to URL encode form data.
%% @spec url_encode(formdata()).

-spec(url_encode(formdata()) -> string()).
url_encode(Data) ->
    url_encode(Data,"").

url_encode([],Acc) ->
    Acc;

url_encode([{Key,Value}|R],"") ->
    url_encode(R, edoc_lib:escape_uri(Key) ++ "=" ++ edoc_lib:escape_uri(Value));
url_encode([{Key,Value}|R],Acc) ->
    url_encode(R, Acc ++ "&" ++ edoc_lib:escape_uri(Key) ++ "=" ++ edoc_lib:escape_uri(Value)).

Example usage:

httpc:request(post, {"http://localhost:3000/foo", [], 
                    "application/x-www-form-urlencoded",
                    url_encode([{"username", "bob"}, {"password", "123456"}])}
             ,[],[]).


回答5:

If someone need encode uri that works with utf-8 in erlang:

https://gist.github.com/3796470

Ex.

Eshell V5.9.1  (abort with ^G)

1> c(encode_uri_rfc3986).
{ok,encode_uri_rfc3986}

2> encode_uri_rfc3986:encode("テスト").
"%e3%83%86%e3%82%b9%e3%83%88"

3> edoc_lib:escape_uri("テスト").
"%c3%86%c2%b9%c3%88" # output wrong: ƹÈ


回答6:

To answer my own question...I found this lib in ibrowse!

http://www.erlware.org/lib/5.6.3/ibrowse-1.4/ibrowse_lib.html#url_encode-1

url_encode/1

url_encode(Str) -> UrlEncodedStr

Str = string()
UrlEncodedStr = string()

URL-encodes a string based on RFC 1738. Returns a flat list.

I guess I can use this to do the encoding and still use http:



回答7:

Here's a "fork" of the edoc_lib:escape_uri function that improves on the UTF-8 support and also supports binaries.

escape_uri(S) when is_list(S) ->
    escape_uri(unicode:characters_to_binary(S));
escape_uri(<<C:8, Cs/binary>>) when C >= $a, C =< $z ->
    [C] ++ escape_uri(Cs);
escape_uri(<<C:8, Cs/binary>>) when C >= $A, C =< $Z ->
    [C] ++ escape_uri(Cs);
escape_uri(<<C:8, Cs/binary>>) when C >= $0, C =< $9 ->
    [C] ++ escape_uri(Cs);
escape_uri(<<C:8, Cs/binary>>) when C == $. ->
    [C] ++ escape_uri(Cs);
escape_uri(<<C:8, Cs/binary>>) when C == $- ->
    [C] ++ escape_uri(Cs);
escape_uri(<<C:8, Cs/binary>>) when C == $_ ->
    [C] ++ escape_uri(Cs);
escape_uri(<<C:8, Cs/binary>>) ->
    escape_byte(C) ++ escape_uri(Cs);
escape_uri(<<>>) ->
    "".

escape_byte(C) ->
    "%" ++ hex_octet(C).

hex_octet(N) when N =< 9 ->
    [$0 + N];
hex_octet(N) when N > 15 ->
    hex_octet(N bsr 4) ++ hex_octet(N band 15);
hex_octet(N) ->
    [N - 10 + $a].

Note that, because of the use of unicode:characters_to_binary it'll only work in R13 or newer.

Example usage is:

9> httpc:request("http://httpbin.org/get?q=" ++ mylib_app:escape_uri("☺")).
{ok,{{"HTTP/1.1",200,"OK"},
     [{"connection","keep-alive"},
      {"date","Sat, 09 Nov 2019 21:51:54 GMT"},
      {"server","nginx"},
      {"content-length","178"},
      {"content-type","application/json"},
      {"access-control-allow-credentials","true"},
      {"access-control-allow-origin","*"},
      {"referrer-policy","no-referrer-when-downgrade"},
      {"x-content-type-options","nosniff"},
      {"x-frame-options","DENY"},
      {"x-xss-protection","1; mode=block"}],
     "{\n  \"args\": {\n    \"q\": \"\\u263a\"\n  }, \n  \"headers\": {\n    \"Host\": \"httpbin.org\"\n  }, \n  \"origin\": \"11.111.111.111, 11.111.111.111\", \n  \"url\": \"https://httpbin.org/get?q=\\u263a\"\n}\n"}}

We send out a request with escaped query parameter and see that we get back the correct Unicode codepoint.



回答8:

AFAIK there's no URL encoder in the standard libraries. Think I 'borrowed' the following code from YAWS or maybe one of the other Erlang web servers:

% Utility function to convert a 'form' of name-value pairs into a URL encoded
% content string.

urlencode(Form) ->
    RevPairs = lists:foldl(fun({K,V},Acc) -> [[quote_plus(K),$=,quote_plus(V)] | Acc] end, [],Form),
    lists:flatten(revjoin(RevPairs,$&,[])).

quote_plus(Atom) when is_atom(Atom) ->
    quote_plus(atom_to_list(Atom));

quote_plus(Int) when is_integer(Int) ->
    quote_plus(integer_to_list(Int));

quote_plus(String) ->
    quote_plus(String, []).

quote_plus([], Acc) ->
    lists:reverse(Acc);

quote_plus([C | Rest], Acc) when ?QS_SAFE(C) ->
    quote_plus(Rest, [C | Acc]);

quote_plus([$\s | Rest], Acc) ->
    quote_plus(Rest, [$+ | Acc]);

quote_plus([C | Rest], Acc) ->
    <<Hi:4, Lo:4>> = <<C>>,
    quote_plus(Rest, [hexdigit(Lo), hexdigit(Hi), ?PERCENT | Acc]).

revjoin([], _Separator, Acc) ->
    Acc;

revjoin([S | Rest],Separator,[]) ->
    revjoin(Rest,Separator,[S]);

revjoin([S | Rest],Separator,Acc) ->
    revjoin(Rest,Separator,[S,Separator | Acc]).

hexdigit(C) when C < 10 -> $0 + C;
hexdigit(C) when C < 16 -> $A + (C - 10).