I'm trying to write a parser in Erlang that recognizes the data types of the terms inside a string. After searching, I couldn't find an existing question that matches my problem:
Original string:
atom1,"string2,,\"\",",{tuple3, "s pa ces \"", {[test]},"_#",test},<<"binary4\",,>>">>, #{map5=>5, element=>{e1,e2}}, #record6{r1 = 1, r2 = 2} , <<300:16>>
String that needs to be parsed (the original string written as an Erlang string literal, with quotes and backslashes escaped):
"atom1,\"string2,,\\\"\\\",\",{tuple3, \"s pa ces \\\"\", {[test]},\"_#\",test},<<\"binary4\\\",,>>\">>, #{map5=>5, element=>{e1,e2}}, #record6{r1 = 1, r2 = 2} , <<300:16>>"
Expected output:
+ number of params: 7
+ value ------> type
- atom1 ------> Atom
- "string2,,\"\"," ------> String
- {tuple3, "s pa ces \"", {[test]},"_#",test} ------> Tuple
- <<"binary4\",,>>">> ------> Binary
- #{map5=>5, element=>{e1,e2}} ------> Map
- #record6{r1 = 1, r2 = 2} ------> Record
- <<300:16>> ------> Binary
But my current code doesn't work as expected. Here it is:
comma_parser(Params) ->
    {ok, R} = re:compile("(\".*?\"|[^\",\\s]+)(?=\\s*,|\\s*$)"),
    {match, Matches} = re:run(Params, R, [{capture, [1], list}, global]),
    ?DEBUG("truonggv1 - comma_parser: Matches: ~p~n", [Matches]),
    [M || [M] <- Matches].
Current output:
+ number of params: 14
+ value ------> type
- atom1 ------> Atom
- "string2,,\"\" ------> String
- ",{tuple3, "s pa ces \"" ------> String
- {[test]} ------> Tuple
- "_#" ------> String
- test} ------> Atom
- "binary4\" ------> String
- >> ------> Atom
- #{map5=>5 ------> Map
- element=>{e1 ------> Atom
- e2}} ------> Atom
- 1 ------> Atom
- 2} ------> Atom
- <<300:16>> ------> Binary
Does anyone know how to fix this?
Update: here is my full code, where Params is the escaped string shown above:
check_params_by_comma(Params) ->
    case string:str(Params, ",") of
        0 ->
            Result = Params;
        1 ->
            Result = "param starts with character ',' ~n";
        _Comma_Pos ->
            Parse_String = comma_parser(Params),
            Result = "number of params: " ++ integer_to_list(length(Parse_String))
                ++ "\n\n\r\t value ------> type \n\r"
                ++ "\t*********************\n\r"
                ++ ["\t" ++ X ++ " ------> " ++ check_type(X) ++ "\n\r" || X <- Parse_String]
    end,
    Result.

check_type(X) ->
    Binary = string:str(X, "<<"),
    String = string:str(X, "\""),
    Tuple = string:str(X, "{"),
    List = string:str(X, "["),
    Map = string:str(X, "#{"),
    case X of
        _ when 1 == Binary -> "Binary";
        _ when 1 == String -> "String";
        _ when 1 == Tuple -> "Tuple";
        _ when 1 == List -> "List";
        _ when 1 == Map -> "Map";
        _ -> "Atom"
    end.

comma_parser(Params) ->
    {ok, R} = re:compile("(\".*?\"|[^\",\\s]+)(?=\\s*,|\\s*$)"),
    {match, Matches} = re:run(Params, R, [{capture, [1], list}, global]),
    [M || [M] <- Matches].
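
For comparison, the current output suggests that the regular expression has no notion of nesting: it splits on commas inside tuples, maps, records and binaries. One possible alternative is a small hand-written scanner that tracks bracket depth and skips over quoted strings, splitting only on commas at depth 0. The following is a minimal sketch of that idea, not a definitive solution. The module name param_split and the functions split/1 and classify/1 are hypothetical, it assumes OTP 20 or later (for string:trim/1), and it assumes the input brackets are balanced.

```erlang
-module(param_split).
-export([split/1, classify/1]).

%% Split a string on top-level commas only. Commas inside double-quoted
%% strings or inside (), [], {} and <<>> stay part of the current token.
split(Str) ->
    split(Str, 0, [], []).

%% split(Rest, Depth, CurrentTokenReversed, TokensReversed)
split([], _Depth, Cur, Acc) ->
    lists:reverse(add_token(Cur, Acc));
split([$" | Rest], Depth, Cur, Acc) ->
    %% Copy the whole quoted string, so its contents are never inspected.
    {Cur1, Rest1} = take_string(Rest, [$" | Cur]),
    split(Rest1, Depth, Cur1, Acc);
split([$<, $< | Rest], Depth, Cur, Acc) ->
    split(Rest, Depth + 1, [$<, $< | Cur], Acc);
split([$>, $> | Rest], Depth, Cur, Acc) when Depth > 0 ->
    split(Rest, Depth - 1, [$>, $> | Cur], Acc);
split([C | Rest], Depth, Cur, Acc)
  when C =:= ${ orelse C =:= $[ orelse C =:= $( ->
    split(Rest, Depth + 1, [C | Cur], Acc);
split([C | Rest], Depth, Cur, Acc)
  when Depth > 0, (C =:= $} orelse C =:= $] orelse C =:= $)) ->
    split(Rest, Depth - 1, [C | Cur], Acc);
split([$, | Rest], 0, Cur, Acc) ->
    %% A comma at depth 0 ends the current token.
    split(Rest, 0, [], add_token(Cur, Acc));
split([C | Rest], Depth, Cur, Acc) ->
    split(Rest, Depth, [C | Cur], Acc).

%% Copy characters up to and including the closing quote, honouring
%% backslash escapes. Acc is the reversed token built so far.
take_string([$\\, C | Rest], Acc) ->
    take_string(Rest, [C, $\\ | Acc]);
take_string([$" | Rest], Acc) ->
    {[$" | Acc], Rest};
take_string([C | Rest], Acc) ->
    take_string(Rest, [C | Acc]);
take_string([], Acc) ->
    {Acc, []}.  % unterminated string: keep what was read

%% Trim surrounding whitespace; empty tokens are dropped (a design choice).
add_token(CurRev, Acc) ->
    case string:trim(lists:reverse(CurRev)) of
        [] -> Acc;
        Token -> [Token | Acc]
    end.

%% Guess the type from the leading characters. "#{" must be tested
%% before "#" so that maps are not reported as records.
classify("<<" ++ _) -> "Binary";
classify("\"" ++ _) -> "String";
classify("#{" ++ _) -> "Map";
classify("#" ++ _)  -> "Record";
classify("{" ++ _)  -> "Tuple";
classify("[" ++ _)  -> "List";
classify(_)         -> "Atom".
```

With this sketch, [{T, param_split:classify(T)} || T <- param_split:split(Params)] would pair each top-level term with a type name. Like check_type/1 above, classify/1 reports anything it does not recognize (including integers) as "Atom".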