Matching brackets in a string

Posted 2019-01-11 07:42

What is the most efficient or elegant method for matching brackets in a string such as:

"f @ g[h[[i[[j[2], k[[1, m[[1, n[2]]]]]]]]]] // z"

for the purpose of identifying and replacing [[ Part ]] brackets with the single character forms?

I want to get:

(image: the expression above with each [[ ]] pair shown as 〚 〛)

Everything else, such as the prefix @ and postfix // forms, should remain intact.


An explanation of Mathematica syntax for those unfamiliar:

Functions use single square brackets for arguments: func[1, 2, 3]

Part indexing is done with double square brackets: list[[6]] or with single-character Unicode double brackets: list〚6〛

My intent is to identify the matching [[ ]] form in a string of ASCII text, and replace it with the Unicode characters 〚 〛

9 answers
男人必须洒脱
#2 · 2019-01-11 08:06

Ok, here is another answer, a bit shorter:

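(* Three passes over the character list:
   1. tag each bracket with its nesting depth,
   2. collapse adjacent open/close pairs at depths m and m + 1 into openSym/closeSym,
   3. strip the depth tags from the remaining single brackets. *)
Clear[replaceDoubleBrackets];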
replaceDoubleBrackets[str_String, openSym_String, closeSym_String] := 
Module[{n = 0},
  Apply[StringJoin, 
   Characters[str] /. {"[" :> {"[", ++n}, 
     "]" :> {"]", n--}} //. {left___, {"[", m_}, {"[", mp1_}, 
      middle___, {"]", mp1_}, {"]", m_}, right___} /; 
       mp1 == m + 1 :> {left, openSym, middle, 
        closeSym, right} /. {br : "[" | "]", _Integer} :> br]]

Example:

In[100]:= replaceDoubleBrackets["f[g[h[[i[[j[2], k[[1, m[[1, n[2]]]]]]]]]]]", "(", ")"]

Out[100]= "f[g[h(i(j[2], k(1, m(1, n[2]))))]]"
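
To see why this works, it can help to look at the intermediate list produced by the first replacement, where every bracket is paired with its nesting depth. The sketch below isolates that step. The name tagDepths is made up for illustration and is not part of the code above.

```mathematica
(* tagDepths is a hypothetical helper that isolates the first /. step:
   each "[" becomes {"[", depth} and each "]" becomes {"]", depth}. *)
tagDepths[str_String] := Module[{n = 0},
  Characters[str] /. {"[" :> {"[", ++n}, "]" :> {"]", n--}}]

tagDepths["a[[b[1]]]"] // InputForm
(* {"a", {"[", 1}, {"[", 2}, "b", {"[", 3}, "1", {"]", 3}, {"]", 2}, {"]", 1}} *)
```

The //. rule then looks for two adjacent opening brackets at depths m and m + 1 together with two adjacent closing brackets at depths m + 1 and m. Here that is the pair at depths 1 and 2, so a[[...]] is rewritten, while b[1] (depth 3 on its own) stays a function call.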

EDIT

You can also use Mathematica's built-in facilities, if you want to replace double brackets specifically with the symbols you indicated:

Clear[replaceDoubleBracketsAlt];
replaceDoubleBracketsAlt[str_String] :=
  StringJoin @@ Cases[ToBoxes@ToExpression[str, InputForm, HoldForm],
     _String, Infinity]

In[117]:= replaceDoubleBracketsAlt["f[g[h[[i[[j[2], k[[1, m[[1, n[2]]]]]]]]]]]"]

Out[117]= f[g[h[[i[[j[2],k[[1,m[[1,n[2]]]]]]]]]]]

The result would not show here properly, but it is a Unicode string with the symbols you requested.
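
One caveat is worth checking before relying on this variant: because the string is parsed into an expression and then typeset again, the original surface syntax is not preserved. The following small check is not part of the answer. It suggests that the prefix @ and postfix // forms from the question are normalized away by the parser.

```mathematica
(* Parsing keeps only the expression tree, so the prefix and postfix
   notation is indistinguishable from plain bracket notation afterwards. *)
ToExpression["f @ g[x] // z", InputForm, HoldForm] === HoldForm[z[f[g[x]]]]
(* True *)

(* The string tokens inside the boxes are what replaceDoubleBracketsAlt joins;
   for a Part expression they should include the double-bracket characters. *)
Cases[ToBoxes[HoldForm[h[[1]]]], _String, Infinity]
```

So this variant seems best suited to strings written purely in bracket form. For input such as "f @ g[...] // z", the string-based replaceDoubleBrackets leaves the notation as typed.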

地球回转人心会变
#3 · 2019-01-11 08:12

Here is my attempt. The pasted ASCII code is pretty unreadable due to the presence of special characters so I first provide a picture of how it looks in MMA.

Basically what it does is this: Opening brackets are always uniquely identifiable as single or double. The problem lies in the closing brackets. Opening brackets always have the pattern string-of-characters-containing-no-brackets + [ or [[. It is impossible to have either a [ following a [[ or vice versa without other characters in-between (at least, not in error-free code).

So, we use this as a hook and start looking for certain pairs of matching brackets, namely the ones that don't have any other brackets in between. Since we know the type, either "[...]" or "[[...]]", we can replace the latter with the double-bracket symbols and the former with unused characters (I use smileys). This is done so that they no longer play a role in the next iteration of the pattern-matching process.

We repeat until all brackets are processed and finally the smileys are converted to single brackets again.

You see, the explanation takes more characters than the code does ;-).
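
The hook can be seen in isolation with a one-line pattern: a bracket pair whose contents contain no other bracket is necessarily an innermost pair. This is only an illustration of the idea, not the code of this answer.

```mathematica
(* Find bracket pairs that enclose no other bracket, i.e. the innermost ones. *)
StringCases["m[[1, n[2]]]", "[" ~~ (Except["[" | "]"] ..) ~~ "]"]
(* {"[2]"} *)
```

Once that pair has been rewritten with placeholder characters, the enclosing [[ ]] no longer contains any brackets and becomes the next innermost pair.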

(image: the code below as it appears in a Mathematica notebook)

ASCII:

s = "f @ g[hh[[i[[jj[2], k[[1, m[[1, n[2]]]]]]]]]] // z";

myRep[s_String] :=
 StringReplace[s,
  {
   Longest[y : Except["[" | "]"] ..] ~~ "[" ~~ 
     Longest[x : Except["[" | "]"] ..] ~~ "]" :> 
    y <> "\[HappySmiley]" <> x <> "\[SadSmiley]",
   Longest[y : Except["[" | "]"] ..] ~~ "[" ~~ Whitespace ... ~~ "[" ~~
      Longest[x : Except["[" | "]"] ..] ~~ "]" ~~ Whitespace ... ~~ 
     "]" :> y <> "\[LeftDoubleBracket]" <> x <> "\[RightDoubleBracket]"
   }
  ]

StringReplace[FixedPoint[myRep, s], {"\[HappySmiley]" -> "[","\[SadSmiley]" -> "]"}]

Oh, and the Whitespace part is there because in Mathematica double brackets need not be next to each other: a[ [1] ] is just as legal as a[[1]].
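
A quick way to check that claim is to parse both spellings without evaluating them and compare the held results. This is a small sketch rather than part of the answer:

```mathematica
(* Both spellings should parse to the same held Part expression. *)
ToExpression["a[ [1] ]", InputForm, Hold] ===
 ToExpression["a[[1]]", InputForm, Hold]
(* True *)
```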

一纸荒年 Trace。
#4 · 2019-01-11 08:12

Here's another one with pattern matching, probably similar to what Sjoerd C. de Vries does, but this one operates on a nested-list structure that is created first, procedurally:

FirstStringPosition[s_String, pat_] :=
    Module[{f = StringPosition[s, pat, 1]},
      If[Length@f > 0, First@First@f, Infinity]
    ];
FirstStringPosition[s_String, ""] = Infinity;

$TokenizeNestedBracePairsBraces = {"[" -> "]", "{" -> "}", "(" -> ")"(*,
  "<"\[Rule]">"*)};
(*nest substrings based on parentheses {([*) (* TODO consider something like http://stackoverflow.com/a/5784082/524504, though non procedural potentially slower*)
TokenizeNestedBracePairs[x_String, closeparen_String] :=
    Module[{opString, cpString, op, cp, result = {}, innerResult,
      rest = x},

      While[rest != "",

        op = FirstStringPosition[rest,
          Keys@$TokenizeNestedBracePairsBraces];
        cp = FirstStringPosition[rest, closeparen];

        Assert[op > 0 && cp > 0];

        Which[
        (*has opening parenthesis*)
          op < cp

          ,(*find next block of [] *)
          result~AppendTo~StringTake[rest, op - 1];
          opString = StringTake[rest, {op}];
          cpString = opString /. $TokenizeNestedBracePairsBraces;
          rest = StringTake[rest, {op + 1, -1}];

          {innerResult, rest} = TokenizeNestedBracePairs[rest, cpString];
          rest = StringDrop[rest, 1];

          result~AppendTo~{opString, innerResult, cpString};

          , cp < Infinity
          ,(*found searched closing parenthesis and no further opening one \
earlier*)
          result~AppendTo~StringTake[rest, cp - 1];
          rest = StringTake[rest, {cp, -1}];
          Return@{result, rest}

          , True
          ,(*done*)
          Return@{result~Append~rest, ""}
        ]
      ]
    ];
(* TODO might want to get rid of empty strings "", { generated here:
TokenizeNestedBracePairs@"f @ g[h[[i[[j[2], k[[1, m[[1, n[2]]]]]]]]]] \
// z"
*)

TokenizeNestedBracePairs[s_String] :=
    First@TokenizeNestedBracePairs[s, ""]

With these definitions in place,

StringJoin @@ 
 Flatten[TokenizeNestedBracePairs@
    "f @ g[h[[i[[j[2], k[[1, m[[1, n[2]]]]]]]]]] // z" //. {"[", {"", \
{"[", Longest[x___], "]"}, ""}, "]"} :> {"\[LeftDoubleBracket]", {x}, 
     "\[RightDoubleBracket]"}]

gives

(image: the resulting string, with the [[ ]] pairs shown as 〚 〛)
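
The rewrite rule can also be tried on its own, without the tokenizer. The nested list below is hand-written in the shape the tokenizer appears to return for "h[[i]] z". It is an assumption for illustration, not captured output, and the names nested and rule are likewise made up.

```mathematica
(* Hand-written stand-in for the tokenizer output of "h[[i]] z" (assumed shape):
   a double bracket shows up as a "[" group whose only content, apart from
   empty strings, is another "[" group. *)
nested = {"h", {"[", {"", {"[", {"i"}, "]"}, ""}, "]"}, " z"};

(* The same rule as above, bound to a name for readability. *)
rule = {"[", {"", {"[", Longest[x___], "]"}, ""}, "]"} :>
   {"\[LeftDoubleBracket]", {x}, "\[RightDoubleBracket]"};

StringJoin @@ Flatten[nested //. rule]
(* "h\[LeftDoubleBracket]i\[RightDoubleBracket] z" *)
```

Because the rule requires empty strings on both sides of the inner group, a single bracket directly containing other text, such as g[h[[1]]], is left alone.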
