How to “exclude” a blank space from RegExp grouping

Posted 2019-12-16 19:56

Question:

Context

I am a front-end developer working with the ExtJS framework, and to speed things up I have created a lot of Sublime Text 3 snippets.

The models for a task come from the C# back end and contain the type and the name of each variable.

Then I had the idea of simply copying the model's content and turning that string into a new string that matches the ExtJS model pattern.

Inside the snippet I am not using any programming language (because that's not possible); I am producing the output string with a regex-only solution, which is the only thing I can do given Sublime Text's snippet limitations.

Codes

A sample of one line of C# model code is:

  public string Email { get; set; }

All the lines of the model follow that pattern.

At the moment, my Sublime Text 3 Snippet has the code:

<snippet>
    <content><![CDATA[
    { name: '${SELECTION/(        public )|(public )|({ get; set; })|(\w)|( \w+)|( )/(?5$5/\s/)/g}', type: '${SELECTION/(        public )|(public )|({ get; set; })|(\ \$w\ \w.)|( \w+)|( )/(?1)(?2)(?3)(?4\$5\$6\)($5)($6)/g}' },
]]></content>
    <tabTrigger>modelnames</tabTrigger>
</snippet>

PS: ${SELECTION} is a snippet variable that holds the text that was selected when you triggered the snippet. The string that my snippet produces with that sample line selected is:

  { type: 'string', name: ' Email' },

Problem

As you can see above, the result is almost perfect, but my problem is the blank space before Email.

I have tried millions of different combinations, but I am a beginner at regex and can't solve this.

I think that creating a group that matches a space and excluding it from the main target group would solve the problem, but I don't know exactly how to do that. The truth is that I built this regex by trial and error, because I am a beginner at regex.

I'm asking for help removing just that space; it's probably a simple task for a RegExp expert.

Answer 1:

You could use:

<snippet>
    <content><![CDATA[
    { ${SELECTION/\s*\bpublic\s+([\w<>]+)\s+(\w+).*/'name': '$2', 'type': '$1'/} },
]]></content>
    <tabTrigger>modelnames</tabTrigger>
</snippet>

which would output

    { 'name': 'Email', 'type': 'string' },

for your given example

        public string Email { get; set; }

How it works:

  • \s*\bpublic - match any number of whitespace characters, followed by a word boundary, followed by public
  • \s+ - match at least one whitespace character
  • ([\w<>]+) - match at least one word character or angle bracket (to support generic types, in case it's useful) and store the result into capture group 1
  • \s+ - match at least one whitespace character
  • (\w+) - match at least one word character (the identifier) and store the result into capture group 2
  • .* - match the rest of the selection
  • / - begin the replacement
  • 'name': '$2', 'type': '$1' - the replacement text, where $1 is filled from capture group 1 and $2 from capture group 2
  • / - end the replacement

I didn't include the global flag because this regex only needs to match once per line/selection.
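If you want to check the pattern outside Sublime Text, here is a minimal Python sketch of the same substitution. It assumes Python's re module, which accepts this particular pattern unchanged but is not the Boost engine Sublime uses, so the replacement is written with \1 and \2 instead of $1 and $2. The names PROPERTY and to_field are made up for the illustration.

```python
import re

# Same pattern as in the snippet above.
PROPERTY = re.compile(r"\s*\bpublic\s+([\w<>]+)\s+(\w+).*")


def to_field(line: str) -> str:
    """Turn one C# auto-property line into an ExtJS-style field entry."""
    # \2 is the identifier, \1 is the C# type.
    # count=1 mirrors the snippet having no global flag.
    body = PROPERTY.sub(r"'name': '\2', 'type': '\1'", line, count=1)
    return "{ " + body + " },"


print(to_field("        public string Email { get; set; }"))
# prints: { 'name': 'Email', 'type': 'string' },
```

Because \s* swallows the leading indentation and the space between the type and the identifier is matched by \s+ outside the capture group, no stray blank ends up in front of Email.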

Actually, we could make it replace all such lines in your selection:

<snippet>
    <content><![CDATA[
${SELECTION/\s*\bpublic\s+([\w<>]+)\s+(\w+).*?(\n|$)/    \{ 'name': '$2', 'type': '$1' \},\n/g}
]]></content>
    <tabTrigger>modelnames</tabTrigger>
</snippet>

Would convert

        public string Email { get; set; }
        public string Name { get; set; }

to

    { 'name': 'Email', 'type': 'string' },
    { 'name': 'Name', 'type': 'string' },
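The multi-line version can be sketched the same way in Python, again only as a rough equivalent under the assumption that Python's re module stands in for Sublime's Boost engine. re.sub replaces every match by default, which plays the role of the g flag; PROPERTY_LINE and to_fields are hypothetical names.

```python
import re

# Lazy .*? stops at the first newline or at the end of the selection.
PROPERTY_LINE = re.compile(r"\s*\bpublic\s+([\w<>]+)\s+(\w+).*?(\n|$)")


def to_fields(selection: str) -> str:
    """Convert every C# property line in the selection to a field entry."""
    # In a Python replacement template the braces need no escaping,
    # and \n is expanded to a real newline.
    return PROPERTY_LINE.sub(r"    { 'name': '\2', 'type': '\1' },\n", selection)


selection = (
    "        public string Email { get; set; }\n"
    "        public string Name { get; set; }"
)
print(to_fields(selection), end="")
```

Each matched line is rewritten on its own, so the output has one field entry per property, each terminated by a newline.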

EDIT: based on feedback in the comments about mapping types, here is the new snippet content:

<snippet>
    <content><![CDATA[
${SELECTION/\s*\bpublic\s+(?:(DateTime)|(bool)|(decimal)|([\w<>]+))\s+(?<name>\w+).*?(\n|$)/    \{ name: '$+{name}', type: '(?1date)(?2boolean)(?3float)(?4$4)' \},\n/g}
]]></content>
    <tabTrigger>modelnames</tabTrigger>
</snippet>

The string replacement format is documented at http://www.boost.org/doc/libs/1_51_0/libs/regex/doc/html/boost_regex/format/boost_format_syntax.html.

Here, I'm using a named capture group, called name, for simplicity.
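To test the type mapping outside Sublime, a rough Python equivalent is sketched below. Python's re module has no conditional replacement syntax like (?1date), so this sketch swaps in a replacement function with a dictionary lookup instead; that is a different technique from the snippet, though it produces the same mapping. TYPE_MAP, to_field and to_fields are names invented for the example.

```python
import re

# Named groups for the C# type and the identifier.
PROPERTY_LINE = re.compile(
    r"\s*\bpublic\s+(?P<type>[\w<>]+)\s+(?P<name>\w+).*?(?:\n|$)"
)

# C# type -> ExtJS field type, as in the snippet; other types pass through.
TYPE_MAP = {"DateTime": "date", "bool": "boolean", "decimal": "float"}


def to_field(match: re.Match) -> str:
    """Build one field entry from a matched property line."""
    cs_type = match.group("type")
    ext_type = TYPE_MAP.get(cs_type, cs_type)
    return f"    {{ name: '{match.group('name')}', type: '{ext_type}' }},\n"


def to_fields(selection: str) -> str:
    """Convert every C# property line in the selection."""
    return PROPERTY_LINE.sub(to_field, selection)


print(to_fields("        public DateTime Created { get; set; }"), end="")
# prints:     { name: 'Created', type: 'date' },
```

Adding another mapping is then a one-line change to TYPE_MAP, whereas in the snippet it means adding both a new alternative to the pattern and a new conditional to the replacement.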