Regex to find and replace emoji names within colons

Posted 2019-06-24 03:36

Question:

I'm trying to write a regex (for JavaScript's regex engine) that I can use to find and replace emoji names within colons in text, like in Slack or Discord, where you type :smiley-face: and it gets replaced when you submit the chat. I'm targeting text nodes only, so I don't need to worry about other HTML inside the text.

Is it possible to write a regex that could match all of the following rules? (text highlighted with monospace blocks = regex positive matches)

:any-non-whitespace:
:text1:sample2:
:@(1@#$@SD: :s:
:nospace::inbetween: because there are 2 colons in the middle
:nospace:middle:nospace:

I'm starting with something like this, but it's incomplete:

/:(?!:)\S+:/gim

I'm trying to think of all the special cases that might possibly occur doing this. Maybe I'm overthinking it.

There are a lot of Twitch emotes involved, so I can't use emoji Unicode characters. The regex will find matches and replace them with tags.

Answer 1:

I suggest using

:[^:\s]*(?:::[^:\s]*)*:

See the regex demo. It is the same pattern as :(?:[^:\s]|::)*:, but a bit more efficient because the (?:...|...)* part is unrolled.

Details

  • : - a colon
  • [^:\s]* - 0+ chars other than : and whitespace
  • (?: - start of a quantified non-capturing group:
    • :: - double colon
    • [^:\s]* - 0+ chars other than : and whitespace
  • )* - end of grouping, repeated 0 or more times (due to the * quantifier)
  • : - a colon.
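As a rough sketch of how the pattern might be plugged into the replacement step, the snippet below looks each matched name up in a table and leaves unknown names as typed. The emotes table, its placeholder replacement strings, and the replaceEmoji helper are all made up for illustration; a real page would build whatever tag or node it needs instead.

```javascript
// Pattern from this answer; the g flag makes replace() visit every match.
const EMOJI_NAME = /:[^:\s]*(?:::[^:\s]*)*:/g;

// Hypothetical lookup table: name without the outer colons -> replacement text.
const emotes = new Map([
  ["smiley-face", "[emote smiley-face]"],
  ["nospace::inbetween", "[emote nospace::inbetween]"],
]);

// Hypothetical helper: replaces known names and leaves unknown ones untouched.
function replaceEmoji(text) {
  return text.replace(EMOJI_NAME, (match) => {
    const name = match.slice(1, -1); // strip the leading and trailing colon
    return emotes.has(name) ? emotes.get(name) : match;
  });
}

console.log(replaceEmoji("hi :smiley-face: and :unknown:"));
// prints "hi [emote smiley-face] and :unknown:"
```

Because both quantifiers are *, a bare :: also matches (with an empty name). In this sketch such a match is simply left as typed, since the table has no empty key.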


Answer 2:

Do you want something like this regex?

(:(?![\n])[()#$@-\w]+:)

Demo. You can additionally insert disallowed characters into the character class of the (?![\n]) lookahead, and additional allowed characters into the character class [()#$@-\w].



Answer 3:

My first thought was

:(::|[^:\n])+:

It matches a string, at least one character long, including surrounding colons, that consists of either

  • two colons (::), or
  • a character that is neither a colon nor a line feed.

But that's basically what Wiktor had as a (slower) alternative (comments). But I'll leave it here anyway since it's working, as opposed to the other submitted answers ;)

See it here at regex101.
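For a quick side-by-side, here is a small sketch that runs this pattern and Wiktor's unrolled pattern (from the first answer) over a few inputs. The matches helper and the sample strings are only illustrative.

```javascript
const thisAnswer = /:(::|[^:\n])+:/g;
const unrolled = /:[^:\s]*(?:::[^:\s]*)*:/g; // Wiktor's pattern from the first answer

// Hypothetical helper: all matches of a global regex, or [] when there are none.
const matches = (re, text) => text.match(re) || [];

const samples = [":nospace::inbetween:", ":text1:sample2:", "a: b :c:", "::"];
for (const sample of samples) {
  console.log(JSON.stringify(sample), matches(thisAnswer, sample), matches(unrolled, sample));
}
```

The two patterns agree on the first two samples. They differ on "a: b :c:", where this pattern excludes only line feeds and so matches ": b :" with the spaces inside, and on a bare "::", which this pattern rejects because it requires at least one item between the outer colons.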



Answer 4:

Try this regex:

/(^|\s)+:([^\s\n\r])+:|^:[^\s\n\r]+/g