How to fix this regex so it replaces * properly (b

Posted 2019-02-25 17:40

Question:

I'm practicing regex. I thought of creating a regex that turns * into <em>, just like Markdown:

el = el.replace(/\*\b/g, '<em>')
el = el.replace(/\b\*|(\.|\,|\?|\!|\*|---|\.\.\.\s)\*/g, '$1</em>')

This works in most cases. However, things get messy if you apply that regex to this:

Chicken teriy*ai*ki, r*ai*men noodles, spaghetti a la moneg*ai*sque.

It produces this:

Chicken teriy<em>ai<em>ki, r<em>ai<em>men noodles, spaghetti a la moneg<em>ai<em>sque. And wait for me, often falling asleep.</em></em></em></em></em></em>

How can I modify this regex so that it produces something like this:

 Chicken teriy<em>ai</em>ki, r<em>ai</em>men noodles, spaghetti a la moneg<em>ai</em>sque. And wait for me, often falling asleep.
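
The stray opening tags can be reproduced by running the two passes from the question on the sample string. This is only a sketch: the variable names are illustrative, and the run of closing </em> tags shown in the output above is presumably added by the browser when the string is assigned as HTML, which is an assumption and is not reproduced here.

```javascript
// Sample string from the question.
const sample = "Chicken teriy*ai*ki, r*ai*men noodles, spaghetti a la moneg*ai*sque.";

// Pass 1: every * followed by a word character becomes <em>.
// In "teriy*ai*ki" both asterisks are followed by a letter, so both match.
const afterOpen = sample.replace(/\*\b/g, '<em>');

// Pass 2: no asterisks are left, so nothing is turned into </em>.
const afterClose = afterOpen.replace(/\b\*|(\.|\,|\?|\!|\*|---|\.\.\.\s)\*/g, '$1</em>');

console.log(afterOpen);
console.log(afterClose === afterOpen); // the second pass changes nothing
```

Because \b only checks that a word character sits next to the *, it cannot tell an opening asterisk from a closing one when both are inside a word.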

Answer 1:

You can merge the two branches in your second regex, since both end with the \* pattern: (\b|\.|,|\?|!|\*|---|\.{3}\s)\*. You can also merge the single-character alternatives \.|,|\?|!|\* into [.,?!*]. Then match the opening and closing * in a single pattern:

var s = "Chicken teriy*ai*ki, r*ai*men noodles, spaghetti a la moneg*ai*sque.";
console.log(
  s.replace(/\*\b([^]*?)(\b|[.,?!*]|---|\.{3}\s)\*/g, '<em>$1$2</em>') 
)

Details

  • \*\b - a * that is followed by a word char (letter, digit or _)
  • ([^]*?) - Group 1: any 0+ chars, as few as possible ([^] is JavaScript-specific; replace it with [\s\S], [\d\D] or [\w\W] if you need more portability), up to the leftmost occurrence of
  • (\b|[.,?!*]|---|\.{3}\s) - Group 2: a word boundary, one of the characters . , ? ! *, the string ---, or ... followed by a whitespace char
  • \* - a * char.
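
As a rough sketch of the breakdown above, the same pattern can be wrapped in a small helper using the more portable [\s\S] in place of [^]. The function name emphasize and the second test sentence are hypothetical and only serve as illustration.

```javascript
// Hypothetical helper: same pattern as above, with [\s\S] instead of [^].
function emphasize(text) {
  return text.replace(/\*\b([\s\S]*?)(\b|[.,?!*]|---|\.{3}\s)\*/g, '<em>$1$2</em>');
}

const input = "Chicken teriy*ai*ki, r*ai*men noodles, spaghetti a la moneg*ai*sque.";
console.log(emphasize(input));

// Punctuation right before the closing * is captured by Group 2
// and kept inside the tag through $2.
console.log(emphasize("Is it *really?* Yes."));
```

Matching the opening and closing * in one pattern is what keeps the pairs together; the two-pass version has no way to know which asterisk it is looking at.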


Answer 2:

This should work. It wraps the word characters between two * signs in em tags. Note that the g flag applies the replacement to every match in the string.

 const str = "something that has words surrounded with * signs"
 // replace() returns a new string, so keep the result
 const result = str.replace(/\*(\w+)\*/g, "<em>$1</em>")
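
A short sketch of how this pattern behaves, using made-up inputs: since \w+ only matches letters, digits and _, a single word between asterisks is wrapped, but a phrase containing a space is left alone.

```javascript
// Pattern from this answer; the variable name is illustrative.
const wordOnly = /\*(\w+)\*/g;

// A single run of word characters between asterisks is wrapped.
const single = "teriy*ai*ki".replace(wordOnly, "<em>$1</em>");

// A phrase with a space is left untouched, because \w does not match spaces.
const phrase = "some *two words* here".replace(wordOnly, "<em>$1</em>");

console.log(single);
console.log(phrase);
```

If emphasis should span several words, the character class has to be widened.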


Answer 3:

Use the regex \*([\w ^?.]*?)\*

Replace with <em>$1</em>

el = el.replace(/\*([\w ^?.]*?)\*/g, '<em>$1</em>')
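
A minimal sketch of this pattern on a few made-up inputs, assuming the replacement is meant to be <em>$1</em> with a forward slash in the closing tag. The class [\w ^?.] accepts word characters, spaces, ^, ? and ., so a comma between the asterisks prevents a match.

```javascript
// Pattern from this answer; the helper name is hypothetical.
function wrapEm(text) {
  return text.replace(/\*([\w ^?.]*?)\*/g, '<em>$1</em>');
}

const inWord = wrapEm("teriy*ai*ki");      // asterisks inside a word
const withSpace = wrapEm("*two words*");   // spaces are allowed by the class
const withComma = wrapEm("*a, b*");        // a comma is not in the class

console.log(inWord);
console.log(withSpace);
console.log(withComma);
```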
