I'm practicing regex. I thought of creating a regex that turns * into <em>, just like with Markdown:
el = el.replace(/\*\b/g, '<em>')
el = el.replace(/\b\*|(\.|\,|\?|\!|\*|---|\.\.\.\s)\*/g, '$1</em>')
This works in most cases. However, things get messy if you apply that regex to this:
Chicken teriy*ai*ki, r*ai*men noodles, spaghetti a la moneg*ai*sque.
It produces this:
Chicken teriy<em>ai<em>ki, r<em>ai<em>men noodles, spaghetti a la moneg<em>ai<em>sque. And wait for me, often falling asleep.</em></em></em></em></em></em>
How can I modify this regex so that it produces something like this:
Chicken teriy<em>ai</em>ki, r<em>ai</em>men noodles, spaghetti a la moneg<em>ai</em>sque. And wait for me, often falling asleep.
You can merge the two branches in your second regex, since both end with the \* pattern, like (\b|\.|,|\?|!|\*|---|\.{3}\s)\* (you may even merge the single-char alternatives \.|,|\?|!|\* into [.,?!*]), and then use
var s = "Chicken teriy*ai*ki, r*ai*men noodles, spaghetti a la moneg*ai*sque.";
console.log(
s.replace(/\*\b([^]*?)(\b|[.,?!*]|---|\.{3}\s)\*/g, '<em>$1$2</em>')
)
Details
\*\b - a * that is followed by a word char (letter, digit or _)
([^]*?) - Group 1: any 0+ chars, as few as possible ([^] may be replaced with [\s\S] / [\d\D] / [\w\W] if you need more portability), up to the leftmost occurrence of
(\b|[.,?!*]|---|\.{3}\s) - Group 2: a word boundary, ., ,, ?, !, *, ---, or ... followed by a whitespace
\* - a * char.
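As a quick check of the breakdown above, here is a minimal sketch that wraps the same pattern in a function, using [\s\S] in place of [^] for portability. The function name emphasize is only an illustrative choice and the second sample string is made up for the test.

```javascript
// Hypothetical helper: the pattern from this answer, with [^] swapped for [\s\S].
function emphasize(text) {
  // Group 1: the emphasized text (lazy). Group 2: optional trailing punctuation
  // (or an empty word-boundary match) that must sit right before the closing *.
  return text.replace(/\*\b([\s\S]*?)(\b|[.,?!*]|---|\.{3}\s)\*/g, '<em>$1$2</em>');
}

// The string from the question: asterisks inside words are paired correctly.
console.log(emphasize("Chicken teriy*ai*ki, r*ai*men noodles, spaghetti a la moneg*ai*sque."));
// Chicken teriy<em>ai</em>ki, r<em>ai</em>men noodles, spaghetti a la moneg<em>ai</em>sque.

// Punctuation right before the closing * is captured by Group 2 and kept inside the tag.
console.log(emphasize("*Hello!* world"));
// <em>Hello!</em> world
```

Doing the opening and closing * in a single replace call is what avoids the original problem: with two separate passes, \*\b also matches the closing * whenever a word char follows it.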
This should work: it wraps the characters between * signs in em tags. Note that the g flag applies the replacement globally, to every match in the string provided.
const str = "something that has words surrounded with * signs"
// replace returns a new string, so use or log the result
console.log(str.replace(/\*(\w+)\*/g, "<em>$1</em>"))
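To see where this simpler pattern applies, here is a small sketch. The function name emphasizeWords and the second sample string are made up for illustration. Since \w+ only matches letters, digits and _, a span containing a space is left untouched.

```javascript
// Hypothetical helper around the \w+ pattern from this answer.
function emphasizeWords(text) {
  // Only a run of word chars directly between two * signs is wrapped.
  return text.replace(/\*(\w+)\*/g, "<em>$1</em>");
}

// Works for the intra-word case from the question.
console.log(emphasizeWords("Chicken teriy*ai*ki"));
// Chicken teriy<em>ai</em>ki

// A phrase with a space does not match \w+, so the string is returned unchanged.
console.log(emphasizeWords("*two words*"));
// *two words*
```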
Use the regex \*([\w ^?.]*?)\*
Replace with <em>$1</em>
el = el.replace(/\*([\w ^?.]*?)\*/g, '<em>$1</em>')
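A short sketch of how this character-class version behaves. The function name emphasizePhrases and the sample strings other than the one from the question are made up for illustration. Note that the closing tag has to be written </em>: in a JavaScript string literal \e is just e, so '<\em>' would insert a second opening tag.

```javascript
// Hypothetical helper around the character-class pattern from this answer.
function emphasizePhrases(text) {
  // The class allows word chars, spaces, ^, ? and . between the two * signs.
  return text.replace(/\*([\w ^?.]*?)\*/g, '<em>$1</em>');
}

console.log(emphasizePhrases("r*ai*men"));
// r<em>ai</em>men

console.log(emphasizePhrases("*two words.*"));
// <em>two words.</em>

// A comma is not in the class, so this span is left unchanged.
console.log(emphasizePhrases("*a, b*"));
// *a, b*

// Why the backslash form is wrong as a closing tag:
console.log('<\em>' === '<em>');
// true
```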