I'm terrible with regex, but I've had a try and a Google (and even looked in Reddit's source) and I'm still stuck, so here goes:
My aim is to match the following 'codes' and replace them with the HTML tags. It's just the regex I'm stuck with.
**bold text**
_italic text_
~hyperlink~
Here's my attempt at the bold one:
^\*\*([.^\*]+)\*\*$
Why isn't this working? I'm using the preg syntax.
Here is another regexp:
\*\*((?:[^*]|\*(?!\*))*)\*\*
Example in Perl:
It prints:
Or as html:
Stack Overflow's interpretation is:
before bold and italic *text 2nd line after just italic
tag soup as a result
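To try the pattern \*\*((?:[^*]|\*(?!\*))*)\*\* on its own, here is a minimal sketch. It uses Python's `re` module rather than Perl or PHP's preg functions, on the assumption that the lookahead and character-class syntax behaves the same there. The `<b>` tag, the function name `bold_to_html` and the sample string are illustrative choices, not taken from the answer.

```python
import re

# Body of the bold span: any run of non-asterisks, or a single asterisk
# that is not immediately followed by another asterisk.
BOLD = re.compile(r"\*\*((?:[^*]|\*(?!\*))*)\*\*")


def bold_to_html(text: str) -> str:
    """Replace **...** with <b>...</b> (the <b> tag is an assumed target)."""
    return BOLD.sub(r"<b>\1</b>", text)


# Hypothetical input with a lone asterisk inside the bold span.
print(bold_to_html("before **bold *text** after"))
# prints: before <b>bold *text</b> after
```

The negative lookahead (?!\*) is what lets a lone * stay inside the bold text while still stopping at the closing **.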
use:
explanation:
In a character class "[ ]", "^" is only significant if it's the first character. So (.*) matches anything, while (.[^*]*) matches any one character followed by everything up to the next literal *.
Edit: in response to comments, to match an asterisk within the bold text (i.e. **bold *text**), you'd have to use a non-greedy match. Character classes are more efficient than non-greedy matches, but it's not possible to group within a character class (see "Parentheses and Backreferences...").
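The points above can be checked with a short sketch. It is written for Python's `re` module, assuming the same behaviour as preg for these constructs. The two full bold patterns are assumptions built from the fragments mentioned in this answer, and the variable names are made up for illustration.

```python
import re

# Inside [...], "^" negates only when it comes first; elsewhere it is literal.
literal_class = re.compile(r"[.^*]+")  # runs of ".", "^" or "*"
negated_class = re.compile(r"[^*]+")   # runs of anything except "*"

print(literal_class.findall("a.^*b"))  # ['.^*']
print(negated_class.findall("a.^*b"))  # ['a.^', 'b']

# Assumed full patterns: the (.[^*]*) group and a non-greedy group,
# each wrapped in escaped ** delimiters.
class_bold = re.compile(r"\*\*(.[^*]*)\*\*")
lazy_bold = re.compile(r"\*\*(.*?)\*\*")

text = "**bold *text**"
print(class_bold.search(text))          # None: the inner "*" blocks the match
print(lazy_bold.search(text).group(1))  # bold *text
```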
That will work for the bold text.
Just replace the ** with _ or ~ for the others.
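As a rough sketch of handling all three codes the same way, the snippet below swaps the delimiter and keeps the rest of the pattern. It assumes Python's `re` module, a non-greedy body, and `<b>`, `<i>` and `<a href>` as the target tags; `RULES` and `to_html` are hypothetical names. Unlike *, the characters _ and ~ are not regex metacharacters, so they need no backslash.

```python
import re

# Hypothetical (pattern, replacement) pairs, one per marker.
# The choice of HTML tags is an assumption.
RULES = [
    (re.compile(r"\*\*(.+?)\*\*"), r"<b>\1</b>"),
    (re.compile(r"_(.+?)_"), r"<i>\1</i>"),
    (re.compile(r"~(.+?)~"), r'<a href="\1">\1</a>'),
]


def to_html(text: str) -> str:
    """Apply each rule in turn; no HTML escaping is done in this sketch."""
    for pattern, template in RULES:
        text = pattern.sub(template, text)
    return text


print(to_html("**bold text** _italic text_ ~http://example.com~"))
```

This naive version will misfire on text that contains the delimiters for other reasons, for example a URL with underscores in it.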
First of all, get rid of the ^ and the $. With those anchors, the pattern will only match a string that starts with ** and ends with **. Second, use a non-greedy (lazy) quantifier to match as little text as possible, instead of making a character class for all characters other than asterisks.
Here's what I suggest:
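One pattern that follows this advice (no anchors, a quantifier that matches as little as possible) is \*\*(.+?)\*\*. That exact pattern is an assumption for illustration, and the sketch below uses Python's `re` module rather than preg. It compares it with an anchored and a greedy variant on a made-up input.

```python
import re

text = "some **bold** and **more bold** text"  # hypothetical input

anchored = re.compile(r"^\*\*(.+?)\*\*$")  # whole string must be one bold span
greedy = re.compile(r"\*\*(.+)\*\*")       # grabs as much as possible
lazy = re.compile(r"\*\*(.+?)\*\*")        # grabs as little as possible

anchored_out = anchored.sub(r"<b>\1</b>", text)
greedy_out = greedy.sub(r"<b>\1</b>", text)
lazy_out = lazy.sub(r"<b>\1</b>", text)

print(anchored_out)  # some **bold** and **more bold** text (no match)
print(greedy_out)    # some <b>bold** and **more bold</b> text
print(lazy_out)      # some <b>bold</b> and <b>more bold</b> text
```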