Can you ignore HTML in a string while doing a Replace?

Possible Duplicate:
Replace words in a string, but ignore HTML

Is it possible to ignore the HTML elements when calling Replace?

Sample code:

$myText.replace(new RegExp( $searchString, 'gi' ), 
    '<span class="highlight">'+ $searchString + '</span>');

$myText is a large string of HTML e.g.:

var $myText = 
    "<p>Lorem Ipsum is simply dummy text of the printing and typesetting " +
    "industry. Lorem Ipsum has been the industry's standard dummy text " +
    "ever since the 1500s, <img src=\"something\">when an unknown printer " +
    "took a galley of type and scrambled it to make a type specimen book. " +
    "It has survived not only five centuries, " +
    "<a href=\"#\" title=\"Lorem\">but</a> also the leap into electronic " +
    "typesetting, remaining essentially unchanged. It was popularised in " +
    "the 1960s with the release of Letraset sheets containing Lorem Ipsum " +
    "passages, and more recently with desktop publishing software like " +
    "Aldus PageMaker including versions of Lorem Ipsum.</p>";

$searchString is equal to whatever a user types into a text input box.

If yes, how would I do it given the above sample code?

Answer 1:

Yes, take a look at the following forum post:

http://forums.asp.net/t/1443955.aspx

The RegEx pattern you are looking for would be something similar to the following:

"(?<!<[^>]*)Jon Doe(?<![^>]*<)"

Basically, you're doing a search and replace on any text that lives outside the angle brackets <>, i.e. outside the tags themselves.

JavaScript:

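// The surrounding quotes are string delimiters, not part of the pattern,
// so they must not appear inside the regex literal.
// Lookbehind (?<!...) needs an engine with ES2018 support.
phrase = phrase.replace(/(?<!<[^>]*)Jon Doe(?<![^>]*<)/i, "is not");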
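
Applied to the code in the question, the same idea might look like the sketch below. It is not the pattern from the forum post: it swaps the lookbehind for a negative lookahead, which also works in engines without lookbehind support. The names escapeRegExp and highlightOutsideTags are made up for this example, and it assumes the text content contains no unescaped < or > characters.

```javascript
// Escape characters that are special in a regular expression, so that
// user input such as "a.b" or "(x" is matched literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: wrap every match of searchString that sits
// outside a tag in a highlight span.
function highlightOutsideTags(html, searchString) {
    if (!searchString) {
        return html; // nothing to highlight
    }
    // (?![^<>]*>) rejects a match when the next angle bracket after it
    // is ">", i.e. when the match is inside a tag like <a title="Lorem">.
    var pattern = new RegExp(escapeRegExp(searchString) + '(?![^<>]*>)', 'gi');
    // $& inserts the matched text, so the original casing is preserved.
    return html.replace(pattern, '<span class="highlight">$&</span>');
}
```

With the variables from the question this would be called as highlightOutsideTags($myText, $searchString). Because strings are immutable, the return value has to be assigned or used; the original string is not changed.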

Answer 2:

Parsing HTML with a regular expression? This isn't a good idea. I would suggest inserting the HTML into the DOM and then traversing the nodes.
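
A rough sketch of that DOM approach, assuming a browser environment where document is available. The function names splitOnMatches and highlightTextNodes are invented for this example; only text nodes are rewritten, so tag names and attribute values are never touched.

```javascript
// Pure helper: split text into pieces and flag the ones that match
// searchString case-insensitively. Assumes lowercasing does not change
// the string length (true for ASCII input).
function splitOnMatches(text, searchString) {
    var pieces = [];
    var lowerText = text.toLowerCase();
    var lowerSearch = searchString.toLowerCase();
    var start = 0;
    var index;
    if (!lowerSearch) {
        return [{ text: text, match: false }];
    }
    while ((index = lowerText.indexOf(lowerSearch, start)) !== -1) {
        if (index > start) {
            pieces.push({ text: text.slice(start, index), match: false });
        }
        pieces.push({ text: text.slice(index, index + searchString.length), match: true });
        start = index + searchString.length;
    }
    if (start < text.length) {
        pieces.push({ text: text.slice(start), match: false });
    }
    return pieces;
}

// Walk the tree below root and wrap matches found in text nodes
// in <span class="highlight">. Requires a DOM (browser) environment.
function highlightTextNodes(root, searchString) {
    // Snapshot the child list, because nodes are replaced during the loop.
    var children = Array.prototype.slice.call(root.childNodes);
    children.forEach(function (node) {
        if (node.nodeType === 3) { // 3 = text node
            var pieces = splitOnMatches(node.nodeValue, searchString);
            var hasMatch = pieces.some(function (piece) { return piece.match; });
            if (!hasMatch) {
                return;
            }
            var fragment = document.createDocumentFragment();
            pieces.forEach(function (piece) {
                var textNode = document.createTextNode(piece.text);
                if (piece.match) {
                    var span = document.createElement('span');
                    span.className = 'highlight';
                    span.appendChild(textNode);
                    fragment.appendChild(span);
                } else {
                    fragment.appendChild(textNode);
                }
            });
            root.replaceChild(fragment, node);
        } else if (node.nodeType === 1) { // 1 = element node
            highlightTextNodes(node, searchString);
        }
    });
}
```

To use it on an HTML string like the one in the question, one option is to create a detached div, assign the string to its innerHTML, call highlightTextNodes(div, $searchString), and read div.innerHTML back.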