Split string and keep the separator

Posted 2020-05-09 09:21

I'm writing a Chrome extension, and I need to split a string that contains only text and img tags, so that every element of the resulting array is either a single letter or an img tag. For example: "a", "b", "c", "<img.../>", "d". I've found a way to do this with str.split(/(<img.*?>|)/), but some elements of the resulting array are empty, and I don't know why. Is there a more suitable regex?

Thank you very much for your help.

2 answers
混吃等死
Reply #2 · 2020-05-09 09:52

You can use exec instead of split to obtain the separated elements:

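var str = 'abc<img src="jkhjhk" />d';
// Match either a whole <img ...> tag or a single letter; the g flag lets exec() advance.
var myRe = /<img[^>]*>|[a-z]/gi;
var match;
var res = [];

// exec() returns the next match on each call and null once there are no more.
while ((match = myRe.exec(str)) !== null) {
    res.push(match[0]);
}
console.log(res); // ['a', 'b', 'c', '<img src="jkhjhk" />', 'd']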
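
As a shorter variant of the loop above, String.prototype.match with a global regex should return all matches in one call. This is only a sketch using the same pattern; the names input and parts are illustrative and not from the original answer.

```javascript
var input = 'abc<img src="jkhjhk" />d';

// With the g flag, match() returns an array of all full matches,
// or null when nothing matches, hence the "|| []" fallback.
var parts = input.match(/<img[^>]*>|[a-z]/gi) || [];

console.log(parts); // ['a', 'b', 'c', '<img src="jkhjhk" />', 'd']
```

Note that, like the exec loop, this pattern only keeps letters and img tags; digits, spaces, and punctuation in the input are silently dropped.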
迷人小祖宗
Reply #3 · 2020-05-09 09:55

The reason you get empty elements is the same reason you get <img...> in your results. When you use capturing parentheses in a split pattern, the result will contain the captures in the places where the delimiters were found. Since you have (<img.*?>|), you match (and capture) an empty string whenever the second alternative is used. Unfortunately, moving the alternation outside the group, as in (<img.*?>)|, doesn't help on its own, because you'll then get undefined instead of empty strings. However, you can easily filter those out:

str.split(/(<img[^>]*>)|/).filter(function(el) { return el !== undefined; });

This will still get you empty elements at the beginning and the end of the string as well as between adjacent <img> tags, though. So splitting <img><img> would result in

["", "<img>", "", "<img>", ""]

If you don't want that, the filter function becomes even simpler, since both undefined and the empty string are falsy:

str.split(/(<img[^>]*>)|/).filter(function(el) { return el; });
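
To make the difference between the variants concrete, here is a small sketch that runs each of them on the same input. The sample string and the variable names are made up for illustration.

```javascript
var sample = 'ab<img src="x">c';

// Group around both alternatives: the empty match between letters is captured as "".
var withEmpty = sample.split(/(<img.*?>|)/);
console.log(withEmpty); // ['a', '', 'b', '<img src="x">', 'c']

// Group around the tag only: when the empty alternative matches,
// the group does not participate, so split() inserts undefined.
var withUndefined = sample.split(/(<img[^>]*>)|/);
console.log(withUndefined); // ['a', undefined, 'b', '<img src="x">', 'c']

// Keeping only truthy elements drops both undefined and "".
var cleaned = withUndefined.filter(function (el) { return el; });
console.log(cleaned); // ['a', 'b', '<img src="x">', 'c']
```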