How to find “<script>” tag from the string with J

Posted 2020-06-11 13:52

I need to check whether an incoming string contains the text <script.

Example:
string a = "This is a simple <script> string";

Now, I need to write a regular expression that will tell me whether this string contains a <script> tag or not.

I ended up writing something like: <* ?script.* ?>

The challenge is that the incoming string may contain script in any of the following forms:

string a = "This is a simple <script> string";
string a = "This is a simple < script> string";
string a = "This is a simple <javascript></javascript> string";
string a = "This is a simple <script type=text/javascript> string";

Hence the regular expression should look for an opening < and then check for script after it.

5 answers
Evening l夕情丶
#2 · 2020-06-11 14:00

The regex based solution I would recommend is the following:

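// Requires: using System.Text.RegularExpressions;
// Combine the flags with | (bitwise OR); using & here would yield RegexOptions.None.
Regex rMatch = new Regex(@"<script[^>]*>(.*?)</script[^>]*>", RegexOptions.IgnoreCase | RegexOptions.Singleline);
myString = rMatch.Replace(myString, "");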

This regex will correctly identify and remove script tags in the following strings:

<script></script>
<script>something...</script>
something...<ScRiPt>something...</scripT>something...
something...<ScRiPt something...="something...">something...</scripT something...>something...

As a bonus, it will not match either of the following invalid script strings:

< script></script>
<javascript>something...</javascript>
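For anyone doing this in JavaScript rather than C#, the same pattern can be tried roughly as below. This is only a sketch: stripScripts and the samples array are names made up for illustration, and the s (dotAll) flag is assumed to be an acceptable stand-in for RegexOptions.Singleline.

```javascript
// Hypothetical JavaScript port of the pattern above; stripScripts is an illustrative name.
// Flags: g = replace every match, i = ignore case, s = let . match newlines.
const scriptBlock = /<script[^>]*>(.*?)<\/script[^>]*>/gis;

function stripScripts(input) {
  return input.replace(scriptBlock, "");
}

// Sample inputs taken from the lists above.
const samples = [
  "<script></script>",
  "<script>something...</script>",
  "something...<ScRiPt>something...</scripT>something...",
  "< script></script>",
  "<javascript>something...</javascript>",
];

for (const sample of samples) {
  console.log(JSON.stringify(sample), "->", JSON.stringify(stripScripts(sample)));
}
```

The last two samples should come back unchanged, since neither contains a literal <script immediately after the angle bracket.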
Anthone
#3 · 2020-06-11 14:06
Use:
/<script[\s\S]*?>[\s\S]*?<\/script>/gi

@bodhizero’s accepted answer of <[^>]*script incorrectly returns true under the following conditions:

// Not a proper script tag.
const a = "This is a simple < script> string";

// Space added before "img", otherwise the entire tag fails to render here.
const b = "This is a simple < img src='//example.com/script.jpg'> string";

// Picks up "nonsense code" just because a '<' character happens to precede a 'script' string somewhere along the way.
const c = "This is a simple for(i=0;i<5;i++){alert('script')} string";

Here is an excellent resource for building and testing regular expressions.
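The difference between the two patterns can be checked with a short script. This is a rough sketch, not a full test suite: the variable names are illustrative, and the fourth input is a made-up positive case added for contrast.

```javascript
// Sketch comparing the two patterns discussed above; all names are illustrative.
const loose = /<[^>]*script/;
// Same pattern as this answer, minus the g flag: test() on a global regex
// keeps state in lastIndex, which gives confusing results on repeated calls.
const strict = /<script[\s\S]*?>[\s\S]*?<\/script>/i;

const inputs = [
  "This is a simple < script> string",
  "This is a simple < img src='//example.com/script.jpg'> string",
  "This is a simple for(i=0;i<5;i++){alert('script')} string",
  "This is a simple <script>alert(1)</script> string", // made-up positive case
];

const results = inputs.map((text) => ({
  text,
  loose: loose.test(text),
  strict: strict.test(text),
}));

console.table(results);
```

The loose pattern reports true for all four inputs, while the strict one reports true only for the last. Keep in mind that the strict pattern also requires a closing </script>, so an opening tag on its own, as in the question's first example, does not match.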

够拽才男人
#4 · 2020-06-11 14:22

Try this:

/(<|%3C)script[\s\S]*?(>|%3E)[\s\S]*?(<|%3C)(\/|%2F)script[\s\S]*?(>|%3E)/gi
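To illustrate what the extra alternatives buy you, here is a small sketch that runs the pattern against plain and percent-encoded input. The sample strings and variable names are made up, and the g flag is left off because test() on a global regex is stateful.

```javascript
// Sketch: the pattern above, accepting both literal and percent-encoded delimiters.
const encodedAware =
  /(<|%3C)script[\s\S]*?(>|%3E)[\s\S]*?(<|%3C)(\/|%2F)script[\s\S]*?(>|%3E)/i;

// Made-up sample inputs.
const plain = "q=<script>alert(1)</script>";
const encoded = "q=%3Cscript%3Ealert(1)%3C%2Fscript%3E";
const lowerHex = "q=%3cscript%3ealert(1)%3c%2fscript%3e";
const harmless = "q=description";

for (const input of [plain, encoded, lowerHex, harmless]) {
  console.log(input, "->", encodedAware.test(input));
}
```

Because of the i flag, lowercase hex escapes such as %3c are covered as well. Only single percent-encoding is handled; a doubly encoded value such as %253Cscript would not be caught by this pattern.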
狗以群分
#5 · 2020-06-11 14:25

A negated character class comes in handy here.

<[^>]*script
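A quick way to see this against the four strings from the question, as a sketch in JavaScript (variable names are illustrative):

```javascript
// Sketch: the negated-character-class pattern applied to the question's inputs.
const tagWithScript = /<[^>]*script/;

const questionInputs = [
  "This is a simple <script> string",
  "This is a simple < script> string",
  "This is a simple <javascript></javascript> string",
  "This is a simple <script type=text/javascript> string",
];

// One boolean per input, in the same order.
const detected = questionInputs.map((text) => tagWithScript.test(text));
console.log(detected);
```

All four inputs are reported as containing a match. As written the pattern is case-sensitive, so <SCRIPT> would slip through unless the i flag is added.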
Luminary・发光体
#6 · 2020-06-11 14:26

This one works for me.

var regexp = /<script+.*>+.*<\/script>/g;
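Before relying on it, it may be worth probing how the greedy .* behaves. The sketch below uses two made-up inputs; the variable names are illustrative.

```javascript
// Sketch: probing the pattern above with made-up inputs.
const regexp = /<script+.*>+.*<\/script>/g;

const twoBlocks = "a<script>x</script>b<script>y</script>c";
const multiLine = "<script>\nalert(1)\n</script>";

// Greedy .* runs to the last </script> on the line, so the "b" between
// the two blocks is removed along with them.
const strippedTwoBlocks = twoBlocks.replace(regexp, "");
console.log(JSON.stringify(strippedTwoBlocks));

// . does not match newlines without the s flag, so a script block that
// spans several lines is left unchanged.
const strippedMultiLine = multiLine.replace(regexp, "");
console.log(JSON.stringify(strippedMultiLine));
```

If either behaviour matters for your input, a lazy quantifier ([\s\S]*?) as used in the other answers avoids both.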