Remove HTML formatting in Razor MVC 3

I am using MVC 3 and Razor View engine.

What I am trying to do

I am making a blog using MVC 3, and I want to remove all HTML formatting tags such as <p>, <b>, and <i> from the post content.

For this I am using the following code (it does work):

 @{
 post.PostContent = post.PostContent.Replace("<p>", " ");   
 post.PostContent = post.PostContent.Replace("</p>", " ");
 post.PostContent = post.PostContent.Replace("<b>", " ");
 post.PostContent = post.PostContent.Replace("</b>", " ");
 post.PostContent = post.PostContent.Replace("<i>", " ");
 post.PostContent = post.PostContent.Replace("</i>", " ");
 }

I feel there has to be a better way to do this. Can anyone please guide me?

Tags: asp.net-mvc-3 razor

4 answers

爷、活的狠高调

#2 · 2019-04-29 02:08

Thanks Alex Yaroshevich,

Here is what I use now:

post.PostContent = Regex.Replace(post.PostContent, @"<[^>]*>", String.Empty);
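If the one-liner above is called for every post, it may be worth wrapping it in a small helper so the pattern is built once and null content is handled. The following is only a sketch under that assumption; the class name `HtmlText` and the method name `StripTags` are hypothetical and do not come from the thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlText
{
    // Same pattern as the one-liner above: "<", then anything that is not ">", then ">".
    // Built once and reused across calls (hypothetical helper, not part of MVC).
    private static readonly Regex AnyTag = new Regex(@"<[^>]*>", RegexOptions.Compiled);

    // Returns the input with every <...> run removed; null or empty input yields "".
    public static string StripTags(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return string.Empty;
        }
        return AnyTag.Replace(html, string.Empty);
    }
}
```

In a view this could be called as `@HtmlText.StripTags(post.PostContent)`, which also avoids reassigning the model property inside the view. Note that this removes anything between `<` and `>`, whether or not it is a real HTML tag.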


smile是对你的礼貌

#3 · 2019-04-29 02:10

A regular expression can be slow for this. A plain character loop like the following is faster:

public static string StripHtmlTagByCharArray(string htmlString)
{
    // The output can never be longer than the input, so one buffer is enough.
    char[] array = new char[htmlString.Length];
    int arrayIndex = 0;
    bool inside = false; // true while the loop is between '<' and '>'

    for (int i = 0; i < htmlString.Length; i++)
    {
        char let = htmlString[i];
        if (let == '<')
        {
            inside = true;
            continue;
        }
        if (let == '>')
        {
            inside = false;
            continue;
        }
        // Copy only the characters that are outside a tag.
        if (!inside)
        {
            array[arrayIndex] = let;
            arrayIndex++;
        }
    }
    return new string(array, 0, arrayIndex);
}

You can take a look at http://www.dotnetperls.com/remove-html-tags
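Whether the loop is meaningfully faster depends on the input and the runtime, so it is probably best measured rather than assumed. The sketch below is one rough way to compare the two approaches with `Stopwatch`; the class `StripBenchmark`, its method names, and the sample text are all made up for illustration, and the timings it reports are not a rigorous benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class StripBenchmark
{
    private static readonly Regex AnyTag = new Regex(@"<[^>]*>", RegexOptions.Compiled);

    // Regex approach from the earlier answer.
    public static string StripWithRegex(string html)
    {
        return AnyTag.Replace(html, string.Empty);
    }

    // Character-loop approach, same logic as StripHtmlTagByCharArray above.
    public static string StripWithCharArray(string html)
    {
        char[] buffer = new char[html.Length];
        int count = 0;
        bool inside = false;
        foreach (char c in html)
        {
            if (c == '<') { inside = true; continue; }
            if (c == '>') { inside = false; continue; }
            if (!inside) { buffer[count++] = c; }
        }
        return new string(buffer, 0, count);
    }

    // Returns elapsed milliseconds for `iterations` calls of `strip` on `input`.
    public static long Time(Func<string, string> strip, string input, int iterations)
    {
        strip(input); // warm-up call so one-time setup cost is not measured
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            strip(input);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    // Builds a made-up HTML sample with the given number of paragraphs.
    public static string BuildSample(int paragraphs)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < paragraphs; i++)
        {
            sb.Append("<p>Hello, <b>World</b> number ").Append(i).Append("</p>");
        }
        return sb.ToString();
    }
}
```

A comparison could then look like `StripBenchmark.Time(StripBenchmark.StripWithRegex, sample, 10000)` against the same call with `StripBenchmark.StripWithCharArray`. The two are not equivalent on malformed input: the loop drops every stray `>` and discards everything after an unclosed `<`, while the regex leaves both in place.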


一纸荒年 Trace。

#4 · 2019-04-29 02:12

You can use regular expression.

This article might help you.


别忘想泡老子

#5 · 2019-04-29 02:22

Just in case you want to use regex in .NET to strip the HTML tags, the following seems to work pretty well on the source code for this very page. It's better than some of the other answers on this page because it looks for actual HTML tags instead of blindly removing everything between < and >. Back in the BBS days, we typed <grin> a lot instead of :), so removing <grin> is not an option. :)

This solution only removes the tags. It does not remove the contents of those tags in situations where that might be important -- a script tag, for example. You'd see the script, but the script wouldn't execute because the script tag itself gets removed. Removing the contents of an HTML tag is VERY tricky, and practically requires that the HTML fragment be well formed...

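Also note the RegexOptions.Singleline option. It is very important for any block of HTML, because there is nothing wrong with opening an HTML tag on one line and closing it on another. Without it, `.` does not match a newline, so a tag whose attributes span several lines would not be removed.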

string strRegex = @"</{0,1}(!DOCTYPE|a|abbr|acronym|address|applet|area|article|aside|audio|b|base|basefont|bdi|bdo|big|blockquote|body|br|button|canvas|caption|center|cite|code|col|colgroup|datalist|dd|del|details|dfn|dialog|dir|div|dl|dt|em|embed|fieldset|figcaption|figure|font|footer|form|frame|frameset|h1|h2|h3|h4|h5|h6|head|header|hr|html|i|iframe|img|input|ins|kbd|keygen|label|legend|li|link|main|map|mark|menu|menuitem|meta|meter|nav|noframes|noscript|object|ol|optgroup|option|output|p|param|pre|progress|q|rp|rt|ruby|s|samp|script|section|select|small|source|span|strike|strong|style|sub|summary|sup|table|tbody|td|textarea|tfoot|th|thead|time|title|tr|track|tt|u|ul|var|video|wbr){1}(\s*/{0,1}>|\s+.*?/{0,1}>)";
Regex myRegex = new Regex(strRegex, RegexOptions.Singleline);
string strTargetString = @"<p>Hello, World</p>";
string strReplace = @"";

return myRegex.Replace(strTargetString, strReplace);

I'm not saying this is the best answer. It's just an option and it worked great for me.
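The effect of RegexOptions.Singleline mentioned above can be checked with a much shorter pattern. This is a sketch only: the tag list is cut down to four names for readability, and `SinglelineDemo`, `Strip`, and `Sample` are hypothetical names that do not appear in the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineDemo
{
    // Shortened tag list for illustration only; the answer's pattern lists many more names.
    private const string Pattern = @"</?(a|b|i|p)(\s*/?>|\s+.*?/?>)";

    // An opening tag whose attributes continue on a second line.
    public const string Sample = "<a href=\"x\"\n   title=\"y\">link</a>";

    // Removes the listed tags from html using the given regex options.
    public static string Strip(string html, RegexOptions options)
    {
        return Regex.Replace(html, Pattern, string.Empty, options);
    }
}
```

With `RegexOptions.Singleline`, `SinglelineDemo.Strip(SinglelineDemo.Sample, RegexOptions.Singleline)` removes both tags and leaves `link`. With `RegexOptions.None`, the `.*?` part cannot cross the line break, so the opening `<a ...>` tag stays in the output and only `</a>` is removed.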

