What's the best way to remove <br> tags from t

Posted 2019-02-22 02:40

The .NET web system I'm working on allows the end user to input HTML-formatted text in some situations. In some of those places, we want to keep all the tags but strip any trailing break tags, while leaving any breaks inside the body of the text.

What's the best way to do this? (I can think of ways to do this, but I'm sure they're not the best.)

7 answers
Bombasti
#2 · 2019-02-22 03:20

You could also try something similar to the following, provided the markup is likely to be a valid (well-formed) tree:

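// Requires: using System; using System.Xml;
// The user's markup is wrapped in a single root element so it parses as one tree.
string s = "<markup><div>Text</div><br /><br /></markup>";

XmlDocument doc = new XmlDocument();
doc.LoadXml(s);

Console.WriteLine(doc.InnerXml);

XmlElement markup = doc["markup"];

// Walk backwards from the last child, removing <br> elements
// until a node that is not a <br> is reached.
for (int i = markup.ChildNodes.Count - 1; i >= 0; i--)
{
    XmlNode node = markup.ChildNodes[i];
    if (!string.Equals(node.Name, "br", StringComparison.OrdinalIgnoreCase))
    {
        break;
    }
    markup.RemoveChild(node);
}
Console.WriteLine("---");
Console.WriteLine(markup.InnerXml);
Console.ReadKey();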

The code above is a bit "scratch-pad" but if you cut and paste it into a Console application and run it, it does work :=)
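
For reuse outside a console scratch pad, the same idea could be wrapped in a small helper. This is only a sketch under the same assumption as above (the input parses as well-formed XML); the names `BreakTrimmer`, `TrimTrailingBreaks` and the temporary `root` wrapper element are made up for illustration and are not part of any library.

```csharp
using System;
using System.Xml;

public static class BreakTrimmer
{
    // Hypothetical helper: strips trailing <br> elements from a fragment
    // that is assumed to be well-formed XML once wrapped in a root element.
    // XmlDocument.LoadXml throws XmlException if that assumption does not hold.
    public static string TrimTrailingBreaks(string fragment)
    {
        var doc = new XmlDocument();

        // Wrap the fragment so that several top-level nodes still form one tree.
        doc.LoadXml("<root>" + fragment + "</root>");
        XmlElement root = doc.DocumentElement;

        // Remove the last child for as long as it is a <br> element;
        // breaks inside the body of the text are never reached.
        while (root.LastChild != null &&
               string.Equals(root.LastChild.Name, "br", StringComparison.OrdinalIgnoreCase))
        {
            root.RemoveChild(root.LastChild);
        }

        return root.InnerXml;
    }
}
```

Calling `BreakTrimmer.TrimTrailingBreaks("<div>Text</div><br /><br />")` returns `<div>Text</div>`. Real user input is often not well-formed XML (an unclosed `<br>`, or an entity such as `&nbsp;` that XML does not declare), and `LoadXml` will throw an `XmlException` in those cases, so a caller would probably want to catch that exception and decide how to handle such input.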
