I have this code:
```
string tag = "div";
string pattern = string.Format(@"\<{0}.*?\>(?<tegData>.+?)\<\/{0}\>", tag.Trim());
Regex regex = new Regex(pattern, RegexOptions.ExplicitCapture);
MatchCollection matches = regex.Matches(data);
```
I need to get the content between the <div class="in"> ... </div> tags:
<div class="in">
<a href="/a/show/7184569" class="mm">ВАЗ 2121</a> <span class="for">за</span> <span class="price">2 700 $</span></span><br/><span class="year">1990 г.</span><br/><div style="margin: 3px 0 3px 0">1.6 л, бензин, КПП механика, с пробегом, белый, литые диски, тонировка, спойлер, ветровики, противотуманки, Движок после капитального ремонта!</div><div>
<span style="display:block; padding: 4px 0 0 0;"><span class="region">Костанай</span><span class="adv-phones">, +7 (777) 4464451</span></span>
<small class="gray air">24 просмотра</small>
<small class="gray air">13 июня</small>
</div>
<div class="selectItem" title="Выбрать" id="fv_sic_7184569">
<a href="#" class="fav-button" id="fav_7184569"> </a> </div>
</div>
How can I do it? My code doesn't work.
I find it much easier to use XPath for this. Maybe you will find it useful too.
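As a rough illustration of the XPath approach, here is a minimal sketch using `XDocument` and `XPathSelectElement` from the .NET base class library. The class name `XPathDivExample` and the method `GetDivInnerXml` are made up for this example. It assumes the input is well-formed XML; real-world HTML (including the sample in the question, which has a stray `</span>`) usually is not, so an HTML-tolerant parser would be needed there.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathDivExample
{
    // Hypothetical helper: returns the inner markup of the first <div> whose
    // class attribute equals cssClass, or null if there is no such element.
    // Assumes xhtml is well-formed XML and cssClass contains no quote characters.
    public static string GetDivInnerXml(string xhtml, string cssClass)
    {
        XDocument doc = XDocument.Parse(xhtml);

        // "//div[@class='in']" selects a div at any depth with class="in".
        XElement div = doc.XPathSelectElement("//div[@class='" + cssClass + "']");
        if (div == null)
        {
            return null;
        }

        // Serialize the child nodes (including nested divs) without re-indenting them.
        return string.Concat(div.Nodes().Select(n => n.ToString(SaveOptions.DisableFormatting)));
    }
}
```

Because the parser builds a tree, nested `div` elements inside the target are returned intact; no tag counting is needed.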
The regex-based code that worked for me would find the first inner div.
Here's a regex that might extract simple div tags:
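One possible pattern of that kind is sketched below. The class and method names are invented for the example, and it assumes the `div` elements are not nested: the lazy `.*?` stops at the first closing `</div>` it finds.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SimpleDivRegex
{
    // Hypothetical helper: returns the content of each non-nested <div>...</div>.
    public static List<string> ExtractSimpleDivs(string html)
    {
        // Singleline lets '.' match newlines, so content may span several lines.
        var regex = new Regex(
            @"<div\b[^>]*>(?<content>.*?)</div\s*>",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);

        var results = new List<string>();
        foreach (Match match in regex.Matches(html))
        {
            results.Add(match.Groups["content"].Value);
        }
        return results;
    }
}
```

On markup like the sample in the question, where the target `div` contains other `div` elements, each match ends at the first inner `</div>`, so the captured content is cut short.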
However, using regular expressions to parse HTML is almost always inappropriate and will break on anything but trivial input. That is because markup languages such as HTML are not regular languages: tags can nest to arbitrary depth, and a true regular expression cannot match arbitrarily deep open/close pairs.
That being said, you would be much better off using an XML parser to parse the document or fragment and then extract what you need. In fact, a forward-only parser would probably even be faster than a regex.
You should look at the XmlReader class in .NET.
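A minimal sketch of how `XmlReader` could be used for this follows. The class and method names are hypothetical, and the input is assumed to be well-formed XML; `XmlReader` throws an `XmlException` on malformed markup or on HTML-only entities such as `&nbsp;`.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XmlReaderDivExample
{
    // Hypothetical helper: scans forward through the document and returns the
    // inner markup of the first <div> whose class attribute equals cssClass.
    public static string ReadDivInnerXml(string xml, string cssClass)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                bool isTargetDiv =
                    reader.NodeType == XmlNodeType.Element
                    && reader.Name == "div"
                    && reader.GetAttribute("class") == cssClass;

                if (isTargetDiv)
                {
                    // ReadInnerXml returns all child markup, nested divs included.
                    return reader.ReadInnerXml();
                }
            }
        }
        return null; // no matching element found
    }
}
```

The reader never builds a full tree, which is why a forward-only pass like this tends to be cheap even on large documents.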
To get nested tags, try using this function:
The parameters are simple regexes that filter the target tag; here are some examples:
This variant handles opening and closing tags and nested tags of the same type (other nested tags may be malformed; they are ignored).
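The idea of counting nested tags of the same type can be sketched with .NET balancing groups. This is not the function from the answer, only an illustration under assumptions: `NestedTagMatcher`, `GetTagContents` and the `depth` group name are invented here, and self-closing tags such as `<div/>` are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NestedTagMatcher
{
    // Hypothetical helper: returns the inner markup of every <tag> element whose
    // opening tag matches openTagPattern, keeping nested tags of the same name.
    public static List<string> GetTagContents(string html, string tag, string openTagPattern)
    {
        string name = Regex.Escape(tag);
        string pattern =
            openTagPattern +                           // opening tag we are looking for
            @"(?<content>(?>" +
                @"<" + name + @"\b[^>]*>(?<depth>)" +  // nested opening tag: push
                @"|</" + name + @"\s*>(?<-depth>)" +   // nested closing tag: pop
                @"|(?!</?" + name + @"\b)." +          // any other character
            @")*" +
            @"(?(depth)(?!)))" +                       // fail if an opening tag is left unclosed
            @"</" + name + @"\s*>";                    // closing tag of the target element

        var regex = new Regex(pattern, RegexOptions.Singleline | RegexOptions.IgnoreCase);
        var results = new List<string>();
        foreach (Match match in regex.Matches(html))
        {
            results.Add(match.Groups["content"].Value);
        }
        return results;
    }
}
```

For the question's markup, a call such as `NestedTagMatcher.GetTagContents(data, "div", @"<div\b[^>]*\bclass=""in""[^>]*>")` would be the way to ask for the `div` with `class="in"`. Tags other than `div` are treated as plain characters, so they do not need to be balanced.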
The other variant checks nested tags more strictly and does not match if any of them are incorrectly opened or closed:
If it doesn't have to be server-side, you could use some JavaScript to do this, for example: