Traverse the DOM with CsQuery

Posted 2019-05-18 13:21

Question:

I'm trying to learn how to use CsQuery to traverse a DOM and extract specific text.

The HTML looks like this:

<div class="featured-rows">
  <div class="row">
    <div class="featured odd" data-genres-filter="MA0000002613">
      <div class="album-cover">
      <div class="artist">
        <a href="http://www.allmusic.com/artist/half-japanese-0000555654">Half Japanese</a>
      </div>
      <div class="title">
      <div class="label"> Joyful Noise </div>
      <div class="styles">
      <div class="rating allmusic">
      <div class="rating average">
      <div class="headline-review">
    </div>
    <div class="featured even" data-genres-filter="MA0000002572, MA0000002613">
    </div>
  <div class="row">
  <div class="row">
  <div class="row">

My code attempt looks like this:

public void GetRows()
{
    var artistName = string.Empty;
    var html = GetHtml("http://www.allmusic.com/newreleases");
    var rows = html.Select(".featured-rows");
    foreach(var row in rows)
    {
        var odd = row.Cq().Find(".featured odd");
        foreach(var artist in odd)
        {
            artistName = artist.Cq().Text();
        }
    }
}

The first select for .featured-rows works, but then I don't know how to get down to .artist to read the text.

Answer 1:

List<string> artists = html[".featured .artist a"].Select(dom=>dom.TextContent).ToList();

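Here html is your CQ object, and the lambda form of Select comes from System.Linq.

Separately, the selector in your own code has a space where it needs a dot: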

var odd = row.Cq().Find(".featured odd");

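should be the following. A space is the descendant combinator, so ".featured odd" looks for an <odd> element inside .featured, while ".featured.odd" matches a single element that carries both classes: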

var odd = row.Cq().Find(".featured.odd");
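The difference between the two selectors can be checked on a small inline fragment. This is only a minimal sketch, assuming the CsQuery NuGet package is referenced; the class name SelectorDemo, the Count helper, and the sample markup (including the second artist name) are made up for illustration and are not part of the original code.

```csharp
using System;
using CsQuery;

public static class SelectorDemo
{
    // Hypothetical sample markup, trimmed from the structure shown in the question.
    private const string Html =
        "<div class='featured-rows'><div class='row'>" +
        "<div class='featured odd'><div class='artist'><a href='#'>Half Japanese</a></div></div>" +
        "<div class='featured even'><div class='artist'><a href='#'>Another Artist</a></div></div>" +
        "</div></div>";

    // Returns how many elements a selector matches in the sample markup.
    public static int Count(string selector)
    {
        CQ dom = CQ.Create(Html);
        return dom[selector].Length;
    }

    public static void Main()
    {
        // Descendant combinator: an <odd> element somewhere inside .featured.
        Console.WriteLine(Count(".featured odd"));
        // Compound selector: one element that has both classes.
        Console.WriteLine(Count(".featured.odd"));
    }
}
```

Comparing the two counts shows why the original Find call came back empty even though the outer select worked.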


Answer 2:

You should try something similar to this:

var html = GetHtml("http://www.allmusic.com/newreleases");
var query = CQ.Create(html);
var row = query[".artist > a"];
string link = row.Attr("href");  // href of the first matched link
string text = row.Text();        // combined text of all matched links

CsQuery is a port of jQuery, so you can search for jQuery examples and adapt them.

UPDATE: To loop over the blocks and get each artist and title:

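// ".featured.odd" (no space) matches elements that have both classes
var rows = query[".featured.odd"];
foreach(var row in rows)
{
  // row is an IDomObject; wrap it with Cq() to run selectors inside it
  var artistLink = row.Cq().Find(".artist > a");
  var title = row.Cq().Find(".title");
  // here do whatever you need with this
}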
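For a version that can be run without hitting the site, the same loop can be tried against an HTML string. The following is a rough sketch under stated assumptions: the CsQuery NuGet package is referenced, and the ArtistDemo class and ArtistsAndLabels method are hypothetical names introduced here. It pairs each artist with the text of the .label element from the markup in the question.

```csharp
using System;
using System.Collections.Generic;
using CsQuery;

public static class ArtistDemo
{
    // Hypothetical helper: returns (artist, label) pairs, one per .featured block.
    public static List<KeyValuePair<string, string>> ArtistsAndLabels(string html)
    {
        var result = new List<KeyValuePair<string, string>>();
        CQ dom = CQ.Create(html);

        foreach (IDomObject block in dom[".featured"])
        {
            // Wrap the element so selectors are scoped to this block only.
            CQ scope = block.Cq();
            string artist = scope.Find(".artist > a").Text().Trim();
            string label = scope.Find(".label").Text().Trim();
            result.Add(new KeyValuePair<string, string>(artist, label));
        }
        return result;
    }
}
```

Selecting ".featured" rather than ".featured.odd" covers both the odd and even blocks; Trim() is there because the label text in the page is padded with spaces.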


Tags: c# csquery