I have to parse an html page organized this way:
<li id="list">
<a id="cities">Cities</a>
<ul>
<li>
<a class="region" title="liguria">Liguria</a>
<ul>
<li>
<a class="genova">Genova</a>
</li>
<li>
<a class="savona">Savona</a>
</li>
</ul>
</li>
<li>
<a class="region" title="lazio">Lazio</a>
<ul>
<li>
<a class="roma">Roma</a>
</li>
</ul>
</li>
</ul>
</li>
I need to extract a list of all the cities. I don't care about regions... I am using cheerio from node.js, but I added jquery to the tags since cheerio uses jQuery-style selectors (AFAIK...).
I have come up with this partial solution, which only partially works (it lists only the cities of the first region group...):
$('li[id="list"] li li').each(function(i, elem) {
console.log('city:', elem.children[0].next.children[0].data);
});
As you can see, I'm quite confused... :-(
Any clue?
Try:
$('li#list ul li ul li a').each(function() { console.error("City: "+$(this).html()); });
As noted below, the selector could be simply
$('li#list li li a')
Note that when using an id selector, since by definition an id is unique, the selector does not need any other qualifiers such as the tag name. Usually you would only use
tagname[id...]
if you're matching a substring in the ID: ^= (at the beginning), $= (at the end), or *= (anywhere).
Another approach would be to use
:not()
to exclude the a elements that should be left out:
You have several options for valid selectors you can use:
#list ul ul a
#list li li a
#list ul ul li a
And to get all the cities in an array called cities, you can use the jQuery.map() method:
Try this:
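A minimal sketch of the .map() approach, not taken from any of the answers above. It assumes `$` is a jQuery-style function such as the one returned by `cheerio.load(html)`, and it uses the selection's `.map()` followed by `.get()` to end up with a plain array. The helper name `extractCities` is hypothetical and only there for illustration.

```javascript
// Sketch only: `extractCities` is a hypothetical helper, not part of cheerio or jQuery.
// `$` is assumed to be a jQuery-style function, e.g. the result of cheerio.load(html).
function extractCities($) {
  // City links sit two <li> levels below the element with id="list";
  // region links sit only one level below it, so they are not matched.
  return $('#list li li a')
    .map(function (i, el) {
      // The callback receives (index, element), as in jQuery.
      return $(el).text();
    })
    .get(); // .get() unwraps the mapped result into a plain array
}

// Usage with cheerio (assumes `npm install cheerio` and an `html` string
// holding the page from the question):
//   const cheerio = require('cheerio');
//   const $ = cheerio.load(html);
//   const cities = extractCities($);
//   console.log(cities);
```

With the markup from the question, `cities` should contain just the city names, because the region links do not match `#list li li a`. Passing `$` in as a parameter keeps the helper independent of how the document was loaded.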