I am trying to exclude certain elements from a list.
On the page http://www.persimmonhomes.com/rooley-park-10126 there are elements I want to scrape (div class="housetype js-filter-housetype") and elements I don't want to scrape (div class="housetype js-filter-housetype" style="display: none;").
The HTML looks something like:
<div class="housetype js-filter-housetype">
<div class="housetype js-filter-housetype">
<div class="housetype js-filter-housetype">
<div class="housetype js-filter-housetype">
<div class="housetype js-filter-housetype">
<div class="housetype js-filter-housetype" style="display: none;">
<div class="housetype js-filter-housetype" style="display: none;">
I am trying to write code to exclude the div class="housetype js-filter-housetype" style="display: none;".
My current code to do this is:
start_urls = [
    "http://www.persimmonhomes.com/rooley-park-10126",
]

def parse(self, response):
    for sel in response.xpath('//*[@id="aspnetForm"]/div[4]'):
        item = PersimmonItem()
        item['housetypeheading'] = sel.xpath('//*[@class="houses-list js-scrollable js-filterable js-houselist"]//*[not(@style="display: none;")]/h2[@class="housetype__heading"]').extract()
        yield item
So far, this does not work: it scrapes all the elements whether or not they have style="display: none;". I have also tried [not(contains(@style, "display: none;"))], but no luck so far.
Any ideas?
If you want to ignore all with a style attribute:
Or, to exclude only that particular style, add a condition on the style value: