I want to find an element by XPath in a page (this is my first Scrapy project), for example the page https://github.com/rg3/youtube-dl/pull/11272.
In both the Opera inspector and the Firefox TryXpath add-on, this XPath expression gives the same result:
//div[@class='file js-comment-container js-resolvable-timeline-thread-container has-inline-notes']
But in Scrapy 1.6, when I run the same XPath, it does not find anything and just returns an empty list:
def parse(self, response):
    print(response.xpath('''//div[@class='file js-comment-container js-resolvable-timeline-thread-container has-inline-notes']'''))
and the result is just [].
What do you think the problem is, and how can I fix it? Thanks in advance.
NOTE: Yes, I know about robots.txt, and I have even set ROBOTSTXT_OBEY = False.
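It would seem that some of those classes are added by JavaScript. Scrapy does not execute JavaScript, and @class='...' only matches when the whole attribute string is identical, so the expression that works in the browser finds nothing in the downloaded HTML.
However, if you can find a suitable selector, you can still select the divs you're targeting, even though the JavaScript is not executed: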
>>> fetch('https://github.com/rg3/youtube-dl/pull/11272')
2019-02-09 14:50:19 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://github.com/rg3/youtube-dl/pull/11272> (referer: None)
>>> response.css('div.file')
[<Selector xpath="descendant-or-self::div[@class and contains(concat(' ', normalize-space(@class), ' '), ' file ')]" data='<div class="file js-comment-container js'>,
 <Selector xpath="descendant-or-self::div[@class and contains(concat(' ', normalize-space(@class), ' '), ' file ')]" data='<div class="file js-comment-container js'>,
 <Selector xpath="descendant-or-self::div[@class and contains(concat(' ', normalize-space(@class), ' '), ' file ')]" data='<div class="file js-comment-container js'>,
 <Selector xpath="descendant-or-self::div[@class and contains(concat(' ', normalize-space(@class), ' '), ' file ')]" data='<div class="file js-comment-container js'>,
 <Selector xpath="descendant-or-self::div[@class and contains(concat(' ', normalize-space(@class), ' '), ' file ')]" data='<div class="file js-comment-container js'>,
 <Selector xpath="descendant-or-self::div[@class and contains(concat(' ', normalize-space(@class), ' '), ' file ')]" data='<div class="file js-comment-container js'>,
 <Selector xpath="descendant-or-self::div[@class and contains(concat(' ', normalize-space(@class), ' '), ' file ')]" data='<div class="file js-comment-container js'>,
 <Selector xpath="descendant-or-self::div[@class and contains(concat(' ', normalize-space(@class), ' '), ' file ')]" data='<div class="file js-comment-container js'>,
 <Selector xpath="descendant-or-self::div[@class and contains(concat(' ', normalize-space(@class), ' '), ' file ')]" data='<div class="file js-comment-container js'>]
>>> len(_)
9
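
If you would rather stay with XPath, the expression that response.css('div.file') generates in the output above can be written by hand. Here is a minimal sketch of a helper that builds such an expression for one or more class tokens. The name class_xpath is hypothetical and not part of Scrapy; it only produces a string, and it assumes the tokens you pass are present in the HTML that Scrapy actually downloads.

```python
def class_xpath(tag, *class_names):
    """Build an XPath matching `tag` elements that carry every given class token.

    Unlike @class='a b c', this does not depend on the order of the classes
    or on extra classes added later (for example by JavaScript).
    """
    predicates = "".join(
        "[contains(concat(' ', normalize-space(@class), ' '), ' {} ')]".format(name)
        for name in class_names
    )
    return "//{}{}".format(tag, predicates)


# Hypothetical usage: match divs that have both of these class tokens.
expr = class_xpath("div", "file", "js-comment-container")
print(expr)
```

Inside the spider's parse method, the resulting string could then be passed to response.xpath(expr) in place of the exact-match expression.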