I get the following error when running the code below against this HTML page, but it only happens at /html/body/div[3]/div[1]/div[1]/div[1]/div/div[10]/a/div[1]/div[2]:
WebDriverException: Message: {"errorMessage":"null is not an object (near '...ull).singleNodeValue.click();...')","request":{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"223","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:34955","User-Agent":"Python http auth"},"httpVersion":"1.1","method":"POST","post":"{\"sessionId\": \"ddd5e2d0-10e4-11e8-8645-3d3d785f60f2\", \"args\": [], \"script\": \"window.document.evaluate('/html/body/div[3]/div[1]/div[1]/div[1]/div/div[10]/a/div[1]/div[2]', document, null, 9, null).singleNodeValue.click();\"}","url":"/execute","urlParsed":{"anchor":"","query":"","file":"execute","directory":"/","path":"/execute","relative":"/execute","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/execute","queryKey":{},"chunks":["execute"]},"urlOriginal":"/session/ddd5e2d0-10e4-11e8-8645-3d3d785f60f2/execute"}}
Screenshot: available via screen
This is the code:
import time

from selenium import webdriver


def execute_script(driver, xpath):
    # Evaluate the XPath in the browser (result type 9 = first ordered node) and click the match.
    execute_string = "window.document.evaluate('{}', document, null, 9, null).singleNodeValue.click();".format(xpath)
    return driver.execute_script(execute_string)


driver = webdriver.PhantomJS()
driver.implicitly_wait(20)
driver.set_window_size(1120, 550)
driver.get("https://topicolist.com/ongoing-ico")
num_options = len(driver.find_elements_by_class_name("w-dyn-item"))
for i in range(num_options):
    xpath = "/html/body/div[3]/div[1]/div[1]/div[1]/div/div[" + str(i+1) + "]/a/div[1]/div[2]"
    print xpath
    execute_script(driver, xpath)
    project_title = driver.find_elements_by_class_name("heading-49")[0].text.strip()
    print project_title
    time.sleep(10)
    driver.back()
driver.quit()
A naive solution
You're using one query to count and another to iterate, producing different results. The naive way to solve this is to use the same query to count and iterate.
Which query? That depends on what you want, since your queries select different things (hence the error you're seeing):
- your CSS query fetches all .w-dyn-item elements, but those elements are spread across three .w-dyn-list containers (one for .gold, one for .silver, and one for .bronze), and, as it stands,
- your XPath query only fetches .w-dyn-item elements from the first container, .gold.
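The mismatch between the two queries can be reproduced without a browser. The following is a minimal sketch using only the standard library; the markup, the item counts, and the titles are made-up stand-ins for the real page, which has far more items.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for the page: the items are
# spread across three list containers, as described above.
PAGE = """
<body>
  <div class="w-dyn-list gold">
    <div class="w-dyn-item"><h4>Gold A</h4></div>
    <div class="w-dyn-item"><h4>Gold B</h4></div>
  </div>
  <div class="w-dyn-list silver">
    <div class="w-dyn-item"><h4>Silver A</h4></div>
  </div>
  <div class="w-dyn-list bronze">
    <div class="w-dyn-item"><h4>Bronze A</h4></div>
    <div class="w-dyn-item"><h4>Bronze B</h4></div>
  </div>
</body>
""".strip()

root = ET.fromstring(PAGE)

# "Count" query: every item on the page, regardless of its container.
all_items = root.findall(".//div[@class='w-dyn-item']")

# "Iterate" query: only the children of the first container.
first_list = root.find("./div[1]")
first_list_items = first_list.findall("./div")

count_all = len(all_items)
count_first = len(first_list_items)

# 1-based positions the loop would request that the first container lacks.
missing = [i + 1 for i in range(count_all) if i >= count_first]

print(count_all, count_first, missing)
```

Every position listed in `missing` corresponds to a lookup that finds no node, which is the situation that produces the `null is not an object` error in the browser.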
If you want only the .gold items, you'll have to adjust your count query:
num_options = len(driver.find_elements_by_css_selector(".w-dyn-list.gold .w-dyn-item"))
# ...
If you want all of the items, you'll have to adjust your iteration query:
for i in range(num_options):
    xpath = "/html/body/div[3]/div[1]/div[1]/div/div/div[" + str(i+1) + "]/a/div[1]/div[2]"
    # ...
A better solution
But you're doing a lot of work just to click an element. You don't need to use JavaScript; Selenium provides WebElement#click() for just this purpose:
items = driver.find_elements_by_class_name("w-dyn-item")
for item in items:
    item.find_element_by_xpath("./a/div[1]/div[2]").click()
This is better, but the XPath query is still very specific and inflexible; if anything about the arrangement of the DOM tree in the list item changes, your query will break. The XPath query also doesn't say what you're trying to click, which makes it impossible to tell why you're clicking it.
Instead, since you're no longer sending XPath to the browser, you can use another CSS query to better express yourself in a more resilient way:
items = driver.find_elements_by_class_name("w-dyn-item")
for item in items:
    item.find_element_by_class_name("description").click()
    # ...
Now it's clear that you're trying to click on the description for each item. And because you're not specifying where the description appears, the site can change (within reason) without breaking your script.
The best solution
Looking even closer at your script, one can see that the only reason you're clicking on the item description is to navigate to the detail page and extract the project title. But that information is already present in the initial list: in the <h4> element for each item.
Unless there's additional information you need that doesn't exist on the list page, you don't need to navigate to the detail page. Instead, just find the <h4> elements and extract their text:
item_headings = driver.find_elements_by_css_selector(".w-dyn-item h4")
project_titles = [item_heading.text for item_heading in item_headings]
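The two lines above can be wrapped in a small helper. This is only a sketch: the function name `collect_project_titles` is hypothetical, and stripping whitespace and skipping empty headings are assumptions about what is wanted (the original script called `.strip()` on the title). It relies on the same `find_elements_by_css_selector` call used above; newer Selenium releases replace that helper with `find_elements(By.CSS_SELECTOR, ...)`.

```python
def collect_project_titles(driver):
    """Return the non-empty <h4> texts found inside .w-dyn-item elements.

    `driver` is assumed to be a Selenium WebDriver that has already
    loaded the list page.
    """
    headings = driver.find_elements_by_css_selector(".w-dyn-item h4")
    titles = []
    for heading in headings:
        text = heading.text.strip()
        if text:  # skip headings that render as empty strings
            titles.append(text)
    return titles
```

Because no clicking or navigation is involved, there is no need for `time.sleep()` or `driver.back()` between items.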
You can see in the page's HTML that there are only 10 div elements matching the XPath /html/body/div[3]/div[1]/div[1]/div[1]/div/div, while there are 73 elements with the class w-dyn-item (no element has only this class).
As a result, your loop asks for 73 elements when only 10 exist at that path.
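If the JavaScript approach is kept, a guard for the unmatched case turns the crash into a return value the loop can check. The following is a minimal sketch, not the asker's code: the helper name `click_by_xpath` is hypothetical, and it assumes `driver` is a Selenium WebDriver, whose `execute_script(script, *args)` exposes extra arguments to the script as `arguments[0]`, `arguments[1]`, and so on.

```python
def click_by_xpath(driver, xpath):
    """Click the first node matching `xpath`.

    Returns True if a node was found and clicked, and False if the
    XPath matched nothing (instead of raising on a null node).
    """
    script = (
        "var node = document.evaluate(arguments[0], document, null, "
        "XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;"
        "if (node === null) { return false; }"
        "node.click();"
        "return true;"
    )
    # Passing the XPath as an argument avoids quoting problems that can
    # arise when it is pasted into the script text with str.format().
    return driver.execute_script(script, xpath)
```

A loop could then stop as soon as `click_by_xpath(driver, xpath)` returns `False`, rather than relying on a count taken from a different query.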