Question:

I get the following error when running the code below against this HTML page, but only at /html/body/div[3]/div[1]/div[1]/div[1]/div/div[10]/a/div[1]/div[2]:

WebDriverException: Message: {"errorMessage":"null is not an object (near '...ull).singleNodeValue.click();...')","request":{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"223","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:34955","User-Agent":"Python http auth"},"httpVersion":"1.1","method":"POST","post":"{\"sessionId\": \"ddd5e2d0-10e4-11e8-8645-3d3d785f60f2\", \"args\": [], \"script\": \"window.document.evaluate('/html/body/div[3]/div[1]/div[1]/div[1]/div/div[10]/a/div[1]/div[2]', document, null, 9, null).singleNodeValue.click();\"}","url":"/execute","urlParsed":{"anchor":"","query":"","file":"execute","directory":"/","path":"/execute","relative":"/execute","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/execute","queryKey":{},"chunks":["execute"]},"urlOriginal":"/session/ddd5e2d0-10e4-11e8-8645-3d3d785f60f2/execute"}} Screenshot: available via screen

This is the code:

import time

from selenium import webdriver


def execute_script(driver, xpath):
    # Click the first node matching the XPath (9 = FIRST_ORDERED_NODE_TYPE).
    execute_string = "window.document.evaluate('{}', document, null, 9, null).singleNodeValue.click();".format(xpath)

    return driver.execute_script(execute_string)


driver = webdriver.PhantomJS()
driver.implicitly_wait(20)
driver.set_window_size(1120, 550)
driver.get("https://topicolist.com/ongoing-ico")

# Count every list item on the page, then click each one by absolute XPath.
num_options = len(driver.find_elements_by_class_name("w-dyn-item"))
for i in range(num_options):
    xpath = "/html/body/div[3]/div[1]/div[1]/div[1]/div/div[" + str(i+1) + "]/a/div[1]/div[2]"
    print(xpath)
    execute_script(driver, xpath)

    project_title = driver.find_elements_by_class_name("heading-49")[0].text.strip()
    print(project_title)

    time.sleep(10)
    driver.back()

driver.quit()

Answer 1:

A naive solution

You're using one query to count and another to iterate, producing different results. The naive way to solve this is to use the same query to count and iterate.

Which query? That depends on what you want, since your queries select different things (hence the error you're seeing):

your CSS query fetches all .w-dyn-item elements, but those elements are spread across three .w-dyn-list containers (one for .gold, one for .silver, and one for .bronze);
your XPath query, as it stands, only fetches .w-dyn-item elements from the first container, .gold.
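The mismatch can be reproduced without a browser. The sketch below uses the standard library's ElementTree on a made-up fragment; the markup, the class names (gold, silver, bronze, item), and the item counts are simplified stand-ins, not the real page. It counts items across all containers, then looks them up by position inside the first container only, as the original loop does.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the page: three containers holding 2, 3 and 1 items.
MARKUP = """
<body>
  <div class="gold"><div class="item">g1</div><div class="item">g2</div></div>
  <div class="silver"><div class="item">s1</div><div class="item">s2</div><div class="item">s3</div></div>
  <div class="bronze"><div class="item">b1</div></div>
</body>
"""
root = ET.fromstring(MARKUP)

# Count query: every item, regardless of which container holds it.
total = len(root.findall(".//div[@class='item']"))

# Iteration query: positional lookup inside the first container only.
first_container = root.find("./div[1]")
found = []
for i in range(total):
    node = first_container.find("./div[{}]".format(i + 1))
    found.append(None if node is None else node.text)

print(total)
print(found)
```

Every lookup past the end of the first container comes back empty, which is the situation in which the browser-side singleNodeValue is null.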

If you want only the .gold items, you'll have to adjust your count query:

num_options = len(driver.find_elements_by_css_selector(".w-dyn-list.gold .w-dyn-item"))
# ...

If you want all of the items, you'll have to adjust your iteration query:

for i in range(num_options):
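    # The parentheses apply the index to the combined matches from all containers,
    # rather than to the children of each parent separately.
    xpath = "(/html/body/div[3]/div[1]/div[1]/div/div/div)[" + str(i+1) + "]/a/div[1]/div[2]"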
    # ...

A better solution

But you're doing a lot of work just to click an element. You don't need to use JavaScript; Selenium provides WebElement#click() for just this purpose:

items = driver.find_elements_by_class_name("w-dyn-item")
for item in items:
    item.find_element_by_xpath("./a/div[1]/div[2]").click()

This is better, but the XPath query is still very specific and inflexible; if anything about the arrangement of the DOM tree in the list item changes, your query will break. The XPath query also doesn't say what you're trying to click, so a reader can't tell why you're clicking it.

Instead, since you're no longer sending XPath to the browser, you can use another CSS query to better express yourself in a more resilient way:

items = driver.find_elements_by_class_name("w-dyn-item")
for item in items:
    item.find_element_by_class_name("description").click()
    # ...

Now it's clear that you're trying to click on the description for each item. And because you're not specifying where the description appears, the site can change (within reason) without breaking your script.
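If the click navigates to a detail page and the script then calls driver.back(), as the question's loop does, element references collected before the navigation usually go stale. One possible workaround, sketched below, is to re-locate the items on every pass and pick them by index. The function name click_each_description is made up for illustration; the class names come from the question and the answer above; and the find_element(s)_by_* calls follow the Selenium 3 style used in this thread (Selenium 4 replaced them with find_element(By..., ...)).

```python
def click_each_description(driver):
    """Visit each item's detail page in turn and collect the project titles.

    `driver` is assumed to be a Selenium 3 style WebDriver that has already
    loaded the list page.
    """
    count = len(driver.find_elements_by_class_name("w-dyn-item"))
    titles = []
    for index in range(count):
        # Re-locate the items each time: references from before the
        # navigation may no longer be valid after driver.back().
        items = driver.find_elements_by_class_name("w-dyn-item")
        items[index].find_element_by_class_name("description").click()

        heading = driver.find_elements_by_class_name("heading-49")[0]
        titles.append(heading.text.strip())

        driver.back()
    return titles
```

This assumes the list keeps the same length and order between visits; if the page reorders items, matching by a stable attribute would be safer than matching by index.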

The best solution

And by looking even closer at your script, one can see that the only reason you're clicking on the item description is so that you can navigate to the detail page to extract the project title. But that information is already present in the initial list: in the <h4> element for each item.

Unless there's additional information you need that doesn't exist on the list page, you don't need to navigate to the detail page. Instead, just find the <h4> elements and extract their text:

item_headings = driver.find_elements_by_css_selector(".w-dyn-item h4")
project_titles = [item_heading.text for item_heading in item_headings]

Answer 2:

You can see in the page's HTML that only 10 div elements match the XPath

/html/body/div[3]/div[1]/div[1]/div[1]/div/div

whereas 73 elements have the class w-dyn-item (none of them has it as its only class).

As a result, the loop runs 73 times over a set of only 10 matching nodes, and any lookup that finds no node returns null.
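If the JavaScript route is kept, the script can at least check for a missing node instead of calling click() on null. The helper below is only a sketch: build_safe_click_script is a made-up name, and it simply builds the script string, relying on execute_script passing the script's return value back to Python.

```python
import json


def build_safe_click_script(xpath):
    """Return JavaScript that clicks the first XPath match.

    The script returns true after clicking and false when nothing matches,
    so no exception is raised for a missing node.
    """
    return (
        "var node = document.evaluate({xpath}, document, null, "
        "XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;"
        "if (node === null) {{ return false; }}"
        "node.click(); return true;"
    ).format(xpath=json.dumps(xpath))  # json.dumps yields a quoted JS string literal
```

A call such as clicked = driver.execute_script(build_safe_click_script(xpath)) would then let the loop skip or stop on a false result rather than fail with "null is not an object".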
