Selecting Element followed by text with Selenium W

Posted 2019-04-08 00:05

I am using Selenium WebDriver and the Python bindings to automate some monotonous WordPress tasks, and it has been pretty straightforward up until this point. I am trying to select a checkbox, but the only way that I can identify it is by the text following it. Here is the relevant portion of HTML:

<li id="product_cat-52">
    <label class="selectit">
       <input value="52" type="checkbox" name="tax_input[product_cat][]" id="in-product_cat-52"> polishpottery
    </label>
</li>

The only information that I have in my script to identify this checkbox is the string "polishpottery". Is there any way to select that checkbox knowing only the text that follows?

3 answers
我只想做你的唯一
Reply #2 · 2019-04-08 00:46

I would recommend trying to find more ways to select the checkbox. For example, you could select the li tag based on its id using browser.find_element_by_id(id). You could also select based on the name using browser.find_element_by_name(name).

Alternatively, if you really can't, you can select for the text using selenium + BeautifulSoup.

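import re
from bs4 import BeautifulSoup

soup = BeautifulSoup(browser.page_source, "html.parser")
# Find the text node, then the <input> inside the same <label>.
text = soup.find(string=re.compile("polishpottery"))
checkbox = text.parent.find("input")
# BeautifulSoup only parses the HTML, so hand the id back to Selenium to click.
browser.find_element_by_id(checkbox["id"]).click()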
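
If BeautifulSoup is not installed, roughly the same lookup can be sketched with the standard library's html.parser instead. This swaps in a different parser from the one used in this answer; CheckboxIdFinder and find_checkbox_id are hypothetical names made up for illustration, and the sketch assumes the text directly follows the checkbox with no other tag in between, as in the HTML from the question.

```python
from html.parser import HTMLParser


class CheckboxIdFinder(HTMLParser):
    """Record the id of the checkbox <input> whose following text matches."""

    def __init__(self, label_text):
        super().__init__()
        self.label_text = label_text
        self.last_checkbox_id = None  # id of the checkbox seen just before the current text
        self.found_id = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("type") == "checkbox":
            self.last_checkbox_id = attrs.get("id")
        else:
            # Any other opening tag means the text no longer directly follows the checkbox.
            self.last_checkbox_id = None

    def handle_endtag(self, tag):
        # A self-closed <input/> also reports an end tag, so ignore that one.
        if tag != "input":
            self.last_checkbox_id = None

    def handle_data(self, data):
        if (self.found_id is None and self.last_checkbox_id
                and data.strip() == self.label_text):
            self.found_id = self.last_checkbox_id


def find_checkbox_id(html_page, label_text):
    """Return the id of the checkbox followed by label_text, or None if absent."""
    finder = CheckboxIdFinder(label_text)
    finder.feed(html_page)
    finder.close()
    return finder.found_id
```

The returned id can then be passed to Selenium, for example browser.find_element("id", find_checkbox_id(browser.page_source, "polishpottery")).click(), where "id" is the string value of Selenium's By.ID constant.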

Hope this helps!

Fickle 薄情
Reply #3 · 2019-04-08 00:51

As @sherwin-wu already said, you should find a way to select what you want based on id, name, or class (most likely a combination of them). In your example there seem to be enough ways to do so, although I don't know what the rest of the page normally looks like.

That said, it's possible to do what you asked for using an XPath selector like

driver.find_element_by_xpath("//li/label/input[contains(..,'polishpottery')]")
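
A slightly stricter variant, as a sketch: contains() would also match a category named "polishpottery2", so the version below compares the label's whitespace-normalized text exactly. checkbox_xpath and check_by_label are hypothetical helper names, and the sketch assumes the label holds nothing but the checkbox and its text, as in the HTML from the question. It uses the generic driver.find_element(by, value) call, with "xpath" being the string value of Selenium's By.XPATH constant.

```python
def checkbox_xpath(label_text):
    """Build an XPath for the checkbox inside a <label> whose text is label_text."""
    if "'" in label_text:
        # Quoting a single quote inside an XPath literal needs concat(); not handled here.
        raise ValueError("label_text containing a single quote is not handled in this sketch")
    # normalize-space(.) trims and collapses whitespace in the label's text,
    # so " polishpottery\n    " compares equal to "polishpottery".
    return f"//label[normalize-space(.)='{label_text}']/input[@type='checkbox']"


def check_by_label(driver, label_text):
    """Tick the checkbox labelled label_text; leave it alone if it is already ticked."""
    checkbox = driver.find_element("xpath", checkbox_xpath(label_text))
    if not checkbox.is_selected():
        checkbox.click()
    return checkbox
```

Calling check_by_label(driver, "polishpottery") twice leaves the box ticked, because the second call sees is_selected() return True and skips the click.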
叼着烟拽天下
Reply #4 · 2019-04-08 00:53

Regular expressions -- probably not the best solution, but it should work.

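import re

def get_id(text, html_page):  # text in this case would be 'polishpottery'
    # Match a checkbox <input> and capture its id when the given text follows the tag.
    pattern = r'<input[^<>]*?type="checkbox"[^<>]*?id="([A-Za-z0-9_ -]*?)"[^<>]*?> ?' + re.escape(text)
    return re.search(pattern, html_page).group(1)

html = driver.page_source
checkbox_id = get_id('polishpottery', html)
checkbox = driver.find_element_by_id(checkbox_id)
checkbox.click()  # WebElement has no toggle(); click() flips the checkbox

# Or, more minimally:
driver.find_element_by_id(get_id('polishpottery', html)).click()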

Output:

>>> print(html)
<li id="product_cat-52">
    <label class="selectit">
       <input value="52" type="checkbox" name="tax_input[product_cat][]" id="in-product_cat-52"> polishpottery
    </label>
</li>
>>> get_id('polishpottery', html)
'in-product_cat-52'