How should I properly use Selenium

Posted 2019-04-16 07:59

Question:

I'm trying to get one number from Yahoo Finance (http://finance.yahoo.com/quote/AAPL/financials?p=AAPL), Balance Sheet, Total Stockholder Equity. If I inspect the element I get this:

<span data-reactid=".1doxyl2xoso.1.$0.0.0.3.1.$main-0-Quote-Proxy.$main-0-Quote.0.2.0.2:1:$BALANCE_SHEET.0.0.$TOTAL_STOCKHOLDER_EQUITY.1:$0.0.0">119,355,000</span>

I would like to scrape the number: 119,355,000.

If I understand correctly, the web page is rendered with JavaScript, so I need to use Selenium to get to the desired number. My attempts (I'm a complete beginner) are not working no matter what I do. Below are three of many attempts. I tried to use 'data-reactid' and a few other things, and I'm running out of ideas :-)

elem = Browser.find_element_by_partial_link_text('TOTAL_STOCKHOLDER_EQUITY')
elem = browser.find_element_by_id('TOTAL_STOCKHOLDER_EQUITY') 
elem = browser.find_elem_by_id('TOTAL_STOCKHOLDER_EQUITY')

Answer 1:

Actually, all of your locators are invalid. Try using find_element_by_css_selector, as below:

elem = browser.find_element_by_css_selector("span[data-reactid *= 'TOTAL_STOCKHOLDER_EQUITY']")

Note: find_element_by_partial_link_text locates only <a> (link) elements whose visible text partially matches the given string; it does not look at attribute values. find_element_by_id locates an element whose id attribute exactly matches the value passed in.

Edit: The locator above matches more than one element, so you should first find the exact row for Total Stockholder Equity (the tr element) and then find all of its td elements, as below:
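
To make the difference concrete, here is a minimal sketch that uses only Python's standard library (not Selenium) on the span from the question. The names SpanInspector, matches_id and matches_attr_contains are hypothetical helpers written for this illustration; they only imitate what an exact id lookup and the CSS selector [data-reactid *= '...'] test for.

```python
from html.parser import HTMLParser

# Markup copied from the question.
HTML = ('<span data-reactid=".1doxyl2xoso.1.$0.0.0.3.1.$main-0-Quote-Proxy.$main-0-Quote'
        '.0.2.0.2:1:$BALANCE_SHEET.0.0.$TOTAL_STOCKHOLDER_EQUITY.1:$0.0.0">119,355,000</span>')


class SpanInspector(HTMLParser):
    """Records the attributes of every <span> start tag it sees."""

    def __init__(self):
        super().__init__()
        self.spans = []

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.spans.append(dict(attrs))


def matches_id(attrs, value):
    # Imitates an id lookup: the id attribute must exist and equal the value exactly.
    return attrs.get("id") == value


def matches_attr_contains(attrs, name, value):
    # Imitates the CSS selector [name *= 'value']: the attribute contains the substring.
    return value in (attrs.get(name) or "")


inspector = SpanInspector()
inspector.feed(HTML)
span = inspector.spans[0]

by_id = matches_id(span, "TOTAL_STOCKHOLDER_EQUITY")
by_css = matches_attr_contains(span, "data-reactid", "TOTAL_STOCKHOLDER_EQUITY")
print(by_id, by_css)  # False True
```

The span has no id attribute at all, so the id lookup cannot succeed, while the substring test on data-reactid does.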

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

browser = webdriver.Chrome()
browser.get('http://finance.yahoo.com/quote/AAPL/financials?p=AAPL')
browser.maximize_window()

# Wait up to 5 seconds for each condition below
wait = WebDriverWait(browser, 5)

try:
    # First find the Balance Sheet link and click on it
    balanceSheet = wait.until(EC.element_to_be_clickable((By.XPATH, "//span[text() = 'Balance Sheet']")))
    balanceSheet.click()

    # Now find the row element of Total Stockholder Equity
    totalStockRow = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "tr[data-reactid *= 'TOTAL_STOCKHOLDER_EQUITY']")))

    # Now find all the columns in the Total Stockholder Equity row
    # (find_elements(By.TAG_NAME, ...) works in both Selenium 3 and 4)
    totalColumns = totalStockRow.find_elements(By.TAG_NAME, "td")

    # To print a single value, index into totalColumns, e.g. totalColumns[1].text;
    # otherwise print all values in a loop
    for elem in totalColumns:
        print(elem.text)
        # This prints:
        # Total Stockholder Equity
        # 119,355,000
        # 111,547,000
        # 123,549,000
except TimeoutException:
    # wait.until raises TimeoutException when the condition is not met in time
    print('Was not able to find the element with that name.')
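
The values printed above are strings with thousands separators. If you need an actual number, a small helper can convert them. This is only a sketch: parse_amount is a hypothetical helper name, and the sample list simply repeats the output shown above (in the real script it would come from [cell.text for cell in totalColumns]).

```python
def parse_amount(text):
    """Convert a string such as '119,355,000' to an int."""
    return int(text.replace(",", "").strip())


# Sample cell texts copied from the output shown in the answer.
cells = ["Total Stockholder Equity", "119,355,000", "111,547,000", "123,549,000"]

# The first cell is the row label; the remaining cells are the values.
label, raw_values = cells[0], cells[1:]
values = [parse_amount(v) for v in raw_values]

print(label, values[0])
```

Here values[0] corresponds to the first value column, which is the figure the question asks for.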

Hope it helps...:)