This is an example of a page that lists baseball stats for a selected player, defaulting to the most recent year (2014, soon to be 2015): http://www.koreabaseball.com/Record/Player/HitterDetail/Game.aspx?playerId=76325
The drop-down list lets the user select years back to 2010, but selecting one does not change the displayed URL. How can I scrape the stats for all the available years, one for each value in the drop-down list?
I'm currently using Python and BeautifulSoup, but I'm willing to use whatever will get the job done.
<select name="ctl00$ctl00$cphContainer$cphContents$ddlYear"
onchange="javascript:setTimeout('__doPostBack(\'ctl00$ctl00$cphContainer$cphContents$ddlYear\',\'\')', 0)"
id="cphContainer_cphContents_ddlYear"
class="select02 mgt30">
<option value="2014">2014</option>
<option value="2013">2013</option>
<option selected="selected" value="2012">2012</option>
<option value="2011">2011</option>
<option value="2010">2010</option>
</select>
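Do it in two steps:
1. Make a GET request to the page and collect the values of the form's hidden inputs.
2. Make a POST request to the same URL with those values plus the ctl00$ctl00$cphContainer$cphContents$ddlYear parameter, which is responsible for the year.
An implementation for year 2013, using requests and BeautifulSoup, follows these two steps and then prints the contents of all stats tables for 2013.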
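The two steps can be sketched roughly as follows. This is a minimal sketch, not a tested scraper for this site: it assumes the page is a standard ASP.NET Web Forms page that expects its hidden inputs (such as __VIEWSTATE) to be echoed back, and that setting __EVENTTARGET to the drop-down's name mimics the __doPostBack call in the onchange handler. The helper names (available_years, build_payload, table_rows, fetch_year) are made up for illustration.

```python
import requests
from bs4 import BeautifulSoup

URL = "http://www.koreabaseball.com/Record/Player/HitterDetail/Game.aspx?playerId=76325"
YEAR_FIELD = "ctl00$ctl00$cphContainer$cphContents$ddlYear"


def available_years(soup):
    """Return the option values of the year drop-down, in page order."""
    select = soup.find("select", attrs={"name": YEAR_FIELD})
    return [option["value"] for option in select.find_all("option")]


def build_payload(soup, year):
    """Copy the page's hidden inputs and set the requested year."""
    payload = {
        tag["name"]: tag.get("value", "")
        for tag in soup.find_all("input", attrs={"type": "hidden"})
        if tag.get("name")
    }
    # Assumption: these two fields reproduce __doPostBack(YEAR_FIELD, '').
    payload["__EVENTTARGET"] = YEAR_FIELD
    payload["__EVENTARGUMENT"] = ""
    payload[YEAR_FIELD] = year
    return payload


def table_rows(soup):
    """Return every table as a list of rows, each row a list of cell texts."""
    tables = []
    for table in soup.find_all("table"):
        rows = [
            [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
            for row in table.find_all("tr")
        ]
        tables.append(rows)
    return tables


def fetch_year(session, url, year):
    """Run both steps for one year and return the parsed tables."""
    # Step 1: GET the page and parse the form it contains.
    page = BeautifulSoup(session.get(url).text, "html.parser")
    # Step 2: POST the form values back with the year changed.
    response = session.post(url, data=build_payload(page, year))
    return table_rows(BeautifulSoup(response.text, "html.parser"))


# Example use (performs network requests, so it is left commented out):
# with requests.Session() as session:
#     for table in fetch_year(session, URL, "2013"):
#         for row in table:
#             print(row)
```

To cover every year rather than just 2013, one option is to parse the first GET response, pass it to available_years, and call fetch_year once per returned value within the same session.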
An example using Mechanize and Ruby. Modify the form field and submit.