I need to scrape data from a table on a web page. I'd then like to store this data in an array so that I can later store it in a database. I'm very unfamiliar with this functionality, so I'd like to use the simplest method possible.
Which should I use: file_get_contents, file_get_html, or cURL?
- You can use curl() or file_get_contents() to get the contents of the page.
- Then use regular expressions (preg_match() or preg_match_all()) to extract the content you need, as in the sketch after this list.
- Finally, insert the content into the database.
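For example, here is a minimal sketch of the first two steps, assuming the values you want sit in cells like <td class="name">...</td> (the URL and the class name are placeholders, not taken from your actual page):

// Step 1: fetch the raw HTML (file_get_contents is the simplest option if allow_url_fopen is enabled)
$page = file_get_contents('http://www.example.com/league-table/'); // placeholder URL

// Step 2: capture the contents of every <td class="name"> cell (placeholder markup)
preg_match_all('/<td class="name">(.*?)<\/td>/s', $page, $matches);

// $matches[1] holds the captured values, ready to be inserted into the database
$names = $matches[1];
print_r($names);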
You can use the crontab command (on Linux: crontab -e) to make the PHP script run automatically on a schedule, for example with an entry like the one below.
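A hypothetical crontab entry, assuming the script is saved as /path/to/scrape.php (the path is a placeholder), that runs it once an hour:

# run the scraper at minute 0 of every hour
0 * * * * /usr/bin/php /path/to/scrape.php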
My English is poor, so I hope others will add their opinions. Thanks!
I prefer PHP Simple HTML DOM Parser:
http://simplehtmldom.sourceforge.net/
You can then loop through elements using its selector syntax. For example, to get the names of all the teams on the link you sent over, save them to an array, and then run a MySQL insert, you'd do something like this:
$html = file_get_html('http://www.tablesleague.com/england/');
$name_array = array();

// Get all team names
foreach ($html->find('div.cell.name.no_border') as $element) {
    // Push each name onto the array
    array_push($name_array, $element->innertext);
}
Then prepare a MySQL statement (a prepared statement handles quoting for you and avoids SQL injection):

$stmt = $mysqli->prepare("INSERT INTO table_name (name) VALUES (?)");
foreach ($name_array as $name) {
    $stmt->bind_param('s', $name);
    $stmt->execute();
}
$stmt->close();
You could always build a multidimensional array with all the elements you'd like, pull them out as you loop through it, and insert multiple columns with each query; a sketch of that idea follows.
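A minimal sketch of that approach, assuming the page also has a points cell selectable with 'div.cell.points' (that selector and the points column are assumptions, not taken from the actual page):

$html = file_get_html('http://www.tablesleague.com/england/');

$rows = array();
foreach ($html->find('div.cell.name.no_border') as $i => $element) {
    // Collect name and (assumed) points for each row into a multidimensional array
    $rows[] = array(
        'name'   => $element->innertext,
        'points' => $html->find('div.cell.points', $i)->innertext, // selector is an assumption
    );
}

// Insert both columns with one prepared statement, executed once per row
$stmt = $mysqli->prepare("INSERT INTO table_name (name, points) VALUES (?, ?)");
foreach ($rows as $row) {
    $stmt->bind_param('ss', $row['name'], $row['points']);
    $stmt->execute();
}
$stmt->close();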