Parsing HTML in CakePHP

Posted 2019-08-08 20:18

I have started building a web crawler in CakePHP 2.2. The pages the script crawls are HTML pages, and I need to parse them to extract the values I want.

I have tried a few different solutions and looked at some open-source projects as well, but I'm not sure what the best way to do this is.

I need your help figuring out which method I should use.

1 answer
Emotional °昔
#2 · 2019-08-08 20:34

DOMDocument is your best choice. There are some decent examples in the php.net documentation for this extension. If you can use another language such as Ruby, I have had very good experiences with Hpricot, a jQuery-like library for parsing HTML.
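To give a rough idea of what the DOMDocument approach can look like, here is a minimal sketch that pulls every link out of a page. It assumes PHP 7 or later; the function name `extractLinks` and the sample markup are made up for illustration and are not part of CakePHP or of the original question. In a crawler, the HTML string would come from whatever HTTP client you already use.

```php
<?php
// Hypothetical helper (not a CakePHP API): returns the text and href of each link.
function extractLinks(string $html): array
{
    $doc = new DOMDocument();

    // Crawled pages are rarely valid HTML, so buffer libxml's parser
    // warnings instead of letting them be emitted as PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // XPath keeps the selection logic short: every <a> that has an href.
    $xpath = new DOMXPath($doc);
    $links = [];
    foreach ($xpath->query('//a[@href]') as $anchor) {
        $links[] = [
            'text' => trim($anchor->textContent),
            'href' => $anchor->getAttribute('href'),
        ];
    }

    return $links;
}

// Sample markup, invented for this example; note the unclosed <p>.
$html = '<html><body><p>Intro'
      . '<a href="/one">First</a>'
      . '<a href="/two"> Second </a>'
      . '<a>No href</a>'
      . '</body></html>';

print_r(extractLinks($html));
```

The same pattern should work for other values: change the XPath expression to match the elements you need, for example `//table//td` or `//div[@class="price"]`.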

This question is related to "Robust and Mature HTML Parser for PHP".
