Question:

Is there a PHP class/library that would allow me to query an XHTML document with CSS selectors? I need to scrape some pages for data that is very easily accessible if I could somehow use CSS selectors (jQuery has spoiled me!). Any ideas?

Answer 1:

After Googling further (initial results weren't very helpful), it seems there is actually a Zend Framework library for this, along with some others:

DOM-Query
phpQuery
pQuery
QueryPath
Simple HTML DOM Parser
Ultimate Web Scraper Toolkit
Zend-Dom

Answer 2:

XPath is a fairly standard way to access XML (and XHTML) nodes, and provides much more precision than CSS.
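As a minimal sketch of the XPath approach, PHP's built-in DOMXPath class can run a query roughly equivalent to the CSS selector `div.item a`. The sample markup and the `item` class name are assumptions made up for illustration.

```php
<?php
// Sample markup standing in for a scraped page (hypothetical, for illustration).
$html = '<html><body>'
      . '<div class="item featured"><a href="/a">First</a></div>'
      . '<div class="item"><a href="/b">Second</a></div>'
      . '<div class="other"><a href="/c">Third</a></div>'
      . '</body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Rough XPath equivalent of the CSS selector "div.item a".
// The concat/normalize-space idiom matches "item" as a whole class token.
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " item ")]//a';

$links = [];
foreach ($xpath->query($query) as $a) {
    $links[$a->getAttribute('href')] = $a->textContent;
}

print_r($links);
```

The class test is more verbose than `.item` in CSS, which is the usual trade-off: XPath is more expressive, but common selectors take more typing.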

Answer 3:

Another one:
http://querypath.org/

Answer 4:

A great one is a component of Symfony 2: CssSelector\Parser (Introduction). It converts CSS selectors into XPath expressions. Take a look =)

Source code
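To illustrate what converting a CSS selector into an XPath expression means, here is a toy sketch. It is not Symfony's implementation: `cssToXPath` is a hypothetical helper that handles only tag names, `#id`, `.class`, and the descendant combinator, while a real converter covers far more of the selector grammar.

```php
<?php
// Hypothetical helper, NOT Symfony's code: converts a tiny CSS subset
// (tag, #id, .class, descendant combinator) into an XPath expression.
function cssToXPath(string $css): string
{
    $xpath = '';
    foreach (preg_split('/\s+/', trim($css)) as $part) {
        // Split e.g. "div#main.item" into an optional tag and its #id/.class tokens.
        if (!preg_match('/^([a-zA-Z][\w-]*|\*)?((?:[#.][\w-]+)*)$/', $part, $m)) {
            throw new InvalidArgumentException("Unsupported selector: $part");
        }
        // "//" makes each compound selector a descendant of the previous one.
        $step = '//' . ($m[1] !== '' ? $m[1] : '*');
        preg_match_all('/([#.])([\w-]+)/', $m[2], $tokens, PREG_SET_ORDER);
        foreach ($tokens as [, $kind, $name]) {
            $step .= $kind === '#'
                ? "[@id='$name']"
                : "[contains(concat(' ', normalize-space(@class), ' '), ' $name ')]";
        }
        $xpath .= $step;
    }
    return $xpath;
}

// Usage: run the generated expression with PHP's built-in DOMXPath.
$doc = new DOMDocument();
$doc->loadHTML('<div id="main"><p class="note big">Hello</p><p>Other</p></div>');
$xpath = new DOMXPath($doc);

$texts = [];
foreach ($xpath->query(cssToXPath('#main p.note')) as $node) {
    $texts[] = $node->textContent;
}

print_r($texts);
```

Because the output is plain XPath, it works with the DOM extension that ships with PHP, which is the appeal of this approach over a full jQuery-style port.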

Answer 5:

For jQuery users, the most interesting option may be phpQuery, a port of jQuery to PHP. Almost all sections of the library are ported. It also includes a WebBrowser plugin, which can be used for scraping a whole site's paths and processes (e.g. accessing data that is only available after logging in). It simply simulates a web browser on the server (events and cookies too). The latest versions have experimental support for XML namespaces and the CSS3 "|" selector.

Answer 6:

I ended up using PHP Query Lite. It's very simple and has all I need.

Answer 7:

For document parsing I use DOM. This can quite easily solve your problem if you know the tag name (in this example "div"):

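 $doc = new DOMDocument();
 libxml_use_internal_errors(true); // real-world HTML is rarely valid; keep warnings quiet
 $doc->loadHTML($html);            // $html holds the page markup as a string

 $elements = $doc->getElementsByTagName("div");
 foreach ($elements as $e) {
  // class may hold several space-separated names, so compare token by token
  $classes = preg_split('/\s+/', $e->getAttribute("class"), -1, PREG_SPLIT_NO_EMPTY);
  if (!in_array("someclass", $classes, true)) continue;

  // $e is a div.someclass
 }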

Not sure if DOM lets you get all elements of a document at once... you might have to do a tree traversal.
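On the question of fetching every element at once: `getElementsByTagName` accepts the special value `"*"`, which matches all tags, so a manual tree traversal should not be necessary. A minimal sketch, with made-up sample markup:

```php
<?php
// Sample markup (hypothetical, for illustration).
$html = '<html><body>'
      . '<div class="someclass box"><p class="someclass">Hi</p></div>'
      . '<span>x</span>'
      . '</body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$matches = [];
// "*" matches every element in the document, in document order.
foreach ($doc->getElementsByTagName('*') as $e) {
    // Split the class attribute into tokens so multi-class elements match too.
    $classes = preg_split('/\s+/', $e->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);
    if (in_array('someclass', $classes, true)) {
        $matches[] = $e->tagName;
    }
}

print_r($matches);
```

This is the rough equivalent of the CSS selector `.someclass` without naming a tag.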