I have an HTML file with three tables, but I want to extract only one of them. How do I do this?
Answer 1:
A good module for extracting parts of an HTML document is HTML::Query.
It provides a jQuery-like interface for selecting what part of a document to extract.
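As a minimal sketch of that approach, the snippet below picks the second of three tables with a CSS-style selector. The sample HTML, the table ids, and the choice of the second table are assumptions made for illustration; with a real file you would pass `file => $path` to `Query` instead of `text => $html`.

```perl
use strict;
use warnings;
use HTML::Query 'Query';

# Hypothetical sample document with three tables (not from the question).
my $html = <<'HTML';
<html><body>
<table id="a"><tr><td>a1</td></tr></table>
<table id="b"><tr><td>b1</td><td>b2</td></tr></table>
<table id="c"><tr><td>c1</td></tr></table>
</body></html>
HTML

# Build a query object from the HTML text.
my $q = Query(text => $html);

# Select every <table>; get_elements returns HTML::Element objects
# in document order.
my @tables = $q->query('table')->get_elements;

# Perl arrays are 0-based, so index 1 is the second table.
my $second = $tables[1];

print $second->as_HTML, "\n";
```

If the table you want has an id or class, a narrower selector in `query` would avoid relying on its position in the document.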
Answer 2:
You can do this using well-known Perl modules such as:
LWP
WWW::Mechanize
HTML::TreeBuilder
HTML::TreeBuilder::XPath
All are on http://search.cpan.org
The last module is particularly useful: it lets you use XPath expressions such as
//table[1]/tr[3]/td[2]/text()
which, for example, selects the text of the second td element in the third tr of the first table. Note that XPath positions start at 1, not 0, so the first table is table[1].
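To make this concrete, here is a small sketch using HTML::TreeBuilder::XPath that pulls the rows of one table out of a document containing three. The sample HTML and the choice of the second table are assumptions for illustration; for a file on disk, `new_from_file($path)` could be used in place of `new_from_content`.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical sample document with three tables (not from the question).
my $html = <<'HTML';
<html><body>
<table id="a"><tr><td>a1</td></tr></table>
<table id="b">
  <tr><td>r1c1</td><td>r1c2</td></tr>
  <tr><td>r2c1</td><td>r2c2</td></tr>
</table>
<table id="c"><tr><td>c1</td></tr></table>
</body></html>
HTML

my $tree = HTML::TreeBuilder::XPath->new_from_content($html);

# findnodes returns the matching elements in document order.
my @tables = $tree->findnodes('//table');

# Perl arrays are 0-based (unlike XPath positions), so index 1
# is the second table.
my $table = $tables[1];

# Collect the cell text of each row of that table only.
my @rows;
for my $tr ($table->findnodes('.//tr')) {
    push @rows, [ map { $_->as_trimmed_text } $tr->findnodes('./td') ];
}

# Print one tab-separated line per row.
print join("\t", @$_), "\n" for @rows;

# Release the parse tree explicitly.
$tree->delete;
```

Selecting the table first and then running relative expressions (those starting with `.`) against it keeps cells from the other two tables out of the result.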