How to parse HTML and get CSS styles

2019-05-11 16:11发布

I need to parse HTML and find corresponding CSS styles. I can parse HTML and CSS separataly, but I can't combine them. For example, I have an XHTML page like this:

<html>
<head>
<title></title>
</head>
<body>
<div class="abc">Hello World</div>
</body>
</html>

I have to search for "hello world" and find its class name, and after that I need to find its style from an external CSS file. Answers using Java, JavaScript, and PHP are all okay.

4条回答
兄弟一词,经得起流年.
2楼-- · 2019-05-11 16:47

Using Java java.util.regex

String s = "<body>...<div class=\"abc\">Hello World</div></body>";
    Pattern p = Pattern.compile("<div.+?class\\s*?=\\s*['\"]?([^ '\"]+).*?>Hello World</div>", Pattern.CASE_INSENSITIVE);    Matcher m = p.matcher(s);
if (m.find()) {
    System.out.println(m.group(1));
}

prints abc

查看更多
Fickle 薄情
3楼-- · 2019-05-11 17:03

As i understood you have chance to parse style sheet from external file and this makes your task easy to solve. First try to parse html file with jsoup which supports jquery like selector syntax that helps you parse complicated html files easier. then check this previous solution to parse css file. Im not going to full solution as i state with these libraries all task done internally and the only thing you should do is writing glue code to combine these two.

查看更多
贪生不怕死
4楼-- · 2019-05-11 17:04

Similiar question Can jQuery get all CSS styles associated with an element?. Maybe css optimizers can do what you want, take a look at unused-css.com its online tool but also lists other tools.

查看更多
闹够了就滚
5楼-- · 2019-05-11 17:08

Use jsoup library in java which is a HTML Parser. You can see for example here
For example you can do something like this:

String html="<<your html content>>";
Document doc = Jsoup.parse(html);
Element ele=doc.getElementsContainingOwnText("Hello World").first.clone(); //get tag containing Hello world
HashSet<String>class=ele.classNames(); //gives you the classnames of element containing Hello world

You can explore the library further to fit your needs.

查看更多
登录 后发表回答