Parsing Java String with SGML

Posted 2019-08-23 09:30

I have a Java String with SGML, something like this...

<misspell></misspell><plain>I</plain> <plain>know</plain> <plain>you</plain> <suggestion>ducky</suggestion> <plain>suck</plain> <plain>and</plain> <plain>I</plain> <plain>rocky</plain> <plain>rock</plain>

How do I parse it to get, for instance, the text inside <suggestion></suggestion>, so as to get "ducky" out?

Would javax.swing.text.html.parser.Parser be of any help, or can it only parse HTML documents?

2 Answers
Anthone
#2 · 2019-08-23 10:13

The string you show is not HTML, but it could be parsed by an XML parser.

The SAX API is part of the JDK and AFAIK most XML parsers implement it.
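A minimal sketch of the SAX approach, using only classes that ship with the JDK. One assumption worth stating: the fragment in the question has several top-level elements, and an XML parser requires a single root, so the sketch wraps the string in a made-up <root> element before parsing. The class name SuggestionExtractor and the helper textOf are hypothetical names chosen for illustration, and the sketch does not handle a tag nested inside another tag of the same name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SuggestionExtractor {

    // Hypothetical helper: collects the text content of every <tagName> element.
    public static List<String> textOf(String fragment, String tagName) throws Exception {
        List<String> results = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            // Non-null only while the parser is inside a matching element.
            private StringBuilder current;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (qName.equals(tagName)) {
                    current = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver text in several chunks, so append rather than assign.
                if (current != null) {
                    current.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (current != null && qName.equals(tagName)) {
                    results.add(current.toString());
                    current = null;
                }
            }
        };

        // Assumption: the fragment is well-formed apart from lacking a single root,
        // so an artificial <root> wrapper is enough to make it parseable as XML.
        String wrapped = "<root>" + fragment + "</root>";
        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(wrapped)), handler);
        return results;
    }

    public static void main(String[] args) throws Exception {
        String sgml = "<misspell></misspell><plain>I</plain> <plain>know</plain> <plain>you</plain> "
                + "<suggestion>ducky</suggestion> <plain>suck</plain> <plain>and</plain> <plain>I</plain> "
                + "<plain>rocky</plain> <plain>rock</plain>";
        System.out.println(textOf(sgml, "suggestion")); // prints [ducky]
    }
}
```

If the real input contains unclosed tags or other SGML shortcuts, the XML parser will throw a SAXException, and a more forgiving parser would be needed instead.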

萌系小妹纸
#3 · 2019-08-23 10:20

Try an HTML parser. They are (by necessity) quite forgiving of malformed markup, and HTML is itself based on SGML.

e.g. http://htmlparser.sourceforge.net/
