How to parse (non-well-formed) HTML on Android?

Posted 2019-08-12 16:10

How can I parse non-well-formed HTML on Android?

I tried to use XOM and TagSoup, but I get the following error when creating the Builder:

11-26 20:42:39.294: ERROR/dalvikvm(1298): Could not find method org.apache.xerces.impl.Version.getVersion, referenced from method nu.xom.Builder.

Must I install Xerces to use XOM, or can I use TagSoup without XOM?

2 answers
The star\"
#2 · 2019-08-12 16:18

You might find JTidy (http://jtidy.sourceforge.net/), a Java port of HTML Tidy, to be sufficiently lightweight. It can output XHTML on request.
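Once the markup has been cleaned into well-formed XHTML, a plain SAX handler is enough to read it. Here is a minimal sketch of that second step, using only the standard `javax.xml.parsers` and `org.xml.sax` APIs. The JTidy call itself is not shown, and the class name `XhtmlLinkExtractor`, the method `extractLinks`, and the sample markup are hypothetical names chosen for illustration. The input is assumed to be well-formed and to carry no DOCTYPE that would need to be fetched.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: pulls link targets out of already-cleaned XHTML.
class XhtmlLinkExtractor {

    // Returns the href value of every <a> element, in document order.
    // Throws IllegalArgumentException if the input cannot be parsed.
    static List<String> extractLinks(String xhtml) {
        final List<String> links = new ArrayList<>();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attrs) {
                    // Depending on namespace settings, only one of the two
                    // names may be populated, so fall back from one to the other.
                    String name = (localName == null || localName.isEmpty()) ? qName : localName;
                    if ("a".equalsIgnoreCase(name)) {
                        String href = attrs.getValue("href");
                        if (href != null) {
                            links.add(href);
                        }
                    }
                }
            });
            reader.parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not parseable XHTML", e);
        }
        return links;
    }

    public static void main(String[] args) {
        // Sample input standing in for the tidied XHTML.
        String xhtml = "<html><body><p>See <a href=\"http://jtidy.sourceforge.net/\">JTidy</a></p></body></html>";
        System.out.println(extractLinks(xhtml));
    }
}
```

The handler depends only on the SAX interfaces, so in principle the same `DefaultHandler` could be attached to any other `XMLReader` implementation. TagSoup's `org.ccil.cowan.tagsoup.Parser` is one such implementation, which is how TagSoup can be used without XOM.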

神经病院院长
#3 · 2019-08-12 16:32

XOM may require Xerces to be on the classpath; this may depend on the version of Java. We currently use:

xercesImpl-2.8.0.jar
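
To see whether Xerces is actually visible at runtime, one option is to probe for the class named in the error message before constructing the Builder. This is only a rough diagnostic sketch: the class name `XercesCheck` and the method `isClassAvailable` are hypothetical, and finding the class does not guarantee that the specific method XOM looks up is also present.

```java
// Hypothetical diagnostic: reports whether a class can be found at runtime.
class XercesCheck {

    // Returns true if the named class can be located by this class's
    // class loader, without running its static initializers.
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, XercesCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the error message in the question.
        String xercesVersion = "org.apache.xerces.impl.Version";
        System.out.println("Xerces present: " + isClassAvailable(xercesVersion));
    }
}
```

If the check reports that the class is missing, adding a Xerces jar such as the one named above to the application's classpath is the obvious next thing to try.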