JDK 1.6 and Xerces?

Posted 2019-01-22 16:17

Question:

In my current project, we target a JDK 1.6 runtime environment. For legacy reasons, Xerces JAR files are bundled in the application.

These are no longer needed, right? Hasn't the JDK shipped with its own XML parsing libraries for a while now?

Answer 1:

Bundling an XML parser has not been necessary since Java 1.4, when JAXP was added to the JRE. You should use JAXP rather than calling Xerces directly. Internally, the JRE bundles and uses its own copy of Xerces anyway (repackaged with a "com.sun" prefix, as com.sun.org.apache.xerces.internal).
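As a minimal sketch of what "use JAXP and not Xerces directly" can look like, the snippet below parses a small document through the javax.xml.parsers API only. The class name JaxpParseSketch, the helper rootName, and the sample XML are made up for illustration; the code sticks to Java 6 language features.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical example class: nothing here imports org.apache.xerces.*
class JaxpParseSketch {

    /** Parses the given XML text via JAXP and returns the root element name. */
    static String rootName(String xml) {
        try {
            // JAXP decides which parser implementation backs this factory.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            // Checked parser/IO exceptions are wrapped to keep the sketch short.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootName("<order id=\"42\"/>")); // prints "order"
    }
}
```

Because the code depends only on JAXP interfaces, it should behave the same whether the bundled Xerces JARs stay on the classpath or are removed.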



Answer 2:

These XML services are plugged into the application environment through the so-called "service provider" mechanism.

It works as follows:

  1. It first looks for a system property that names the exact factory class to use, e.g. -Djavax.xml.parsers.SAXParserFactory=<some class>.
  2. If no such system property is set, FactoryFinder looks for the property in a special properties file, for example ${java.home}/lib/jaxp.properties.
  3. If the property is not in that file either, FactoryFinder looks on the class path for a service description under META-INF/services/<some service>, e.g. META-INF/services/javax.xml.parsers.SAXParserFactory. That file should contain the factory class name, for example org.apache.xerces.jaxp.SAXParserFactoryImpl.
  4. If no such file exists on the class path, Java uses its default factory implementation.

So if no system property points explicitly at a factory class, Java will quietly choose a suitable implementation.
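To see which implementation this lookup actually selected in a given environment, a small diagnostic along these lines may help. It is only a sketch: the class name JaxpProviderCheck and its helper methods are invented for illustration, and it reports the inputs of steps 1 and 3 only (it does not read jaxp.properties).

```java
import java.net.URL;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical diagnostic class for inspecting the JAXP provider lookup.
class JaxpProviderCheck {

    /** Class name of the SAXParserFactory that JAXP resolved. */
    static String saxFactoryClass() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    /** Class name of the DocumentBuilderFactory that JAXP resolved. */
    static String domFactoryClass() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    /** Reports the system property (step 1) and service file (step 3) for a service. */
    static String describeLookup(String serviceName) {
        String sysProp = System.getProperty(serviceName);
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        // null here means no provider-configuration file is visible to this loader.
        URL serviceFile = loader.getResource("META-INF/services/" + serviceName);
        return "system property=" + sysProp + ", service file=" + serviceFile;
    }

    public static void main(String[] args) {
        System.out.println("SAX factory: " + saxFactoryClass());
        System.out.println("DOM factory: " + domFactoryClass());
        System.out.println(describeLookup("javax.xml.parsers.SAXParserFactory"));
    }
}
```

A factory class name starting with org.apache.xerces would suggest the bundled Apache JARs won the lookup, while a name under com.sun.org.apache.xerces.internal would point to the JDK's built-in copy.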



Answer 3:

The parser in the JDK was a fork of Xerces, but it is very buggy. I would recommend that production applications always use the Apache version of the parser in preference. The bugs are rare, but they are unpredictable, and they don't only affect corner cases that aren't seen in real life; I've seen many cases where quite boring XML documents are being parsed and corrupt data is passed to the application for attribute values. Sun/Oracle have shown no interest in fixing the problem. Use Apache Xerces every time.

UPDATE (2018)

The problems with the JDK version of Xerces seem to have been resolved in Java 8, as far as I can see, so this advice is out of date.



Answer 4:

The Endorsed Standards Override Mechanism works just fine. Setting -Djava.endorsed.dirs=path_to_folder_containing_new_library_jars will resolve the issue with JDK 1.6.

I have verified the above solution in the context of Thymeleaf. In some cases, if you go for LEGACYHTML5 mode and use the NekoHTML parser to auto-correct unclosed HTML tags, NekoHTML has a dependency on the Xerces JARs. Setting the classpath alone does not solve the problem.
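One way to check whether the endorsed override took effect is to inspect the running JVM, roughly as sketched below. The class EndorsedCheck and its method names are hypothetical; the factory class name is the Apache one mentioned in the lookup steps above.

```java
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper for checking an endorsed-dirs setup on JDK 1.6.
class EndorsedCheck {

    // Apache Xerces factory class named in the service-provider description.
    static final String APACHE_SAX_FACTORY = "org.apache.xerces.jaxp.SAXParserFactoryImpl";

    /** Value of java.endorsed.dirs as seen by the running JVM (may be null). */
    static String endorsedDirs() {
        return System.getProperty("java.endorsed.dirs");
    }

    /** True if the Apache Xerces factory class can be loaded at all. */
    static boolean apacheXercesVisible() {
        try {
            // initialize=false: only test visibility, do not run static initializers.
            Class.forName(APACHE_SAX_FACTORY, false, EndorsedCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    /** True if JAXP actually resolved its SAX factory to Apache Xerces. */
    static boolean usingApacheXerces() {
        String name = SAXParserFactory.newInstance().getClass().getName();
        return name.startsWith("org.apache.xerces.");
    }

    public static void main(String[] args) {
        System.out.println("java.endorsed.dirs = " + endorsedDirs());
        System.out.println("Apache Xerces visible: " + apacheXercesVisible());
        System.out.println("JAXP uses Apache Xerces: " + usingApacheXerces());
    }
}
```

Running it once with and once without the -Djava.endorsed.dirs option should show whether the setting changes which parser JAXP hands out.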

Thanks s-n-ushakov.



Tags: java xml legacy