I'm writing a library which takes XML files and parses them. To prevent users from feeding invalid XML into my application, I'm using Xerces to validate the XML files against an XSD.
However, I have only managed to validate against XSD files on disk. In theory, a user could just open such a file and tamper with it. That's why I would like my XSD to be hardcoded in my library.
Unfortunately, I haven't found a way to do this with Xerces-C++ yet.
This is how it works right now:
bool XmlParser::validateXml(std::string a_XsdFilename)
{
    xercesc::XercesDOMParser domParser;
    if (domParser.loadGrammar(a_XsdFilename.c_str(), xercesc::Grammar::SchemaGrammarType) == NULL)
    {
        throw Exceptions::Parser::XmlSchemaNotReadableException();
    }
    XercesParserErrorHandler parserErrorHandler;
    domParser.setErrorHandler(&parserErrorHandler);
    domParser.setValidationScheme(xercesc::XercesDOMParser::Val_Always);
    domParser.setDoNamespaces(true);
    domParser.setDoSchema(true);
    domParser.setValidationSchemaFullChecking(true);
    domParser.parse(m_Filename.c_str());
    return (domParser.getErrorCount() == 0);
}
std::string m_Filename is a member variable holding the path of the XML file I validate.
std::string a_XsdFilename is the path to the XSD I validate against.
XercesParserErrorHandler inherits from xercesc::ErrorHandler and does the error handling.
How can I replace std::string a_XsdFilename with something like std::string a_XsdText, where a_XsdText contains the schema definition itself instead of a path to a file containing the schema definition?
I'll describe three ways for your program to supply the XSD itself:
- by loading the XSD from a file path (this is what your example program does right now)
- by loading the XSD from a string (this is what you ask for)
- by loading the XSD from a precompiled binary
Loading the XSD from a file path
Boris Kolpackov suggests in a blog post that applications should provide the XSD schema files by themselves rather than looking up the schema files through the xsi:schemaLocation or xsi:noNamespaceSchemaLocation attributes found in the XML file.
In the blog post there is a link to load-grammar-dom, an example program (put in the public domain) that makes use of the xercesc::DOMLSParser::loadGrammar function:
user@linux:~$ load-grammar-dom
usage: load-grammar-dom [test.xsd ... ] [test.xml ...]
user@linux:~$
Loading the XSD from a string
If you would like to pass the XSD file contents as a string, you would need to use another overload of xercesc::DOMLSParser::loadGrammar, where you pass const DOMLSInput *source instead of const char *const systemId.
The DOMLSInput could be created with the help of xercesc::MemBufInputSource and xercesc::Wrapper4InputSource like this:
xercesc::Wrapper4InputSource source(
    new xercesc::MemBufInputSource(
        (const XMLByte *) (a_XsdText.c_str()),
        a_XsdText.size(),
        "A name"));
(Adapted somewhat from https://stackoverflow.com/a/15829424/757777 but untested)
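To get the schema text into the library in the first place, one option is to compile it in as a raw string literal. The following is only a sketch using the standard library: the schema content and the names kEmbeddedXsd and embeddedXsdText are made up for illustration, and no Xerces call is made here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical schema, compiled into the library as a raw string literal.
// The text starts right after "R"xsd(" so there is no leading newline.
static const char kEmbeddedXsd[] = R"xsd(<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note" type="xs:string"/>
</xs:schema>
)xsd";

// Returns the embedded schema as a std::string (hypothetical helper name).
// sizeof counts the terminating '\0', so one byte is subtracted.
inline std::string embeddedXsdText()
{
    std::string text(kEmbeddedXsd, sizeof(kEmbeddedXsd) - 1);
    // Sanity check that the literal really holds a schema document.
    assert(text.find("<xs:schema") != std::string::npos);
    return text;
}
```

The returned string could then play the role of a_XsdText, whose c_str() and size() are what get handed to xercesc::MemBufInputSource. As far as I know, xercesc::XercesDOMParser::loadGrammar also has an overload taking a const xercesc::InputSource &, so with the parser from the question the MemBufInputSource could probably be passed directly, without the Wrapper4InputSource. I have not tested that either.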
Loading the XSD from a precompiled binary
The embedded example included with CodeSynthesis XSD (and put in the public domain) demonstrates how to use xercesc::BinInputStream and xercesc::XMLGrammarPool::deserializeGrammars to load a precompiled XSD schema.
See also its README.
The example contains the program xsdbin, which compiles XSD schema files into a binary file.
user@linux:~$ xsdbin --help
Usage: xsdbin [options] <files>
Options:
--help Print usage information and exit.
--verbose Print progress information.
--output-dir <dir> Write generated files to <dir>.
--hxx-suffix <sfx> Header file suffix instead of '-schema.hxx'.
--cxx-suffix <sfx> Source file suffix instead of '-schema.cxx'.
--array-name <name> Binary data array name.
--disable-multi-import Disable multiple import support.
user@linux:~$
In the makefile the XSD schema file is precompiled by xsdbin and the result ends up inside the example executable.
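To illustrate the general idea of reading such embedded binary data, here is a minimal sketch of a read-only stream over an in-memory byte array, using only the standard library. Everything in it is hypothetical: kSchemaBlob is placeholder data (not a real serialized grammar), and MemoryByteStream is not the class from the embedded example. With Xerces-C++, the equivalent class would presumably derive from xercesc::BinInputStream, whose readBytes has a similar shape.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Placeholder for the array a tool such as xsdbin would generate (hypothetical).
// Real data would be the serialized grammar, not these five bytes.
static const unsigned char kSchemaBlob[] = {0x01, 0x02, 0x03, 0x04, 0x05};

// Minimal read-only stream over a memory buffer (hypothetical class).
class MemoryByteStream
{
public:
    MemoryByteStream(const unsigned char* data, std::size_t size)
        : data_(data), size_(size), pos_(0) {}

    // Number of bytes consumed so far.
    std::size_t curPos() const { return pos_; }

    // Copies up to maxToRead bytes into toFill and returns the number copied.
    // Returns 0 once the buffer is exhausted.
    std::size_t readBytes(unsigned char* toFill, std::size_t maxToRead)
    {
        assert(toFill != nullptr);
        const std::size_t n = std::min(maxToRead, size_ - pos_);
        std::memcpy(toFill, data_ + pos_, n);
        pos_ += n;
        return n;
    }

private:
    const unsigned char* data_;  // not owned; must outlive the stream
    std::size_t size_;
    std::size_t pos_;
};
```

A consumer would call readBytes repeatedly until it returns 0. Because the array is linked into the executable, there is no schema file on disk for a user to edit.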