I want to create an XML file that stores the structure of a Java program. I can successfully parse the Java program and create the tags as required. The problem arises when I try to include the source code inside my tags: Java source code can contain many characters that are reserved in XML, such as &, <, and >, so I am not able to create valid XML.
My XML should go like this:
<?xml version="1.0"?>
<prg name="prg_name">
<class name="class_name">
<parent>parent class</parent>
<interface>Interface name</interface>
.
.
.
<method name= "method_name">
<statement>the ordinary java statement</statement>
<if condition="Conditional Expression">
<statement> true statements </statement>
</if>
<else>
<statement> false statements </statement>
</else>
<statement> usual control statements </statement>
.
.
.
</method>
</class>
.
.
.
</prg>
Like this, but the problem is that the conditional expressions of if and other statements contain many & characters and other reserved symbols, which prevents the XML from validating. Since all this data (source code) is supplied by the user, I have little control over it. Escaping the characters will be very costly in terms of time.
I can use CDATA to escape the element text, but it cannot be used for attribute values containing conditional expressions. I am using the ANTLR Java grammar to parse the Java program and obtain the attributes and content for the tags. Is there any other workaround?
You will have to escape the following characters for XML: " as &quot;, ' as &apos;, < as &lt;, > as &gt;, and & as &amp;.
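As a rough illustration of how cheap this escaping can be, here is a minimal Java sketch that replaces the five predefined XML entity characters in a single pass over the input. The class name XmlText and the method name escape are hypothetical, chosen only for this example. The sketch does not filter control characters that XML 1.0 forbids.

```java
// Hypothetical helper, not part of any library: escapes text for XML
// in one linear pass over the input string.
final class XmlText {
    private XmlText() {}

    /** Replaces &, <, >, " and ' with their predefined XML entities. */
    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A Java condition as it might come out of the parser.
        System.out.println(escape("if (a < b && s.equals(\"x\"))"));
    }
}
```

Each character is visited once, so the cost grows linearly with the length of the source text, which is usually small next to the cost of parsing the Java program itself.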
In XML attributes you must escape " with &quot;, < with &lt;, and & with &amp; if you wrap attribute values in double quotes ("), e.g.
<MyTag attr="If a&lt;b &amp; b&lt;c then a&lt;c, it's obvious"/>
meaning tag MyTag with attribute attr with text If a<b & b<c then a<c, it's obvious. Note that there is no need to use &apos; to escape the ' character.
If you wrap attribute values in single quotes ('), then you should escape ' with &apos;, < with &lt;, and & with &amp;, and you can write " as is. Escaping > with &gt; in attribute text is not required, e.g. <a b=">"/> is well-formed XML.
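The attribute rules above could be sketched in Java roughly as follows. The class XmlAttr and its methods escape and attribute are hypothetical names introduced for this illustration. The sketch escapes only &, <, and whichever quote character delimits the value, and leaves > untouched. It also assumes the value contains no tabs or newlines that must survive, since parsers normalize those to spaces in attribute values.

```java
// Hypothetical helper, not part of any library: escapes attribute values
// according to the quote character that will delimit them.
final class XmlAttr {
    private XmlAttr() {}

    /** Escapes value for an attribute delimited by quote ('"' or '\''). */
    static String escape(String value, char quote) {
        if (quote != '"' && quote != '\'') {
            throw new IllegalArgumentException("quote must be \" or '");
        }
        StringBuilder out = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '&') {
                out.append("&amp;");
            } else if (c == '<') {
                out.append("&lt;");
            } else if (c == quote) {
                // Only the delimiting quote needs an entity.
                out.append(quote == '"' ? "&quot;" : "&apos;");
            } else {
                out.append(c); // '>' and the other quote pass through as is
            }
        }
        return out.toString();
    }

    /** Builds name="value" using double quotes as the delimiter. */
    static String attribute(String name, String value) {
        return name + "=\"" + escape(value, '"') + "\"";
    }

    public static void main(String[] args) {
        System.out.println("<if " + attribute("condition", "a<b && s.equals(\"x\")") + ">");
    }
}
```

With a helper like this, the condition text from the parser can go straight into the condition attribute of the if tag without CDATA.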