What is the best way to convert JSON to XML and back? For example, the JSON below

{
    "user": "gerry",
    "likes": [1, 2, 4],
    "followers": [
        {
            "name": "megan"
        },
        {
            "name": "pupkin"
        }
    ]
}

could be converted into XML like this (#1):

<?xml version="1.0" encoding="UTF-8" ?>
<user>gerry</user>
<likes>1</likes>
<likes>2</likes>
<likes>4</likes>
<followers>
    <name>megan</name>
</followers>
<followers>
    <name>pupkin</name>
</followers>

or like this (#2):

<?xml version="1.0" encoding="UTF-8"?>
<root>
   <likes>
      <element>1</element>
      <element>2</element>
      <element>4</element>
   </likes>
   <followers>
      <element>
         <name>megan</name>
      </element>
      <element>
         <name>pupkin</name>
      </element>
   </followers>
   <user>gerry</user>
</root>

In particular, the difference arises when converting arrays. Converting object properties is trivial. I am sure there are other ways to convert JSON to XML as well.

So the question is: What is the best way? Are there any standards?

Another question: is there a way to express the conversion mapping itself in some mathematical form? E.g., is it possible to describe a mapping such that a conversion function, given the JSON object and the mapping object, would know exactly which XML to produce, and could reverse the conversion too?

XML_1 = convert(JSON, mapping_1)
XML_2 = convert(JSON, mapping_2)
JSON  = convert(XML_1, mapping_1)
JSON  = convert(XML_2, mapping_2)
JSON  = convert(XML_1, mapping_2) # Error!
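
To make the pseudocode above concrete, here is one possible sketch in Python (standard library only). Everything in it is an assumption made for illustration, not a standard: the mapping is a plain dict with hypothetical keys `style`, `arrays` and `ints`, the wrapper tag is `root`, the item tag is `element`, and `convert` is split into `to_xml` and `from_xml`. It ignores attributes, empty arrays and nested type information.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping objects: which keys are arrays, which leaves are
# integers, and whether arrays repeat the tag (#1) or wrap items (#2).
MAPPING_1 = {"style": "repeat", "arrays": {"likes", "followers"}, "ints": {"likes"}}
MAPPING_2 = {"style": "wrap", "arrays": {"likes", "followers"}, "ints": {"likes"}}

def to_xml(data, mapping):
    def emit(parent, key, value):
        if isinstance(value, list):
            if mapping["style"] == "wrap":
                # One container element, each item in an <element> child.
                target, tag = ET.SubElement(parent, key), "element"
            else:
                # Repeat the property tag once per item.
                target, tag = parent, key
            for item in value:
                emit(target, tag, item)
        elif isinstance(value, dict):
            node = ET.SubElement(parent, key)
            for k, v in value.items():
                emit(node, k, v)
        else:
            ET.SubElement(parent, key).text = str(value)

    root = ET.Element("root")
    for k, v in data.items():
        emit(root, k, v)
    return ET.tostring(root, encoding="unicode")

def from_xml(text, mapping):
    def read(node, key):
        if len(node) == 0:  # leaf element: only text content
            return int(node.text) if key in mapping["ints"] else node.text
        return read_children(node)

    def read_children(node):
        out = {}
        for child in node:
            if child.tag in mapping["arrays"]:
                if mapping["style"] == "wrap":
                    out[child.tag] = [read(item, child.tag) for item in child]
                else:
                    out.setdefault(child.tag, []).append(read(child, child.tag))
            else:
                out[child.tag] = read(child, child.tag)
        return out

    return read_children(ET.fromstring(text))

data = {"user": "gerry", "likes": [1, 2, 4],
        "followers": [{"name": "megan"}, {"name": "pupkin"}]}
xml_1 = to_xml(data, MAPPING_1)
xml_2 = to_xml(data, MAPPING_2)
```

With matching mappings the round trip recovers the original object. With a mismatched mapping this sketch does not raise an error; it quietly returns different data, so a real implementation would need explicit validation to get the "Error!" behaviour.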

Answer 1:

You're obviously interested in the theory behind data serialization. I'll try to explain using the following headings.

Problem with XML as a data serialization format
Why other formats are favoured
It's really about information and relationships

What I'm leading to is an introduction to the Semantic web and how it expresses the same data in several different formats.

Problem with XML as a data serialization format

As you've discovered, there are several ways to structure data in XML. This is because XML started life as a document markup language. XML has no built-in way to describe simple data structures like lists or hashes.

Not self describing

Here's a simple example:

<data>
  <user name="gerry"/>
</data>

This can be deserialized as a simple hash:

data.user.name = "gerry"

or less obviously as a list of hashes:

data.user[0].name = "gerry"

The fact is that a different XML document could specify multiple user tags:

<data>
  <user name="gerry"/>
  <user name="tom"/>
</data>
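
The ambiguity can be seen with a naive converter. The sketch below is only an illustration (the function name `naive_to_dict` is made up, and it ignores text content and namespaces): it guesses that a tag is a list only when it happens to repeat, so the shape of the result depends on the document rather than on its intended structure.

```python
import xml.etree.ElementTree as ET

def naive_to_dict(node):
    """Convert an element to a dict, guessing lists only when a tag repeats."""
    result = dict(node.attrib)
    for child in node:
        value = naive_to_dict(child)
        if child.tag in result:
            # Second occurrence: promote the existing value to a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result

one = naive_to_dict(ET.fromstring('<data><user name="gerry"/></data>'))
two = naive_to_dict(ET.fromstring('<data><user name="gerry"/><user name="tom"/></data>'))
```

Here `one["user"]` is a dict while `two["user"]` is a list of dicts, so any code consuming the result has to handle both shapes.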

XML schema to the rescue

The solution to this problem was to design a separate schema specification that describes how the document is formatted:

<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="data">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="user" maxOccurs="unbounded" minOccurs="0">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <xs:attribute type="xs:string" name="name" use="optional"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

The user tag is declared with maxOccurs="unbounded" inside a sequence, so it may occur any number of times. This enables XML parsers to store this information in a list construct.

This is the approach taken by many web service frameworks which process XML data. The message format is described in the WSDL/XML schema and the programming code that processes the message is generated automatically.
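
As a rough illustration of how schema information removes the ambiguity, the sketch below reads `maxOccurs` from a trimmed-down copy of the schema above and always stores repeatable tags as lists. It is an assumption-laden toy, not how a real schema-aware parser or code generator works: `repeatable_tags` and `to_dict` are hypothetical helpers, and element references, namespaces in the instance document and text content are all ignored.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Trimmed version of the schema above (the simpleContent wrapper is omitted).
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="data">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="user" maxOccurs="unbounded" minOccurs="0">
          <xs:complexType>
            <xs:attribute type="xs:string" name="name" use="optional"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

def repeatable_tags(schema_text):
    """Names of elements the schema allows to occur more than once."""
    tags = set()
    for el in ET.fromstring(schema_text).iter(XS + "element"):
        max_occurs = el.get("maxOccurs", "1")  # the XSD default is 1
        if max_occurs == "unbounded" or int(max_occurs) > 1:
            tags.add(el.get("name"))
    return tags

def to_dict(node, list_tags):
    """Convert an element to a dict, forcing lists for repeatable tags."""
    result = dict(node.attrib)
    for child in node:
        value = to_dict(child, list_tags)
        if child.tag in list_tags:
            result.setdefault(child.tag, []).append(value)
        else:
            result[child.tag] = value
    return result

list_tags = repeatable_tags(SCHEMA)
parsed = to_dict(ET.fromstring('<data><user name="gerry"/></data>'), list_tags)
```

Even with a single user tag in the document, `parsed["user"]` is now a list, because the decision comes from the schema and not from the instance.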

Why other formats are favoured

Formats like JSON and YAML are specifically designed to serialize data. They don't require schema documents in order to parse data unambiguously.
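
A small Python illustration of that point, using only the standard `json` module: the syntax itself says whether a value is an object or an array, so no external schema is needed to pick the right container type.

```python
import json

# The brackets in the text decide the type; nothing has to be guessed.
single = json.loads('{"user": {"name": "gerry"}}')
many = json.loads('{"user": [{"name": "gerry"}]}')

print(type(single["user"]).__name__)  # dict
print(type(many["user"]).__name__)    # list
```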

But even so, JSON and YAML don't solve all problems. While the data is more obvious at first glance, the formats themselves provide no standard way of describing data structures.

Earlier I vilified XML schemas, but these can be really useful for determining whether a piece of data is programmatically usable (valid) or not. Even so, an XML Schema does not tell me the relationship between one piece of data and another.

It's really about information and relationships

The Semantic web movement is an attempt to create a self describing and collaborative internet. Problem is (IMHO) the associated standards are complex and difficult to understand and apply. The place to start is RDF:

Introduction to RDF

It's designed as a generic information interchange format and cleverly works in a manner that is independent of how the data is actually serialized.

Example

Your simple example, extended with two more users and expressed as RDF/XML:

<?xml version="1.0"?>
<rdf:RDF xmlns:user="http://myspotontheweb.com/user/1.0/" xmlns:ex="http://myspotontheweb.com/example/user/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="http://myspotontheweb.com/example/user/1">
        <user:name>gerry</user:name>
        <user:likes>1</user:likes>
        <user:likes>2</user:likes>
        <user:likes>4</user:likes>
    </rdf:Description>
    <rdf:Description rdf:about="http://myspotontheweb.com/example/user/2">
        <user:name>tom</user:name>
        <user:likes>2</user:likes>
        <user:likes>4</user:likes>
        <user:likes>6</user:likes>
        <user:follows rdf:resource="http://myspotontheweb.com/example/user/1" />
    </rdf:Description>
    <rdf:Description rdf:about="http://myspotontheweb.com/example/user/3">
        <user:name>felix</user:name>
        <user:likes>3</user:likes>
        <user:likes>5</user:likes>
        <user:follows rdf:resource="http://myspotontheweb.com/example/user/1" />
    </rdf:Description>
</rdf:RDF>

Each item of data has a unique identifier and a custom set of attributes:

name
likes
follows : Used to link one RDF entity to another.

XML is just one way to express RDF. I prefer the more compact N3 RDF format:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix user: <http://myspotontheweb.com/user/1.0/> .
@prefix ex: <http://myspotontheweb.com/example/user/> .

ex:1 user:name "gerry" .
ex:1 user:likes "1" .
ex:1 user:likes "2" .
ex:1 user:likes "4" .

ex:2 user:name "tom" .
ex:2 user:likes "2" .
ex:2 user:likes "4" .
ex:2 user:likes "6" .
ex:2 user:follows ex:1 .

ex:3 user:name "felix" .
ex:3 user:likes "3" .
ex:3 user:likes "5" .
ex:3 user:follows ex:1 .

Again note the custom prefix declarations at the top and the clear statement of what each piece of data ("triple" in RDF parlance) represents. I think this demonstrates it's about information, not data format!
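
To show why the serialization does not matter, the statements above can be held as plain (subject, predicate, object) tuples and queried directly. This is a minimal sketch without any RDF library: the helper names `objects` and `subjects` are made up, and it does not distinguish IRIs from literals the way a real RDF store would.

```python
EX = "http://myspotontheweb.com/example/user/"
USER = "http://myspotontheweb.com/user/1.0/"

# The same statements as in the N3 listing, as a set of 3-tuples.
triples = {
    (EX + "1", USER + "name", "gerry"),
    (EX + "1", USER + "likes", "1"),
    (EX + "1", USER + "likes", "2"),
    (EX + "1", USER + "likes", "4"),
    (EX + "2", USER + "name", "tom"),
    (EX + "2", USER + "likes", "2"),
    (EX + "2", USER + "likes", "4"),
    (EX + "2", USER + "likes", "6"),
    (EX + "2", USER + "follows", EX + "1"),
    (EX + "3", USER + "name", "felix"),
    (EX + "3", USER + "likes", "3"),
    (EX + "3", USER + "likes", "5"),
    (EX + "3", USER + "follows", EX + "1"),
}

def objects(subject, predicate):
    """All objects of statements with the given subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}

def subjects(predicate, obj):
    """All subjects of statements with the given predicate and object."""
    return {s for s, p, o in triples if p == predicate and o == obj}

# Names of everyone who follows user 1.
followers_of_gerry = {
    name
    for s in subjects(USER + "follows", EX + "1")
    for name in objects(s, USER + "name")
}
```

Because the statements form a set, their order in the file is irrelevant, which is also why a different serialization may list the same values in a different order.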

And for completeness the RDF information presented in JSON-LD format:

{
  "@graph": [
    {
      "@id": "http://myspotontheweb.com/example/user/3",
      "http://myspotontheweb.com/user/1.0/follows": {
        "@id": "http://myspotontheweb.com/example/user/1"
      },
      "http://myspotontheweb.com/user/1.0/likes": [
        "3",
        "5"
      ],
      "http://myspotontheweb.com/user/1.0/name": "felix"
    },
    {
      "@id": "http://myspotontheweb.com/example/user/2",
      "http://myspotontheweb.com/user/1.0/follows": {
        "@id": "http://myspotontheweb.com/example/user/1"
      },
      "http://myspotontheweb.com/user/1.0/likes": [
        "2",
        "6",
        "4"
      ],
      "http://myspotontheweb.com/user/1.0/name": "tom"
    },
    {
      "@id": "http://myspotontheweb.com/example/user/1",
      "http://myspotontheweb.com/user/1.0/likes": [
        "2",
        "4",
        "1"
      ],
      "http://myspotontheweb.com/user/1.0/name": "gerry"
    }
  ]
}

Notes:

There are multiple ways to express RDF as JSON. See also JSON+RDF.

Example graph

Once the information is expressed as RDF, its relationships to other data entities can be graphed visually.

RDF just the beginning

The Semantic web goes a lot further; RDF is only the start. There are XML schema-like standards for publishing well-understood relationships between triples. Using these, one can start to manipulate RDF data in very interesting ways.

I don't claim to be an expert in data processing. What I do acknowledge is that some very clever people have been looking at this problem for some time. The concepts are tough to learn, but worthwhile in order to better understand information theory.

Answer 2:

You will want to use some variation of these two PHP tools: json_decode() and PEAR::XML_Serializer.