Writing a custom XML file for the Wordpress Import

Posted 2019-03-04 00:01

Question:

Okay, so here is my current situation:

My knowledge of XML or lxml isn't very good yet, since I rarely used XML files until now. So please tell me if something in my approach to this is really stupid. ;-)

I want to feed my Wordpress installation a custom XML file, using the Wordpress importer. The Default Format can be seen here: XML File

Now there are some tags looking like this

<wp:author>

I am not a hundred percent sure, but as far as I learned today, the wp: part of the tag is the namespace prefix.

When I tried to use lxml to create those Tags I did this

author = etree.Element("wp:author")

This raised an error, because lxml does not allow me to write wp:author as a tag name, only author. I searched Google, read the lxml website, and came up with this:

from lxml.builder import ElementMaker

WP = ElementMaker(namespace="http://wordpress.org/export/1.2/",
                  nsmap={'wp': "http://wordpress.org/export/1.2/"})
author = WP("author")

Output:

<wp:author xmlns:wp="http://wordpress.org/export/1.2/"/>

Well, better. The xmlns:wp belongs to the namespace stuff, as I learned today. But I don't want the xmlns:wp stuff to appear because it doesn't in their XML File. I looked up how Wordpress itself exports their content, and they do it like this:

echo '<wp:author_id>' . $author->ID . '</wp:author_id>';

Now my question: is it better to do it the same way they do, or should I stick to lxml, as long as there is a way to get a tag without the xmlns:wp stuff? Using lxml to create XML files seems to be the better approach, because it is (normally) pretty easy and the code is easier to read.

I already tried objectify.deannotate, cleanup_namespace and similar suggestions, but none of them work. I hope some of you have an answer, either by suggesting a solution to my problem using lxml or by saying it is better to do it the way the Wordpress people do!
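
A note on the xmlns:wp attribute: a namespace prefix has to be declared somewhere in the document, but once is enough. If the declaration sits on the root element, the child tags are written as plain wp:... tags without repeating it. The sketch below shows this with Python's built-in xml.etree.ElementTree (lxml accepts the same {uri}tag notation). The rss and channel wrapper elements and the author ID value are assumptions made for illustration; the namespace URI is the one used above.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the lxml attempt above.
WP_NS = "http://wordpress.org/export/1.2/"

# Ask ElementTree to serialize this URI with the "wp" prefix
# instead of an auto-generated one such as ns0.
ET.register_namespace("wp", WP_NS)

# Assumed wrapper elements: an RSS-style root with a channel child.
rss = ET.Element("rss", version="2.0")
channel = ET.SubElement(rss, "channel")

# Namespaced tags are written in {uri}localname form.
author = ET.SubElement(channel, f"{{{WP_NS}}}author")
author_id = ET.SubElement(author, f"{{{WP_NS}}}author_id")
author_id.text = "1"  # hypothetical author ID

# The xmlns:wp declaration is emitted once, on the root element only.
xml_text = ET.tostring(rss, encoding="unicode")
print(xml_text)
```

Serializing a single wp:author element on its own, as in the ElementMaker attempt, forces the declaration onto that element because there is no ancestor that could carry it.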

If I have overlooked an already answered similar Question, I am really sorry and please tell me so.

Thank you Vaelor

Answer 1:

Here is my advice: take a step back from lxml and consider Python's built-in support for XML processing, a module called xml.etree.ElementTree. Import it in a REPL like this:

import xml.etree.ElementTree as ET

and play with it for a while. The Python documentation on the module is good: http://goo.gl/8FVto

Building an element is as simple as this:

a = ET.Element('wp:author')
ET.dump(a)

Then add some sub-elements. It's all in the docs.
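
As a minimal sketch of that last step: ElementTree does not validate tag names, so a literal 'wp:author' is written out unchanged, and sub-elements can be attached with ET.SubElement. The rss and channel wrappers, the xmlns:wp attribute set by hand on the root, and the author ID value are assumptions made for illustration, not something taken from the Wordpress export code.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace URI from the question.
WP_NS = "http://wordpress.org/export/1.2/"

# Declare the wp prefix by hand as an ordinary attribute on the root,
# so the resulting file is still well-formed for namespace-aware parsers.
rss = ET.Element("rss", {"version": "2.0", "xmlns:wp": WP_NS})
channel = ET.SubElement(rss, "channel")

# Tags with a literal prefix, as in the ET.Element('wp:author') example.
author = ET.SubElement(channel, "wp:author")
author_id = ET.SubElement(author, "wp:author_id")
author_id.text = "1"  # hypothetical author ID

xml_text = ET.tostring(rss, encoding="unicode")
print(xml_text)
```

With this approach ElementTree treats wp:author as an opaque string, so keeping the prefix and the xmlns:wp declaration consistent is up to you.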