HTML web pages to wiki pages translator

Published 2020-06-24 08:34

Question:

I am looking for an HTML to wiki website translator. Basically, I want to publish the coverage reports generated by Cobertura to my Google Code website, but Google Code only supports wiki pages. If someone can point me to a translator that turns an HTML website into a set of linked wiki pages, I can publish my coverage reports.

Answer 1:

There is a pretty good translator available here. It also supports the Google Code wiki syntax.

See if this can help you out.



Answer 2:

I'm not familiar with any such translators, but as a last resort it wouldn't be difficult to hack up a quick wiki-markup DOM serializer on your own.

Just write a function that parses the HTML with a DOM parser (my favorite is lxml, the Python binding for libxml2) and serializes it to wiki markup via a depth-first traversal, then wrap the whole thing in a ready-made spidering framework. (Or whip up your own; that's not too difficult either.)

Something like this Python code (using Stack Overflow markup as the example):

from lxml import html

# Map each HTML tag to the wiki markup that opens and closes it.
tags = {
    'b'       : {'start': '**', 'end': '**'},
    'em'      : {'start': '*', 'end': '*'},
    'i'       : {'start': '*', 'end': '*'},
    'strong'  : {'start': '**', 'end': '**'},
    # etc.
}

def serialize(node):
    # Depth-first traversal: opening markup, the node's own text,
    # the serialized children, then closing markup and the tail text.
    tag = tags.get(node.tag, {})

    return ''.join([tag.get('start', ''), node.text or ''] +
                   [serialize(child) for child in node] +
                   [tag.get('end', ''), node.tail or ''])

dom_root = html.fromstring('<p>A <b>bold</b> claim.</p>')  # any parsed page
wiki_markup = serialize(dom_root)  # -> 'A **bold** claim.'

That took me maybe 5 minutes and I could probably implement the whole thing in under an hour.

I left out the more complicated bits for handling block markup (anything where newlines, indentation, or line-starting characters are significant) and footnote-style link definitions, but that's not very difficult... especially if you add an optional callback argument to the tag definition structure, as sketched below.
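For what it's worth, here is one way that callback hook could look. This is only a sketch building on the tags table above; the 'handler' key, the handler signature, and the serialize_with_handlers name are my own assumptions, not an established API:

# Hypothetical extension: a tag entry may carry a 'handler' callback that
# receives the node and its already-serialized children and returns the
# finished block markup, so it controls newlines and line-starting characters.
def list_item(node, children_markup):
    # Block markup: a line-starting bullet and a trailing newline.
    return '* ' + (node.text or '') + children_markup + '\n'

tags['li'] = {'handler': list_item}

def serialize_with_handlers(node):
    children = ''.join(serialize_with_handlers(child) for child in node)
    entry = tags.get(node.tag, {})
    if 'handler' in entry:
        return entry['handler'](node, children) + (node.tail or '')
    # Inline tags fall back to the simple start/end wrapping from before.
    return ''.join([entry.get('start', ''), node.text or '', children,
                    entry.get('end', ''), node.tail or ''])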

Really, the only time-consuming part is reinventing the Makefile-style "only update what's been changed" caching.
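Even that can start as a simple timestamp comparison. A minimal sketch, assuming one output wiki page per input HTML file (needs_update and the helpers in the commented usage are illustrative, not from the answer):

import os

def needs_update(src_path, dst_path):
    # Rebuild when the output is missing or older than the input --
    # the same mtime comparison make performs for its targets.
    if not os.path.exists(dst_path):
        return True
    return os.path.getmtime(src_path) > os.path.getmtime(dst_path)

# Usage, with hypothetical wiki_path_for() and convert() helpers:
# for page in html_pages:
#     if needs_update(page, wiki_path_for(page)):
#         convert(page)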



Tags: html wiki