How do I get python-markdown to additionally “urli

2019-03-14 04:37发布

问题:

Markdown is a great tool for formatting plain text into pretty html, but it doesn't turn plain-text links into URLs automatically. Like this one:

http://www.google.com/

How do I get markdown to add tags to URLs when I format a block of text?

回答1:

I couldn't get superjoe30's regular expression to compile, so I adapted his solution to convert plain URLs (within Markdown text) to be Markdown compatible.

The modified filter:

urlfinder = re.compile('^(http:\/\/\S+)')
urlfinder2 = re.compile('\s(http:\/\/\S+)')
@register.filter('urlify_markdown')
def urlify_markdown(value):
    value = urlfinder.sub(r'<\1>', value)
    return urlfinder2.sub(r' <\1>', value)

Within the template:

<div>
    {{ content|urlify_markdown|markdown}}
</div>


回答2:

You could write an extension to markdown. Save this code as mdx_autolink.py

import markdown
from markdown.inlinepatterns import Pattern

EXTRA_AUTOLINK_RE = r'(?<!"|>)((https?://|www)[-\w./#?%=&]+)'

class AutoLinkPattern(Pattern):

    def handleMatch(self, m):
        el = markdown.etree.Element('a')
        if m.group(2).startswith('http'):
            href = m.group(2)
        else:
            href = 'http://%s' % m.group(2)
        el.set('href', href)
        el.text = m.group(2)
        return el

class AutoLinkExtension(markdown.Extension):
    """
    There's already an inline pattern called autolink which handles 
    <http://www.google.com> type links. So lets call this extra_autolink 
    """

    def extendMarkdown(self, md, md_globals):
        md.inlinePatterns.add('extra_autolink', 
            AutoLinkPattern(EXTRA_AUTOLINK_RE, self), '<automail')

def makeExtension(configs=[]):
    return AutoLinkExtension(configs=configs)

Then use it in your template like this:

{% load markdown %}

(( content|markdown:'autolink'))

Update:

I've found an issue with this solution: When markdown's standard link syntax is used and the displayed portion matches the regular expression, eg:

[www.google.com](http://www.yahoo.co.uk)

strangely becomes: www.google.com

But who'd want to do that anyway?!



回答3:

Best case scenario, edit the markdown and just put < > around the URLs. This will make the link clickable. Only problem is it requires educating your users, or whoever writes the markdown.



回答4:

This isn't a feature of Markdown -- what you should do is run a post-processor against the text looking for a URL-like pattern. There's a good example in the Google app engine example code -- see the AutoLink transform.



回答5:

I was using the Django framework, which has a filter called urlize, which does exactly what I wanted. However, it only works on plain text, so I couldn't pass is through the output of markdown. I followed this guide to create a custom filter called urlify2 which works on html, and passed the text through this filter:

<div class="news_post">
  {% autoescape off %}
    {{ post.content|markdown|urlify2}}
  {% endautoescape %}
</div>

The urlify2.py filter:

from django import template
import re

register = template.Library()

urlfinder = re.compile("([0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}|((news|telnet|nttp|file|http|ftp|https)://)|(www|ftp)[-A-Za-z0-9]*\\.)[-A-Za-z0-9\\.]+):[0-9]*)?/[-A-Za-z0-9_\\$\\.\\+\\!\\*\\(\\),;:@&=\\?/~\\#\\%]*[^]'\\.}>\\),\\\"]")

@register.filter("urlify2")
def urlify2(value):
    return urlfinder.sub(r'<a href="\1">\1</a>', value)


回答6:

There's an extra for this in python-markdown2:

http://code.google.com/p/python-markdown2/wiki/LinkPatterns



回答7:

I know this question is almost a decade old, but markdown-urlize covers every possible use case I could think of including not requiring http(s):// before a url, leaving the parenthesis in (google.com), removing the angle brackets from <google.com>, ignoring urls in code blocks, and more I hadn't thought of:

https://github.com/r0wb0t/markdown-urlize

There's no pip install, but you can wget this:

https://raw.githubusercontent.com/r0wb0t/markdown-urlize/master/mdx_urlize.py

and then either put the above file on the python path (first option) or not (second option) and then use one of the following:

markdown.markdown(text, extensions=['urlize'], safe_mode=True)
markdown.markdown(text, extensions=['path.to.mdx_urlize'], safe_mode=True)