Markdown -> Showdown bug in detab regex?

Posted 2019-02-18 03:05

Question:

I'm looking at Gruber's original Markdown implementation here and the Showdown implementation here.

I'm comparing the _Detab function in each, giving both the following string:

"Where\tis pancakes house?"

The Perl version of the test and output is here. Its output is 26 characters long.

The JavaScript version of the test and output is here. Its output is 27 characters long.

      123456789012345678901234567
Perl: Where   is pancakes house?
  JS: Where    is pancakes house?

Have I made a mistake? Is it a bug, or is there some other purpose?

Answer 1:

There are several bugs in Showdown's detabber. That's why for Stack Overflow's version, I have rewritten it:

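function _Detab(text) {
    // Fast path: nothing to do if there are no tabs.
    if (!/\t/.test(text))
        return text;

    // spaces[v] is the padding for a tab found v columns past a tab stop.
    var spaces = ["    ", "   ", "  ", " "],
        skew = 0, // input offset that corresponds to a tab stop in the output
        v;

    return text.replace(/[\n\t]/g, function (match, offset) {
        if (match === "\n") {
            // A new line resets the column count.
            skew = offset + 1;
            return match;
        }
        v = (offset - skew) % 4;
        // The character after this tab lands on a tab stop.
        skew = offset + 1;
        return spaces[v];
    });
}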

It detabs correctly. If I recall my measurements correctly, it is about as fast as the original (maybe a little slower) in older IE versions, and much faster in newer browsers.

See http://code.google.com/p/pagedown/wiki/PageDown for our full version of Showdown.



Answer 2:

It looks like a bug in the Showdown implementation. Markdown uses 4-space tabs, so a string ending in a tab should always be a multiple of 4 characters long after tabs are converted to spaces. The Perl version makes "Where\t" 8 characters, but the JavaScript one makes it 9 characters.
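To make the tab-stop arithmetic concrete, here is a minimal sketch of a column-counting expander. It is not the code from Markdown.pl or Showdown; the name expandTabs and its tabWidth parameter are made up for illustration, and I'm assuming tab stops restart at every newline.

```javascript
// Illustrative reference expander (hypothetical helper, not from either library):
// pad each tab out to the next multiple of tabWidth.
function expandTabs(text, tabWidth) {
    tabWidth = tabWidth || 4;
    var out = "";
    var column = 0; // column within the current line

    for (var i = 0; i < text.length; i++) {
        var ch = text.charAt(i);
        if (ch === "\t") {
            // Emits between 1 and tabWidth spaces, so the next character
            // always lands on a tab stop.
            var pad = tabWidth - (column % tabWidth);
            for (var j = 0; j < pad; j++) out += " ";
            column += pad;
        } else if (ch === "\n") {
            out += ch;
            column = 0; // assumed: tab stops restart on every line
        } else {
            out += ch;
            column += 1;
        }
    }
    return out;
}

console.log(expandTabs("Where\t").length);                   // 8
console.log(expandTabs("Where\tis pancakes house?").length); // 26
```

"Where" occupies columns 0 to 4, so the tab needs 3 spaces to reach column 8. That matches the 26-character Perl output in the question; the JavaScript output has one space too many.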

I suspect the bug may not occur with tabs at the beginning of a line, which is how they're normally used in Markdown, which would explain why it hasn't been noticed.