PHP closing tag deletes the line feed

Posted 2020-02-10 05:27

I'm experimenting with an HTML preprocessor, similar to Slim or Jade.

This is the PHP code that seems right:

nav
  ul id: "test"
    li
      @<?= $Var; ?>
    li
      @About
    li
      @Contact

This is the expected pre-processed html (yes, $Var == "Test"):

nav
  ul id: "test"
    li
      @Test
    li
      @About
    li
      @Contact

However, in the browser I get this incorrect pre-processed HTML:

nav
  ul id: "test"
    li
      @Test    li
      @About
    li
      @Contact

Lastly, there are two ways to make it correct.

  1. Adding the break line manually:

    nav
      ul id: "test"
        li
          @<?= $Var . "\n"; ?>
        li
          @About
        li
          @Contact
    
  2. Writing a space after the PHP closing tag (??).

Why does the first case, <?= $Var; ?>, drop the line feed after the closing PHP tag? I couldn't find anything: every search I tried on Google returned results about why you should omit the closing tag, not what I was looking for.

1 answer
爷的心禁止访问
#2 · 2020-02-10 06:15

Update:
Looking at the Zend language scanner source, it seems my hunch was correct: the T_CLOSE_TAG token can include a trailing newline character. What's more, the closing semicolon of the last statement before a closing tag is optional, because the tag implies one:

<ST_IN_SCRIPTING>("?>"|"</script"{WHITESPACE}*">"){NEWLINE}? {
    ZVAL_STRINGL(zendlval, yytext, yyleng, 0); /* no copying - intentional */
    BEGIN(INITIAL);
    return T_CLOSE_TAG;  /* implicit ';' at php-end tag */
}

Just look for T_CLOSE_TAG in the zend_language_scanner.c and zend_language_scanner.l files of the PHP source.
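The scanner rule can also be checked from PHP itself, without reading the C source. The following is a minimal sketch, assuming the tokenizer extension is enabled (it usually is by default); closeTagTexts is a hypothetical helper name, not part of PHP. It collects the text of every T_CLOSE_TAG token so you can see whether the newline is part of the token.

```php
<?php
/* Hypothetical helper: return the raw text of each closing-tag token. */
function closeTagTexts(string $source): array
{
    $texts = [];
    foreach (token_get_all($source) as $token) {
        /* Multi-character tokens are arrays of [id, text, line]. */
        if (is_array($token) && $token[0] === T_CLOSE_TAG) {
            $texts[] = $token[1];
        }
    }
    return $texts;
}

/* Newline directly after the tag: it becomes part of the token. */
echo json_encode(closeTagTexts("<?= \$Var; ?>\n    li\n")), PHP_EOL;

/* Space between tag and newline: the token is the bare tag only. */
echo json_encode(closeTagTexts("<?= \$Var; ?> \n    li\n")), PHP_EOL;
```

In the second case the space and the newline both land in the following T_INLINE_HTML token, which is why they reach the output untouched.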


I'm currently scanning the source of the Zend engine to be sure, but here's my guess. The last characters of the code you posted are simply the closing tag (?>), and it's PHP that generates the output. Seeing as you're not telling PHP to output a line feed, it stands to reason that PHP won't add one to whatever you're echoing.
The line feed that follows the closing tag isn't part of your PHP code, yet PHP does seem to consume it. I'm still reading the C code that parses the script, but I suspect it uses newlines, whitespace, commas, semicolons and the like as tokens to chunk the input into nodes.
Seeing as the closing tag ?> is a bona fide token and part of the PHP grammar, it could well be that this is where the line feed is consumed by the engine, and why it's not part of the output.

When you add a space after the closing tag, the character directly following ?> is no longer a newline, so the newline isn't consumed along with the tag. The space and the line feed both end up in the output, which would explain why you still see the line feed.
I've also tried adding two line feeds to some test code, and indeed, the output showed only one new line:

foo:
    <?= $bar; ?>

    foobar

Output:

foo:
    bar
    foobar

So it would seem that my suspicions might hold water.
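To reproduce the three variants from the question side by side, here is a small sketch. The render helper is hypothetical and exists only for this illustration; it evaluates a template string with eval and output buffering, which is fine for an experiment but not something to use on untrusted input.

```php
<?php
/* Hypothetical helper: render a template string as PHP would render a file. */
function render(string $template, array $vars): string
{
    extract($vars, EXTR_SKIP);       /* expose e.g. $bar to the template */
    ob_start();                      /* capture output instead of printing it */
    eval('?>' . $template);          /* start the snippet in literal-output mode */
    return ob_get_clean();
}

$vars = ['bar' => 'bar'];

/* 1. Newline directly after the closing tag: it is swallowed. */
$swallowed = render("foo:\n    <?= \$bar; ?>\n    foobar\n", $vars);

/* 2. Space before the newline: both characters are kept. */
$kept = render("foo:\n    <?= \$bar; ?> \n    foobar\n", $vars);

/* 3. Newline echoed explicitly: one line break survives. */
$explicit = render("foo:\n    <?= \$bar, \"\\n\"; ?>\n    foobar\n", $vars);

echo json_encode($swallowed), PHP_EOL;
echo json_encode($kept), PHP_EOL;
echo json_encode($explicit), PHP_EOL;
```

json_encode is used only to make the line breaks visible as \n in the printed result.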

However, all things considered, unless you want to go hacking away at the Zend engine source, adding the line feed manually isn't much of a hassle. In fact, it's a good way to ensure the correct line feeds are generated:
Suppose you wrote some code on a healthy *NIX system, where a line feed is, to all intents and purposes, the \n escape sequence. Adding that character manually might not yield the desired output on, say, a Windows system (which uses \r\n), and classic Mac OS used \r.
PHP has a constant that gives you the correct line ending for the platform your code is running on: PHP_EOL. Why not use that:

<?= $bar, PHP_EOL; ?>

In case you're wondering: yes, that is $bar comma PHP_EOL you're seeing there. Why? Think of echo or <?= as C++'s std::cout: it's a construct that just pushes whatever you throw at it to the output stream. Whether that is a concatenated string or a comma-separated list of values, it doesn't care.
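A quick way to convince yourself that the comma form and the dot form write the same bytes is to capture both with output buffering. This is only an illustrative sketch; the variable names are made up for the example.

```php
<?php
$bar = 'bar';

/* Comma form: echo receives two arguments and writes them in order. */
ob_start();
echo $bar, PHP_EOL;
$viaComma = ob_get_clean();

/* Dot form: a new string is built first, then echo writes that one string. */
ob_start();
echo $bar . PHP_EOL;
$viaConcat = ob_get_clean();

/* Mixed types work too; the integers are converted as they are written. */
ob_start();
echo 'item ', 1, ' of ', 3;
$mixed = ob_get_clean();

var_dump($viaComma === $viaConcat);
echo $mixed, PHP_EOL;
```

The output is identical either way; the difference is only in whether an intermediate string has to be built before anything is written.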

Now, the following section of my answer goes slightly off-topic, but it's something so basic and self-evident, and yet so many people are unaware of it, that I can't resist explaining a thing or two about string concatenation.
PHP, like most languages I know of, doesn't care how many variables or values it has to push to the output stream; that's what the stream is for. What PHP, and again most languages, does care about is string concatenation. A string is sort of a constant value: you can't just make it longer when the mood takes you. Its characters are stored in memory, and a longer string needs memory allocated to accommodate it. What concatenation effectively does (best-case scenario) is this:

  • compute the lengths of string1 and string2
  • allocate the additional memory required to concat string2 onto string1
  • copy string2 to that newly (extra) allocated memory

Whereas, in a lot of cases, what actually happens is:

  • compute lengths of both strings
  • allocate memory, required to concat both strings
  • copy both strings to that newly allocated memory block
  • assign the new pointer to whatever variable needs assigning
  • free up any memory that isn't referenced anymore

An example of the first case:

$str1 = 'I am string constant 1';
$str2 = ' And I\'ll be concatenated';
$str1 .= $str2;

Could translate to the following C code:

char *str1, *str2;
//allocate mem for both strings, assign them their vals
str1 = realloc(str1, strlen(str1) + strlen(str2) + 1);//re-allocate mem for str1
strncat(str1, str2, strlen(str2));//concatenate str2 onto str1

However, by simply doing this:

$str3 = $str1 . $str2;

What you're actually doing is:

char *str3 = malloc((strlen(str1) + strlen(str2) + 1)*sizeof(char));
strcpy(str3, str1);//copy first string to newly allocated memory
strcat(str3, str2);//concatenate second string...

And as if that weren't enough, just think what this code implies:

$str1 = $str2 . $str1;

Yes, sure enough:

char *str3 = malloc((strlen(str1) + strlen(str2) + 1)*sizeof(char));
strcpy(str3, str2);//copy second string to start of new string
strcat(str3, str1);//add first string at the end
free(str1);//free memory associated with first string, because we're reassigning it
str1 = str3;//set str1 to point to the new block of memory

Now I haven't even gotten to the real concatenation nightmares yet (don't worry, I'm not going to either): stuff like $foo = 'I ' . ' am ' . 'The' . ' ' . $result . ' of some' . 1 . ' with a dot' . ' fetish';. Look at it: there are variables in there that might be anything (arrays, objects, huge strings...), and there's an integer in there, too. Replacing the dots with commas and pushing it all to the echo construct is so much easier than even beginning to contemplate the code required to correctly concatenate all of these values.
Sorry for drifting off slightly here, but seeing as this is, IMO, so basic, I feel everyone should be aware of it.
