What is this INSANE space character??? (google chr

Posted 2019-02-04 09:37

Question:

This is driving me absolutely, !&&%&$ insane... it defies everything that I can think of.

THIS character right here... " "

In between these quotes... open Google Chrome and inspect. You will see it's a &nbsp;... normal, right? Now right click and actually view the source of this Stack Overflow page. It's a regular space... (also, the character I copied was an actual space).

I could understand if it's some kind of rich text editor or something, but in the raw html source is a regular space, so what gives?

Here's just with hitting the space key (which works fine)... " ".

You can even copy it and paste it everywhere and wreak havoc and make Chrome put &nbsp; everywhere, even though what's copied in your clipboard is just a SPACE.

I have these stupid characters show up everywhere randomly in my website and I have no idea where they come from, or WHY Google is converting a SPACE into a &nbsp;.

I have tried inspecting the actual character code and it's a regular space from all things I can find...

Every single method I try shows it as a NORMAL space... so what gives?

If I use Ruby and do " ".ord I get 32. If I do it with the broken space I also get 32.

Please help me, I'm losing my mind.

edit: you can prove this... view source on this page and you will see two empty " " like normal. Now look in the console and only one of them will be a &nbsp;, yet the raw source is identical.

Image for people not using chrome (this is looking at this very post via chrome dev tools):

Here's the HTML of the same text you see when you view source... no nbsp to be found.

Answer 1:

When I view this page's source in Internet Explorer, or download it directly from the server and view it in a text editor, the first space character in question is formatted like this in the actual HTML:

<p>THIS character right here... <code>"&#160;"</code></p>

Notice the &#160; entity. That is Unicode codepoint U+00A0 NO-BREAK SPACE. Chrome is just being nice and re-formatting it as &nbsp; when inspecting the HTML. But make no mistake, it is a real non-breaking space, not Unicode codepoint U+0020 SPACE like you are expecting. U+00A0 is visually displayed the same as U+0020, but they are semantically different characters.

The second space character in question is formatted like this in the actual HTML:

<p>Here's just with hitting the space key (which works fine)... <code>" "</code>.</p>

So it is Unicode codepoint U+0020 and not U+00A0. Viewing the raw hex data of this page confirms that:
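The same check can be done without a hex viewer. Here is a minimal sketch, assuming a JavaScript runtime that provides TextEncoder (current browsers and Node.js). The helper names codePoints and utf8Hex are made up for this illustration and are not part of any library:

```javascript
// Hypothetical helper: list each character of a string as U+XXXX.
function codePoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  ).join(" ");
}

// Hypothetical helper: show the UTF-8 bytes of a string as hex pairs.
function utf8Hex(text) {
  return Array.from(new TextEncoder().encode(text), (byte) =>
    byte.toString(16).padStart(2, "0")
  ).join(" ");
}

const regularSpace = " ";       // typed with the space bar
const noBreakSpace = "\u00A0";  // written as an escape so it stays visible

console.log(codePoints(regularSpace)); // U+0020
console.log(codePoints(noBreakSpace)); // U+00A0
console.log(utf8Hex(regularSpace));    // 20
console.log(utf8Hex(noBreakSpace));    // c2 a0
```

In a UTF-8 page the non-breaking space therefore shows up as the two bytes c2 a0, while the regular space is the single byte 20.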



Answer 2:

It turns out the two seemingly identical whitespace characters are not the same character.

Behold:

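// The last element is a non-breaking space (U+00A0), written as an escape so it stays visible.
var characters = ["a", "b", "c", "d", "\u00A0"];

var typedSpace  = " ";      // regular space typed on the keyboard (U+0020)
var copiedSpace = "\u00A0"; // the character copied from the page (U+00A0)

alert("Typed: " + characters.indexOf(typedSpace));   // -1
alert("Copied: " + characters.indexOf(copiedSpace)); // 4
alert(typedSpace === copiedSpace);                   // false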


typedSpace.charCodeAt(0) returns 32, the classic space (&#32;), whereas copiedSpace.charCodeAt(0) returns 160, the &#160; character, AKA &nbsp;.

The difference between the two is that a run of consecutive &#160; characters will hold its ground, with each one rendered as its own space, whereas a run of consecutive &#32; characters will collapse into a single space when the HTML is rendered.

For instance:

A &#160;&#160;&#160;&#160;&#160; B results in: A       B

A &#32;&#32;&#32;&#32;&#32; B results in: A B
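The collapsing behaviour can be imitated in plain JavaScript. This is only a rough approximation of what a browser does in normal text flow, and collapseHtmlWhitespace is a hypothetical name chosen for the illustration:

```javascript
// Rough approximation of HTML whitespace collapsing: runs of regular
// spaces, tabs, and line breaks become one space. U+00A0 is deliberately
// not in the character class, so non-breaking spaces are left alone.
function collapseHtmlWhitespace(text) {
  return text.replace(/[ \t\n\r\f]+/g, " ");
}

const nbspChar = "\u00A0";

const withSpaces = "A" + " ".repeat(7) + "B";
const withNbsp   = "A " + nbspChar.repeat(5) + " B";

console.log(JSON.stringify(collapseHtmlWhitespace(withSpaces))); // "A B"
console.log(collapseHtmlWhitespace(withNbsp).length);            // 9
```

The second string keeps all nine characters because the regular spaces at each end are single and the five characters between them are not regular spaces at all.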

To replace every &#160; character in a string with a &#32; character, call this on the string:

.replace(new RegExp(String.fromCharCode(160),"g")," ");
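For a complete, runnable version of that one-liner, here is a small sketch. The function name replaceNbsp is hypothetical, and the regex literal /\u00A0/g is equivalent to building the pattern from String.fromCharCode(160):

```javascript
// Replace every non-breaking space (U+00A0) with a regular space (U+0020).
function replaceNbsp(text) {
  return text.replace(/\u00A0/g, " ");
}

const pasted  = "THIS\u00A0character"; // index 4 holds the non-breaking space
const cleaned = replaceNbsp(pasted);

console.log(pasted.charCodeAt(4));         // 160
console.log(cleaned.charCodeAt(4));        // 32
console.log(cleaned === "THIS character"); // true
```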

To the people in the future like myself that had to debug this from a high level all the way down to the character codes, I salute you.



Answer 3:

It is a non breaking space. &nbsp; is the entity used to represent a non-breaking space. It is essentially a standard space, the primary difference being that a browser should not break (or wrap) a line of text at the point that this &nbsp; occupies.
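To make the "should not break" part concrete, here is a toy line wrapper. It is a simplified sketch and not how a browser lays out text: it breaks only at regular spaces, so two words joined by U+00A0 always stay on the same line. The name wrapAtSpaces is made up for the example:

```javascript
// Greedy word wrap that treats only U+0020 as a break opportunity.
// Anything joined by U+00A0 counts as one unbreakable unit.
function wrapAtSpaces(text, width) {
  const lines = [];
  let line = "";
  for (const word of text.split(" ")) {
    if (line !== "" && line.length + 1 + word.length > width) {
      lines.push(line); // current line is full, start a new one
      line = word;
    } else {
      line = line === "" ? word : line + " " + word;
    }
  }
  if (line !== "") lines.push(line);
  return lines;
}

const breakable   = wrapAtSpaces("Mr. Smith arrived", 8);
const unbreakable = wrapAtSpaces("Mr.\u00A0Smith arrived", 8);

console.log(breakable.length);   // 3 ("Mr." / "Smith" / "arrived")
console.log(unbreakable.length); // 2 ("Mr. Smith" stays together)
```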

Most likely the character is being inserted by your HTML Editor. Could you give a more specific example in context?



Answer 4:

This is not actually an answer to the question, but a tool that detects this special white space in the HTML of a website's pages so that you can locate and remove it.

What the tool basically does is:

  1. Fetches the content of a URL
  2. Looks for occurrences of chr(194).chr(160) in the HTML contents
  3. Replaces and highlights the occurrences with something more visible

This way you can actually know where the spaces are and edit your page properly to remove them.
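The detection and highlighting steps can be sketched in a few lines of JavaScript. This is not the tool's own code: the fetching step is left out, highlightNbsp is a hypothetical name, and the {NBSP} marker is borrowed from the hstring parameter in the example URL. It assumes the page is UTF-8, where U+00A0 is encoded as the bytes 194 and 160:

```javascript
const NBSP = "\u00A0";

// Count the non-breaking spaces in an HTML string and replace each one
// with a visible marker so it can be located in the source.
function highlightNbsp(html, marker = "{NBSP}") {
  const parts = html.split(NBSP);
  return { count: parts.length - 1, highlighted: parts.join(marker) };
}

// In UTF-8, U+00A0 is the byte pair 194, 160.
const nbspBytes = Array.from(new TextEncoder().encode(NBSP)).join(" ");
console.log(nbspBytes); // 194 160

const sampleHtml = '<code>"\u00A0"</code> and <code>" "</code>';
const report = highlightNbsp(sampleHtml);

console.log(report.count);       // 1
console.log(report.highlighted); // <code>"{NBSP}"</code> and <code>" "</code>
```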

The online version of the tool can be found here:

http://tools.heavydots.com/nbsp-space-char-detect/

A working example can be seen with the URL of this question, which contains one occurrence:

http://tools.heavydots.com/nbsp-space-char-detect/?url=http%3A%2F%2Fstackoverflow.com%2Fquestions%2F26962323%2Fwhat-is-this-insane-space-character-google-chrome&highlight=1&hstring=%7BNBSP%7D

There's a Github repo available if someone wants the code to run it locally:
https://github.com/HeavyDots/nbsp-space-char-detect

Hope someone finds it useful. For any feedback, there's a comments section on the tool's page.

Updated 5th of January 2017

At our company blog we just wrote a funny post about this annoying white space. You're invited to drop by and read it! :-)

http://heavydots.com/blog/when-the-white-space-became-a-beast



Answer 5:

As the previous answers have mentioned, it's a non-breaking space (nbsp). On Macs, this character gets inserted when you accidentally press Alt + Space (most of the time, this happens when entering code that requires Alt for special characters, e.g. [ on a German keyboard layout).

To remap this key combination to a plain ol' SPACE character, you can change your default keybinding, as suggested on Apple SE.



Answer 6:

On Windows, holding Alt and typing 0160 on the numeric keypad ("Alt+0160") also inserts this character, a non-breaking space.