What is the difference between UTF-8 and Unicode

Posted 2019-01-01 07:22

I have heard conflicting opinions from people - according to Wikipedia, see here.

They are the same thing, aren't they? Can someone clarify?

13 answers
孤独寂梦人
#2 · 2019-01-01 08:24

1. Unicode

There are lots of characters around the world, like "$, &, h, a, t, ?, 张, 1, =, +...".

Then came an organization dedicated to these characters, the Unicode Consortium.

It created a standard called "Unicode".

The standard works roughly like this:

  • It defines a table in which each position is called a "code point" (or "code position").
  • The positions run from U+0000 to U+10FFFF.
  • So far, some positions have been assigned characters; others are reserved or still unassigned.
  • For example, the position "U+0024" is assigned the character "$".

PS: Another organization, ISO, maintains a parallel standard, "ISO/IEC 10646", which is nearly the same.
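
To make the idea of a code point concrete, here is a small Python sketch. It only uses the standard library; the helper name describe is made up for this illustration. It looks up the position of a character in the Unicode table and shows that positions past U+10FFFF are rejected.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point of a single character in U+XXXX form, plus its Unicode name."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# "$" sits at position U+0024 in the table.
print(describe("$"))   # U+0024 DOLLAR SIGN
print(describe("张"))

# chr() goes the other way: from a position back to the character.
print(chr(0x0024))     # $

# Positions beyond U+10FFFF are outside the table, so Python refuses them.
try:
    chr(0x110000)
except ValueError as exc:
    print("not a code point:", exc)
```

Note that nothing here says anything about bytes yet: ord() and chr() only deal with positions in the table.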

2. UTF-8

As described above, U+0024 is just a position in the table, not a sequence of bytes, so the standard by itself doesn't say what a computer should store for the character "$".

There has to be an encoding method that maps code points to bytes.

Several such encodings exist, such as UTF-8, UTF-16, UTF-32 and UCS-2.

Under UTF-8, the code point "U+0024" is encoded as the single byte 00100100.

00100100 is the value saved in the computer for "$".
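
The same thing can be checked in Python, as a rough sketch using only the built-in str.encode method (the helper to_bits is a name invented for this example). It prints the bits stored for "$" under UTF-8 and shows that other encodings store different bytes for the same code point.

```python
def to_bits(data: bytes) -> str:
    """Render each byte as 8 binary digits, separated by spaces."""
    return " ".join(f"{b:08b}" for b in data)


# U+0024 under UTF-8 is one byte.
print(to_bits("$".encode("utf-8")))    # 00100100

# A character further up the table needs more bytes under UTF-8.
print(to_bits("张".encode("utf-8")))    # 11100101 10111100 10100000

# Same code point, different encodings, different bytes.
# The "-be" variants are big-endian and write no byte order mark.
for name in ("utf-8", "utf-16-be", "utf-32-be"):
    print(name, "$".encode(name).hex())
```

So the code point stays the same (U+0024), while the bytes on disk depend on which encoding was chosen.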
