C encoding of character constants

My programmer's instinct says that a character constant in C (e.g. 'x') is encoded using the character set of the machine on which it is compiled. However, the following excerpt is from "The C Programming Language: ANSI C Edition":

"A character constant is a sequence of one or more characters enclosed in single quotes, as in 'x'. The value of a character constant with only one character is the numeric value of the character in the machine's character set at execution time."

Emphasis on the last 3 words.

Can anyone explain why they would say "at execution time"? Surely the character value is encoded in the compiled binary (ELF, a.out, ...)?

I couldn't come up with any logical explanation for this. Surely K&R knew what they were doing!

Tags: c encoding binary character

4 Answers

SAY GOODBYE

#2 · 2019-04-07 08:58

You will have to tell the compiler what system you are going to run the program on. It will then choose the proper encoding for the characters.

Of course, the default is to target a system similar to the one running the compiler. In that case the compile-time and run-time character sets will be identical.


chillily

#3 · 2019-04-07 08:58

C distinguishes the source character set from the execution character set, because your compiler could be a cross compiler, e.g. running on a PC and targeting a mobile platform. In that case the character set on the build machine and the one on the target machine need not agree. The simplest example is the end-of-line encoding, which differs among the common platforms on the market today. The execution character set may also depend on "locales" and other knobs that are set dynamically by the user running the program.


成全新的幸福

#4 · 2019-04-07 09:03

In C language terms, data is encoded for a particular locale, and locales declare character sets. Programs have an execution character set. Text (string and character constants) compiled into the program will be represented in that execution character set. The program itself may convert text it reads from the character set of any locale to its own execution character set, and format text it generates according to the character set of any locale.

"The machine's character set at execution time" is badly worded; it implies things that don't exist or aren't true.


smile是对你的礼貌

#5 · 2019-04-07 09:12

Your problem seems to be that you're confusing the character set of the machine with the character encoding used.

Read this http://www.microsoft.com/typography/unicode/cs.htm to understand what "character set" actually means. The problem at the time of K&R (2nd edition) was that there were just too many computers, some manufactured for the local government and public. This caused different character sets to pop up on different computers, so 'A' on a US machine could be a Cyrillic character (say Foo) on a Russian machine.

Hence character constants couldn't be TRUSTED. Thanks to modern computer manufacturers, most machine character sets are now the same, and information exchange is simpler.

