Question:
I recently read an interview with Lua co-creators Luiz H. de Figueiredo and Roberto Ierusalimschy, in which they discussed the design and implementation of Lua. It was very intriguing, to say the least. However, one part of the discussion raised a question in my mind. Roberto spoke of Lua as a "freestanding application" (that is, it's pure ANSI C that uses nothing from the OS). He said that the core of Lua was completely portable, and that because of its purity it has been ported much more easily, and to platforms never even considered (such as robots and embedded devices).
Now this makes me wonder. C in general is a very portable language. So, which parts of C (namely those in the standard library) are the least portable, and which can be expected to work on most platforms? Should only a limited set of data types be used (e.g. avoiding short and maybe float)? What about FILE and the stdio system? malloc and free? It seems that Lua avoids all of these. Is that taking things to the extreme? Or are they the root of portability issues? Beyond this, what other things can be done to make code extremely portable?
The reason I'm asking all of this is that I'm currently writing an application in pure C89, and ideally it should be as portable as possible. I'm willing to take a middle road in implementing it (portable enough, but not so much that I have to write everything from scratch). Anyway, I just wanted to see what, in general, is key to writing the best C code.
As a final note, all of this discussion is related to C89 only.
Answer 1:
In the case of Lua, we don't have much to complain about in the C language itself, but we have found that the C standard library contains many functions that seem harmless and straightforward to use, until you consider that they do not check their input for validity (which is fine, if inconvenient). The C standard says that handling bad input is undefined behavior, allowing those functions to do whatever they want, even crash the host program. Consider, for instance, strftime. Some libc's simply ignore invalid format specifiers, but other libc's (e.g., in Windows) crash! Now, strftime is not a crucial function. Why crash instead of doing something sensible? So Lua has to do its own validation of input before calling strftime, and exporting strftime to Lua programs becomes a chore. Hence, we have tried to steer clear of these problems in the Lua core by aiming for a freestanding core. But the Lua standard libraries cannot do that, because their goal is to export facilities to Lua programs, including what is available in the C standard library.
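The kind of validation described above can be sketched as follows. This is only an illustration, not Lua's actual code: the names format_is_c89 and checked_strftime are hypothetical, and the accepted set is assumed to be exactly the conversion characters that C89 defines for strftime.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Conversion characters that C89 defines for strftime. */
static const char c89_specs[] = "aAbBcdHIjmMpSUwWxXyYZ%";

/* Hypothetical helper: returns 1 if every '%' in fmt is followed by a
   C89 conversion character, 0 otherwise (including a trailing '%'). */
int format_is_c89(const char *fmt)
{
    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%')
            continue;
        fmt++;  /* look at the character after '%' */
        if (*fmt == '\0' || strchr(c89_specs, *fmt) == NULL)
            return 0;
    }
    return 1;
}

/* Hypothetical wrapper: refuses to call strftime with a format the
   library might not handle. Returns 0 when the format is rejected. */
size_t checked_strftime(char *buf, size_t cap, const char *fmt,
                        const struct tm *tm)
{
    if (!format_is_c89(fmt))
        return 0;
    return strftime(buf, cap, fmt, tm);
}
```

With a check like this, a format such as "%Q" is rejected before it ever reaches the C library, so the result no longer depends on how a particular libc treats unknown specifiers.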
Answer 2:
"Freestanding" has a particular meaning in the context of C. Roughly, freestanding implementations are not required to provide any of the standard library functions, such as malloc/free, printf, etc. Certain standard headers are still required, but they only define types and macros (for example, stddef.h).
Answer 3:
C89 allows two types of compilers: hosted and freestanding. The basic difference is that a hosted compiler provides all of the C89 library, while a freestanding compiler need only provide <float.h>, <limits.h>, <stdarg.h>, and <stddef.h>. If you limit yourself to these headers, your code will be portable to any C89 compiler.
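As a rough illustration of code restricted to those headers, here is a sketch of a routine that formats a long into a caller-supplied buffer without stdio or malloc. The name fmt_long and its interface are hypothetical, chosen only for this example.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: writes the decimal form of value into buf
   (capacity cap, including the terminating '\0'). Returns the number
   of characters written, or 0 if the buffer is too small. */
size_t fmt_long(long value, char *buf, size_t cap)
{
    /* About one decimal digit per three bits, plus slack. */
    char tmp[(sizeof(long) * CHAR_BIT) / 3 + 2];
    size_t n = 0, len, i = 0;
    /* Take the magnitude in unsigned arithmetic so LONG_MIN is safe. */
    unsigned long mag = (value < 0) ? 0UL - (unsigned long)value
                                    : (unsigned long)value;

    do {                               /* digits come out in reverse */
        tmp[n++] = (char)('0' + (int)(mag % 10UL));
        mag /= 10UL;
    } while (mag != 0UL);

    len = n + (value < 0 ? 1 : 0);
    if (cap < len + 1)
        return 0;
    if (value < 0)
        buf[i++] = '-';
    while (n > 0)
        buf[i++] = tmp[--n];
    buf[i] = '\0';
    return len;
}
```

Because the caller owns the buffer, the routine needs neither an allocator nor an output stream, which is the kind of dependency a freestanding environment may not provide.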
Answer 4:
This is a very broad question. I'm not going to give a definitive answer; instead, I'll raise some issues.
Note that the C standard specifies certain things as "implementation-defined"; a conforming program will always compile and run on any conforming platform, but it may behave differently depending on the platform. Specifically, there is:
- Word size. sizeof(long) may be four bytes on one platform, eight on another. The sizes of short, int, long etc. each have some minimum (often relative to each other), but otherwise there are no guarantees.
- Endianness. int a = 0xff00; int b = ((char *)&a)[0]; may assign 0 to b on one platform, -1 on another (for example, a big-endian platform with a 16-bit int and a signed char).
- Character encoding. \0 is always the null byte, but how the other characters show up depends on the OS and other factors.
- Text-mode I/O. putchar('\n') may produce a line-feed character on one platform, a carriage return on the next, and a combination of each on yet another.
- Signedness of char. It may or may not be possible for a char to take on negative values.
- Byte size. While nowadays, a byte is eight bits virtually everywhere, C caters even to the few exotic platforms where it is not.
Various word sizes and endiannesses are common. Character encoding issues are likely to come up in any text-processing application. Machines with 9-bit bytes are most likely to be found in museums. This is by no means an exhaustive list.
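One common way to sidestep the endianness and char-signedness issues listed above is to build byte sequences with shifts on unsigned values instead of reinterpreting memory. The following is a minimal sketch; the names store_be32 and load_be32 are hypothetical, and it assumes the external format is four 8-bit values in big-endian order.

```c
/* Store the low 32 bits of v as four octets, most significant first.
   The result does not depend on the host's byte order. */
void store_be32(unsigned char *out, unsigned long v)
{
    out[0] = (unsigned char)((v >> 24) & 0xFFUL);
    out[1] = (unsigned char)((v >> 16) & 0xFFUL);
    out[2] = (unsigned char)((v >> 8) & 0xFFUL);
    out[3] = (unsigned char)(v & 0xFFUL);
}

/* Rebuild the value from four octets. unsigned char avoids sign
   extension; the masks guard against bytes wider than 8 bits. */
unsigned long load_be32(const unsigned char *in)
{
    return ((unsigned long)(in[0] & 0xFFu) << 24)
         | ((unsigned long)(in[1] & 0xFFu) << 16)
         | ((unsigned long)(in[2] & 0xFFu) << 8)
         |  (unsigned long)(in[3] & 0xFFu);
}
```

unsigned long is used because C89 guarantees it holds at least 32 bits, whereas int may be as narrow as 16.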
(And please don't write C89; that's an outdated standard. C99 added some pretty useful stuff for portability, such as the fixed-width integers int32_t etc.)
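For code that has to stay within C89, where <stdint.h> is not available, something similar can be approximated from <limits.h>. This is only a sketch: the names u32_least and u32_add are hypothetical, and the type is only guaranteed to hold at least 32 bits, so results are masked explicitly.

```c
#include <limits.h>

/* Pick an unsigned type with at least 32 bits. C89 guarantees that
   unsigned long qualifies; unsigned int is used when it is wide enough. */
#if UINT_MAX >= 0xFFFFFFFFUL
typedef unsigned int u32_least;
#else
typedef unsigned long u32_least;
#endif

/* Add two values and reduce the result mod 2^32, whatever the actual
   width of u32_least happens to be. */
u32_least u32_add(u32_least a, u32_least b)
{
    return (u32_least)((a + b) & 0xFFFFFFFFUL);
}
```

The explicit mask is what keeps the arithmetic the same on platforms where the chosen type is wider than 32 bits.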
Answer 5:
Anything that is a part of the C89 standard should be portable to any compiler that conforms to that standard. If you stick to pure C89, you should be able to port it fairly easily. Any portability problems would then be due to compiler bugs or places where the code relies on implementation-defined behavior.
Answer 6:
C was designed so that a compiler may be written to generate code for any platform and still call the language it compiles "C". Such freedom acts in opposition to C being a language for writing code that can be used on any platform.
Anyone writing code for C must decide (either deliberately or by default) what sizes of int they will support; while it is possible to write C code which will work with any legal size of int, it requires considerable effort and the resulting code will often be far less readable than code which is designed for a particular integer size. For example, if one has a variable x of type uint32_t, and one wishes to multiply it by another variable y of the same type, computing the result mod 4294967296, the statement x*=y; will work on platforms where int is 32 bits or smaller, or where int is 65 bits or larger, but will invoke Undefined Behavior in cases where int is 33 to 64 bits and the product, if the operands were regarded as whole numbers rather than members of an algebraic ring that wraps mod 4294967296, would exceed INT_MAX. One could make the statement work independent of the size of int by rewriting it as x*=1u*y;, but doing so makes the code less clear, and accidentally omitting the 1u* from one of the multiplications could be disastrous.
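The 1u* rewrite can be wrapped in a small helper so that it is written only once. A minimal sketch, assuming a C99 <stdint.h> is available (the answer already relies on uint32_t); the name mul_u32 is hypothetical.

```c
#include <stdint.h>

/* Multiply two uint32_t values mod 4294967296 without depending on
   the width of int. Multiplying by 1u forces the arithmetic into an
   unsigned type, so it can never overflow a signed int. */
uint32_t mul_u32(uint32_t x, uint32_t y)
{
    x *= 1u * y;
    return x;
}
```

For example, mul_u32(0xFFFFFFFFu, 0xFFFFFFFFu) yields 1, since (2^32 - 1)^2 is congruent to 1 mod 2^32. Keeping the idiom in one function reduces the risk of omitting the 1u* at some call site.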
Under the present rules, C is reasonably portable if code is only used on machines whose integer size matches expectations. On machines where the size of int does not match expectations, code is not likely to be portable unless it includes enough type coercions to render most of the language's typing rules irrelevant.