Why does/did C allow implicit function and typeless variable declarations?

Posted 2019-01-24 04:04

Why is it sensible for a language to allow implicit declarations of functions and typeless variables? I get that C is old, but allowing declarations to be omitted, with functions defaulting to int() (and variables to int), doesn't seem so sane to me, even back then.

So, why was it originally introduced? Was it ever really useful? Is it actually (still) used?

Note: I realise that modern compilers give you warnings (depending on which flags you pass them), and you can suppress this feature. That's not the question!


Example:

int main() {
  static bar = 7; // defaults to "int bar"
  return foo(bar); // defaults to an implicit "int foo()"
}

int foo(int i) {
  return i;
}

5 Answers
劫难
#2 · 2019-01-24 04:46

Probably the best explanation for "why" comes from Dennis Ritchie's paper The Development of the C Language (linked in another answer below):

Two ideas are most characteristic of C among languages of its class: the relationship between arrays and pointers, and the way in which declaration syntax mimics expression syntax. They are also among its most frequently criticized features, and often serve as stumbling blocks to the beginner. In both cases, historical accidents or mistakes have exacerbated their difficulty. The most important of these has been the tolerance of C compilers to errors in type. As should be clear from the history above, C evolved from typeless languages. It did not suddenly appear to its earliest users and developers as an entirely new language with its own rules; instead we continually had to adapt existing programs as the language developed, and make allowance for an existing body of code. (Later, the ANSI X3J11 committee standardizing C would face the same problem.)

Systems programming languages don't necessarily need types; you're mucking around with bytes and words, not floats and ints and structs and strings. The type system was grafted onto it in bits and pieces, rather than being part of the language from the very beginning. As C has moved from being primarily a systems programming language to a general-purpose programming language, it has become more rigorous in how it handles types. But, even though paradigms come and go, legacy code is forever. There's still a lot of code out there that relies on that implicit int, and the standards committee is reluctant to break anything that's working. That's why it took almost 30 years to get rid of it.

Anthone
#3 · 2019-01-24 05:00

A long, long time ago, back in the K&R, pre-ANSI days, functions looked quite different than they do today.

add_numbers(x, y)   // K&R definition: x, y and the return type all default to int
{
    return x + y;
}

int ansi_add_numbers(int x, int y); // modern, ANSI C prototype

When you call a function like add_numbers, there is an important difference in the calling conventions: all types are "promoted" when the function is called. So if you do this:

// no prototype for add_numbers
short x = 3;
short y = 5;
short z = add_numbers(x, y);

What happens is x is promoted to int, y is promoted to int, and the return type is assumed to be int by default. Likewise, if you pass a float it is promoted to double. These rules ensured that prototypes weren't necessary, as long as you got the right return type, and as long as you passed the right number and type of arguments.
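To make those promotion rules concrete, here is a minimal sketch (the function name scale is my own, and it assumes a compiler that still accepts old-style definitions, which most do in their default modes):

#include <stdio.h>

/* K&R-style definition: it provides no prototype, so callers' arguments
   undergo the default promotions (float -> double, short/char -> int).
   The parameter is therefore declared double, matching the promoted type. */
double scale(x)
double x;
{
    return x * 2.0;
}

int main(void)
{
    float f = 1.5f;
    /* No prototype is in scope at the call, so f is promoted to double
       before being passed, which matches the definition above. */
    printf("%f\n", scale(f));
    return 0;
}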

Note that the syntax for prototypes is different:

// K&R style function
// number of parameters is UNKNOWN, but fixed
// return type is known (int is default)
add_numbers();

// ANSI style function
// number of parameters is known, types are fixed
// return type is known
int ansi_add_numbers(int x, int y);
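In practice the difference matters for error checking. Here is a small sketch of what each declaration style lets the compiler catch (the misuse function and the bogus three-argument call are hypothetical, and it assumes a C89/C99 mode, since in C23 an empty parameter list now means "no parameters"):

int add_numbers();                    /* K&R-style declaration: parameter info unknown */
int ansi_add_numbers(int x, int y);   /* ANSI prototype: number and types are fixed    */

int misuse(void)
{
    /* With the K&R declaration, this compiles silently even though the
       definition takes two arguments; formally it is undefined behaviour,
       and on most ABIs the extra argument is simply ignored. */
    int a = add_numbers(1, 2, 3);

    /* With the prototype, the same mistake is a compile-time error:
           ansi_add_numbers(1, 2, 3);   <- "too many arguments to function" */
    int b = ansi_add_numbers(1, 2);

    return a + b;
}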

A common practice back in the old days was to avoid header files for the most part, and just stick the prototypes directly in your code:

void *malloc();

char *buf = malloc(1024);
if (!buf) abort();

Header files are accepted as a necessary evil in C these days, but old-timers didn't really like using them either, just as more modern C descendants (Java, C#, etc.) have done away with header files altogether.
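Incidentally, skipping even that hand-written declaration is where implicit declarations bite hardest today. A hedged sketch of the failure mode (the function name use_it is made up; it assumes a 64-bit platform and a compiler still willing, with warnings, to accept the call):

/* No declaration of malloc is visible here at all, so pre-C99 rules make
   the compiler assume it returns int: on a 64-bit machine the returned
   pointer is silently truncated to 32 bits, and the program crashes or
   corrupts memory.  #include <stdlib.h>, or the hand-written declaration
   shown above, avoids this. */
int use_it(void)
{
    char *buf = malloc(1024);
    return buf != 0;
}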

Type safety

From what I understand about the old old days of pre-C, there wasn't always much of a static typing system. Everything was an int, including pointers. In this old language, the only point of function prototypes would be to catch arity errors.

So if we hypothesize that functions were added to the language first and a static type system only later, that theory explains why prototypes are optional. It also explains why arrays decay to pointers when used as function arguments: in this proto-C, arrays were nothing more than pointers that were automatically initialized to point to some space on the stack. For example, something like the following may have been possible:

function()
{
    auto x[7];   /* x is a pointer, automatically aimed at 7 cells on the stack */
    x += 1;      /* ...so it can be reassigned like any other pointer           */
}

Citations

On typelessness:

Both languages [B and BCPL] are typeless, or rather have a single data type, the 'word,' or 'cell,' a fixed-length bit pattern.

On the equivalence of integers and pointers:

Thus, if p is a cell containing the index of (or address of, or pointer to) another cell, *p refers to the contents of the pointed-to cell, either as a value in an expression or as the target of an assignment.

Evidence for the theory that prototypes were omitted due to size constraints:

During development, he continually struggled against memory limitations: each language addition inflated the compiler so it could barely fit, but each rewrite taking advantage of the feature reduced its size.

不美不萌又怎样
#4 · 2019-01-24 05:02

See Dennis Ritchie's "The Development of the C Language": http://cm.bell-labs.com/who/dmr/chist.html

For instance,

In contrast to the pervasive syntax variation that occurred during the creation of B, the core semantic content of BCPL—its type structure and expression evaluation rules—remained intact. Both languages are typeless, or rather have a single data type, the 'word', or 'cell', a fixed-length bit pattern. Memory in these languages consists of a linear array of such cells, and the meaning of the contents of a cell depends on the operation applied. The + operator, for example, simply adds its operands using the machine's integer add instruction, and the other arithmetic operations are equally unconscious of the actual meaning of their operands. Because memory is a linear array, it is possible to interpret the value in a cell as an index in this array, and BCPL supplies an operator for this purpose. In the original language it was spelled rv, and later !, while B uses the unary *. Thus, if p is a cell containing the index of (or address of, or pointer to) another cell, *p refers to the contents of the pointed-to cell, either as a value in an expression or as the target of an assignment.

This typelessness persisted in C until the authors started porting it to machines with different word lengths:

The language changes during this period, especially around 1977, were largely focused on considerations of portability and type safety, in an effort to cope with the problems we foresaw and observed in moving a considerable body of code to the new Interdata platform. C at that time still manifested strong signs of its typeless origins. Pointers, for example, were barely distinguished from integral memory indices in early language manuals or extant code; the similarity of the arithmetic properties of character pointers and unsigned integers made it hard to resist the temptation to identify them. The unsigned types were added to make unsigned arithmetic available without confusing it with pointer manipulation. Similarly, the early language condoned assignments between integers and pointers, but this practice began to be discouraged; a notation for type conversions (called `casts' from the example of Algol 68) was invented to specify type conversions more explicitly. Beguiled by the example of PL/I, early C did not tie structure pointers firmly to the structures they pointed to, and permitted programmers to write pointer->member almost without regard to the type of pointer; such an expression was taken uncritically as a reference to a region of memory designated by the pointer, while the member name specified only an offset and a type.

Programming languages evolve as programming practices change. In modern C and the modern programming environment, where many programmers have never written assembly language, the notion that ints and pointers are interchangeable may seem nearly unfathomable and unjustifiable.

何必那么认真
#5 · 2019-01-24 05:02

Some food for thought. (It's not an answer; we actually know the answer — it's permitted for backward compatibility.)

And people should look at the COBOL code bases or f66 libraries out there before asking why this hasn't been cleaned up in 30 years or so!

gcc with its default settings does not emit any warnings.

With -Wall, or with -std=c99, gcc does emit the right warnings:

main.c:2: warning: type defaults to ‘int’ in declaration of ‘bar’
main.c:3: warning: implicit declaration of function ‘foo’

The lint functionality built into modern gcc is showing its colors.

Interestingly, the modern clone of lint, the secure lint (I mean splint), gives only one warning by default:

main.c:3:10: Unrecognized identifier: foo
  Identifier used in code has not been declared. (Use -unrecog to inhibit
  warning)

The LLVM C compiler, clang, which like gcc also has a built-in static analyser, emits both warnings by default:

main.c:2:10: warning: type specifier missing, defaults to 'int' [-Wimplicit-int]
  static bar = 7; // defaults to "int bar"
  ~~~~~~ ^
main.c:3:10: warning: implicit declaration of function 'foo' is invalid in C99
      [-Wimplicit-function-declaration]
  return foo(bar); // defaults to a "int foo()"
         ^

People used to think we wouldn't need backward compatibility with 1980s stuff, and that all of that code would be cleaned up or replaced by now. It turns out that's not the case; a lot of production code is still stuck in prehistoric, non-standard times.

EDIT:

I didn't look through the other answers before posting mine, so I may have misunderstood the intention of the poster. But the thing is, there was a time when you hand-compiled your code and used toggle switches to put the binary pattern into memory. Those programmers didn't need a "type system", and neither did the PDP machine in front of which Ritchie and Thompson posed like this:

[photo: Ritchie and Thompson at a PDP-11]

Don't look at the beards, look at the "toggles", which I heard were used to bootstrap the machine.

And also look at how they used to boot UNIX, in this paper from the Unix 7th Edition manual:

http://wolfram.schneider.org/bsd/7thEdManVol2/setup/setup.html

The point of the matter is that they didn't need much of a software layer to manage a machine with KB-sized memory. Knuth's MIX has 4000 words. You don't need all these types to program a MIX computer; you can happily compare an integer with a pointer on a machine like that.

I thought the reason they did this was fairly self-evident, so I focused on how much is left to be cleaned up.

神经病院院长
#6 · 2019-01-24 05:10

It's the usual story — hysterical raisins (aka 'historical reasons').

In the beginning, the big computers that C ran on (DEC PDP-11) had 64 KiB for data and code (later 64 KiB for each). There was a limit to how complex you could make the compiler and still have it run. Indeed, there was scepticism that you could write an O/S using a high-level language such as C, rather than needing to use assembler. So, there were size constraints. Also, we are talking a long time ago, in the early to mid 1970s. Computing in general was not as mature a discipline as it is now (and compilers specifically were much less well understood). Also, the languages from which C was derived (B and BCPL) were typeless. All these were factors.

The language has evolved since then (thank goodness). As has been extensively noted in comments and down-voted answers, in strict C99, implicit int for variables and implicit function declarations have both been removed from the language. However, most compilers still recognize the old syntax and permit its use, with more or fewer warnings, to retain backwards compatibility, so that old source code continues to compile and run as it always did. C89 largely standardized the language as it was, warts (gets()) and all. This was necessary to make the C89 standard acceptable.

There is still old code around using the old notations — I spend quite a lot of time working on an ancient code base (circa 1982 for the oldest parts) which still hasn't been fully converted to prototypes everywhere (and that annoys me intensely, but there's only so much one person can do on a code base with multiple millions of lines of code). Very little of it still has 'implicit int' for variables, but there are still too many places where functions are not declared before use, and a few places where the return type of a function is still implicitly int. If you don't have to work with such messes, be grateful to those who have gone before you.
