Should References in Object-Oriented Programming Languages Be Non-Nullable by Default?

Published 2019-04-24 09:11

Question:

Null pointers have been described as the "billion dollar mistake". Some languages have reference types which can't be assigned the null value.

I wonder whether, in designing a new object-oriented language, the default behavior should be for references to prevent being assigned null. A special form of the type declaration could then be used to override this behavior. For example:

MyClass notNullable = new MyClass();
notNullable = null; // Error!
// a la C#, where "T?" means "Nullable<T>"
MyClass? nullable = new MyClass();
nullable = null; // Allowed

So my question is, is there any reason not to do this in a new programming language?

EDIT:

I wanted to add that a recent comment on my blog pointed out that non-nullable types have a particular problem when used in arrays. I also want to thank everyone for their useful insights; they were very helpful, and I'm sorry I could only choose one answer.
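To make the array problem concrete, here is a sketch in Kotlin (a language that did adopt non-nullable references by default); Widget is a made-up class. An array with a non-nullable element type must be given an initializer for every slot; otherwise the element type has to become nullable:

class Widget(val id: Int)

fun main() {
    // A non-nullable element type forces every slot to be initialized
    // at construction time; Kotlin does this with an initializer lambda.
    val full: Array<Widget> = Array(3) { i -> Widget(i) }

    // Without an initializer, the only honest type is an array of
    // nullable slots, which reintroduces null at the element level.
    val holes: Array<Widget?> = arrayOfNulls(3)

    println(full[0].id)    // safe: an element can never be null
    println(holes[0]?.id)  // a safe call is required: prints "null"
}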

Answer 1:

The main obstruction I see to non-nullable reference types by default is that some portion of the programming community prefers the create-set-use pattern:

x = new Foo()
x.Prop <- someInitValue
x.DoSomething()

to overloaded constructors:

x = new Foo(someInitValue)
x.DoSomething()

and this leaves the API designer in a bind with regards to the initial value of instance variables that might otherwise be null.

Of course, like 'null' itself, the create-set-use pattern creates lots of meaningless object states and prevents useful invariants, so being rid of it is really a blessing rather than a curse. However, it does affect API design in a way that many people will be unfamiliar with, so it's not something to do lightly.
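One hedged aside: languages with named and default constructor arguments can keep the create-set-use ergonomics without leaving any field null. A small Kotlin sketch, with a made-up Foo:

// Named and default constructor arguments: every field has a real
// value from the moment the object exists.
class Foo(
    val prop: String = "default",  // the "otherwise null" field gets a default
    val retries: Int = 3
) {
    fun doSomething() = println("prop=$prop, retries=$retries")
}

fun main() {
    // Reads almost like create-set-use, but the object is fully
    // initialized the moment it exists.
    Foo(prop = "someInitValue").doSomething()
    Foo().doSomething()  // all defaults
}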

But overall, yes, if there is a great cataclysm that destroys all existing languages and compilers, one can only hope that when we rebuild we will not repeat this particular mistake. Nullability is the exception, not the rule!



Answer 2:

I like the OCaml way of dealing with the 'maybe null' issue. Whenever a value of type 'a might be unknown/undefined/uninitialized, it is wrapped in an 'a option type, which can be either None or Some x, where x is the actual non-nullable value. When accessing the x you need to use pattern matching to unwrap it. Here is a function that increments an optional integer and returns 0 for None:

# let f = function  Some x -> x+1 | None -> 0 ;;
val f : int option -> int = <fun>

How it works:

# f (Some 5) ;;
- : int = 6
# f None ;;
- : int = 0

The matching mechanism sort of forces you to consider the None case. Here's what happens when you forget it:

# let f = function  Some x -> x+1 ;;
Characters 8-31:
let f = function  Some x -> x+1 ;;
        ^^^^^^^^^^^^^^^^^^^^^^^
Warning P: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
None
val f : int option -> int = <fun>

(This is just a warning, not an error. But if you now pass None to the function, you'll get a Match_failure exception at run time.)

Variant types plus pattern matching are a generic mechanism: the same exhaustiveness check fires when, say, you match a list against head :: tail only and forget the empty-list case.



Answer 3:

Even better, disable null references entirely. In the rare cases where "nothing" is a valid value, there can be an object state that represents it (a null object), but a reference would still point to that object rather than holding a zero value.
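This is essentially the Null Object pattern. A minimal sketch in Kotlin, with made-up Logger/Service types:

// "Nothing" is modeled as a real object state, so references
// never hold the value null.
interface Logger {
    fun log(msg: String)
}

object NoLogger : Logger {            // the "nothing" state is a real object
    override fun log(msg: String) {}  // deliberately does nothing
}

class Service(private val logger: Logger = NoLogger) {
    fun run() = logger.log("running") // no null check ever needed
}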



Answer 4:

As I understand it, Martin Odersky's rationale for including null in Scala was easy interoperability with Java libraries (i.e., so that all their APIs don't appear to have, e.g., "Object?" all over the place):

http://www.artima.com/scalazine/articles/goals_of_scala.html

Ideally, I think null should be included in the language as a feature, but non-nullable should be the default for all types. It would save lots of time and prevent errors.



Answer 5:

The biggest "null-related mistake" in language design is the lack of a trap when indexing null pointers. Many compilers that trap when trying to dereference a null pointer will not trap if one adds an offset to the pointer and tries to dereference that. In the C standard, even adding the offset to a null pointer is Undefined Behavior, and the performance cost of checking the pointer there would be no worse than checking at the dereference (especially since a compiler that checked the pointer for null before adding the offset might not need to re-check afterward).

As for language support for non-nullable variables, it may be useful to have a means of requesting that writes to certain variables or fields, whose declarations include an initial value, be automatically tested, so that an immediate exception occurs on any attempt to write null to them. Arrays could include a similar feature if there were an efficient idiomatic way of constructing an array by constructing all the elements first, without making the array object itself available before construction is complete. Note that there should probably also be a way of specifying a cleanup function to be called on all previously-constructed elements if an exception occurs before all elements have been constructed.

Finally, it would be helpful if one could specify that certain instance members should be invoked with non-virtual calls, and should be invokable even on null items. Something like String.IsNullOrEmpty(someStringVariable) is hideous compared with someStringVariable.IsNullOrEmpty.
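Kotlin's extension functions provide roughly this: an extension whose receiver type is nullable dispatches statically (non-virtually) and may be invoked on a null reference. A sketch; isMissing is a made-up helper, while isNullOrEmpty() really is in the standard library:

// A made-up helper: the nullable receiver type (String?) means the
// call is legal, and statically dispatched, even when the reference is null.
fun String?.isMissing(): Boolean = this == null || this.isEmpty()

fun main() {
    val s: String? = null
    println(s.isMissing())      // true, with no NullPointerException
    println("hi".isMissing())   // false
    println(s.isNullOrEmpty())  // the stdlib version works the same way
}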



Answer 6:

Null is only a problem because developers don't check that something is valid before using it; if people misuse the new nullable construct in the same way, it will not have solved any real problem.

What matters is that every variable that can be null is checked before it is used. If that means using annotations to allow bypassing the check, that may make sense; otherwise, the compiler should refuse to compile until you add the check.

We put more and more logic into compilers to protect developers from themselves, which is scary and very sad, as we know what we should do, and yet sometimes skip steps.

So, your solution will also be subject to abuse, and we will be back to where we started, unfortunately.

UPDATE:

Based on some comments, there was a theme in my answers that I should have made more explicit originally:

Basically, if the goal is to limit the impact of null variables, have the compiler raise an error whenever a variable is not checked for null; if you want to assume that it will never be null, require an annotation to skip the check. This way you give people the ability to assume, but you also make it easy to find all the places in the code that make the assumption, so a code review can evaluate whether the assumption is valid.

This protects the code without limiting the developer, while making it easy to see where a value is assumed not to be null.
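For what it's worth, Kotlin behaves much like this proposal: the compiler rejects unchecked use of a nullable value, and the opt-out, the !! operator, is explicit and easy to grep for in a code review. A sketch:

fun describe(name: String?): Int {
    // return name.length      // rejected by the compiler: "name" may be null
    if (name != null) {
        return name.length     // smart cast: the check satisfies the compiler
    }
    return 0
}

// The explicit "assume not null" escape hatch: asserts at runtime,
// and is trivially searchable during code review.
fun assumeChecked(name: String?): Int = name!!.length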

I believe we need flexibility, and I would rather have the compilation take longer than have something negatively impact the runtime, and I think my solution would do what is desired.



Answer 7:

No.

An "uninitialized" state will exist in some fashion out of logical necessity; currently its denotation is null.

Perhaps a "valid but uninitialized" concept of an object can be designed in, but how is that significantly different? The semantics of "accessing an uninitialized object" would still exist.

A better route is a static-time check that you never access an object that has not been assigned to (I can't think of anything off the top of my head that would defeat such a check, besides string evals).
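Definite-assignment analysis is exactly such a static check, and several compilers already perform it. A sketch of how it looks in Kotlin:

// The compiler proves a variable is assigned on every path
// before it may be read.
fun pick(flag: Boolean): String {
    val label: String   // declared, not yet initialized
    if (flag) {
        label = "yes"
    } else {
        label = "no"
    }
    return label        // OK: definitely assigned on all paths
    // Removing the else-branch makes this a compile-time error:
    // "Variable 'label' must be initialized".
}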