Possible Duplicate:
Constant abuse?
I've seen -1
used in various APIs, most commonly when searching into a "collection" with zero-based indices, usually to indicate the "not found" index. This "works" because -1
is never a legal index to begin with. It seems that any negative number should work, but I think -1
is almost always used, as some sort of (unwritten?) convention.
I would like to limit the scope to Java at least for now. My questions are:
- What are the official words from Sun regarding using
-1
as a "special" return value like this? - What quotes are there regarding this issue, from e.g. James Gosling, Josh Bloch, or even other authoritative figures outside of Java?
- What were some of the notable discussions regarding this issue in the past?
The same practice as with
null
applies to-1
. Its been discussed many times.e.g. Java api design - NULL or Exception
Both Java and JavaScript use
-1
when an index isn't found. Since the index is always0-n
it seems a pretty obvious choice.I find this approach (which I've used when extending Array-type elements to have a
.indexOf()
method) to be quite normal.On the other hand, you can try the PHP approach e.g. strpos() but IMHO it gets confusing as there are multiple return types (it returns FALSE when not found)
-1 as a return value is slightly ugly but necessary. The alternatives to signal a "not found" condition are IMHO all much worse:
You could throw an Exception, but this isn't ideal because Exceptions are best used to signal unexpected conditions that require some form of recovery or propagated failure. Not finding an occurrence of a substring is actually pretty expected. Also Exception throwing has a significant performance penalty.
You could use a compound result object with (found,index) but this requires an object allocation and more complex code on the part of the caller to inspect the result.
You could separate out two separate function calls for contains and indexOf - however this is again quite cumbersome for the caller and also results in a performance hit as both calls would be O(n) and require a full traversal of the String.
Personally, I never like to refer to the -1 constant: my test for not-found is always something like:
Its used because its the first invalid value you encounter in 0-based arrays. As you know, not all types can hold null or nothing so need "something" to signify nothing.
I would say its not official, it has just become convention (unwritten) because its very sensible for the situation. Personally, I wouldn't also call it an issue. API design is also down to the author, but guidelines can be found online.
As far as I know, such values are called sentinel values, although most common definitions differ slightly from this scenario.
Languages such as Java chose to not support passing by reference (which I think is a good idea), so while the values of individual arguments are mutable, the variables passed to a function remain unaffected. As a consequence of this, you can only have one return value of only one type. So what you do is to chose an otherwise invalid value of a valid type, and return it to transport additional semantics, because the return value is not actually the return value of the operation but a special signal.
Now I guess, the cleanest approach would be to have a
contains
and anindexOf
method, the second of which would throw an exception, if the element you're asking for is not in the collection. Why? Because one would expect the following to be true:What you're likely to get is an exception because
-1
is out of bounds, while the actual reason why this plausible relation is not true is, thatsomeObject
is not an element ofsomeCollection
, and that is why the inner call should raise the exception.Now as clean and robust, as this may be, it has two key flaws:
In the end, it's up to you to decide. This is a matter of philosophy. I'd call it a "semantic hack" to achieve both shortness & speed at the cost of robustness. Your call ;)
greetz
back2dos
This is a common idiom in languages where the types do not include range checks. An "out of bounds" value is used to indicate one of several conditions. Here, the return value indicates two things: 1) was the character found, and 2) where was it found. The use of -1 for
not found
and a non-negative index forfound
succinctly encodes both of these into one value, and the fact thatnot-found
does not need to return an index.In a language with strict range checking, such as Ada or Pascal, the method might be implemented as (pseudo code)
Positive
is a subtype of int, but restricted to non-negative values.This separates the found/not-found flag from the position. The position is provided as an out parameter - essentialy another return value. It could also be an in-out parameter, to start the search from a given position. Use of -1 to indicate not-found would not be allowed here since it violates range checks on the Positive type.
The alternatives in java are:
boolean indexOf(char c); int lastFoundIndex();
. This implies the object must hold on to state, which will not work in a concurrent program, unless the state is stored in thread-local storage, or synchronization is used - all considerable overheads.boolean indexOf(char c, Position pos)
. Here, creating the position object may be seen as unnecessary overhead.such as
although it clearly separates the return values, it suffers object creation overhead. Some of that could be mitigated by passing the
FindIndex
as a parameter, e.g.Incidentally, multiple return values were going to be part of java (oak), but were axed prior to 1.0 to cut time to release. James Gosling says he wishes they had been included. It's still a wished-for feature.
My take is that use of magic values are a practical way of encoding a multi-valued results (a flag and a value) in a single return value, without requiring excessive object creation overhead.
However, if using magic values, it's much nicer to work with if they are consistent across related api calls. For example,
Java falls short here, since the use of -1 in the call to
substring
will cause anIndeOutOfBoundsException
. Instead, it might have been more consistent for substring to return "" when invoked with -1, if negative values are considered to start at the end of the string. Critics of magic values for error conditions say that the return value can be ignored (or assumed to be positive). A consistent api that handles these magic values in a useful way would reduce the need to check for -1 and allow for cleaner code.