It's common to use the end keyword as a shortcut for accessing or extending an array in MATLAB, as in
>> x = [1,2,3];
>> x(1:end-1)
ans =
1 2
>> x(end+1) = 4
x =
1 2 3 4
However, I was surprised to find that the following also works
>> x(1:min(5, end))
ans =
1 2 3 4
I thought that end might be a special form, like :, that can be special-cased in indexing operations, so I created a class to detect this
classdef IndexDisplayer
    methods
        function subsref(self, s)
            disp(s);
        end
    end
end
You can see how : is special-cased in the following example
>> a = IndexDisplayer;
>> a(1:3)
type: '()'
subs: {[1 2 3]}
>> a(:)
type: '()'
subs: {':'}
However, when I index with end I just see
>> a(end)
type: '()'
subs: {[1]}
Here the end is replaced with a 1. Where does that 1 come from? My first guess was that any end inside an indexing expression x(end) would be replaced with a call to length(x), so I tried overriding length as well
classdef IndexDisplayer
    methods
        function subsref(self, s)
            disp(s);
        end
        function len = length(self)
            len = 10;
        end
    end
end
However, that gives
>> a = IndexDisplayer;
>> length(a)
ans =
10
>> a(end)
type: '()'
subs: {[1]}
so that theory is out the window. Can anyone explain the semantics of end?
Firstly, I think it's kind of a bug, or at least an unexpected feature, that your syntax x(1:min(5, end)) works at all. When I was at MathWorks, I remember someone pointing this out, and quite a few of the developers had to spend a while figuring out what was going on. I'm not sure if they ever really agreed whether it was a problem or not.
To explain the (intended) semantics of end: end is implemented as a function ind = end(obj, k, n). k is the index of the expression containing end, and n is the total number of indices in the expression.
So, for example, when you call a(1,end,1), k is 2, as the end is in argument 2, and n is 3, as there are 3 arguments.
ind is returned as the index that can replace end in the expression.
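As a rough illustration of what ind works out to for an ordinary numeric array (my understanding, not something stated above): when end is in position k < n it stands for size(A, k), and when it is in the last position it stands for the product of the remaining dimensions. The array A and the variable names below are made up for the example.

```matlab
A = reshape(1:24, 2, 3, 4);   % 2-by-3-by-4 array

% One subscript (k = 1, n = 1): end stands for numel(A)
lastLinear = A(end);          % same element as A(24)

% Two subscripts, end last (k = 2, n = 2): dimensions 2 and 3 are folded
% together, so end stands for 3*4 = 12
lastFolded = A(1, end);       % same element as A(1, 12)

% Three subscripts, end in the middle (k = 2, n = 3): end stands for size(A, 2)
midDim = A(1, end, 1);        % same element as A(1, 3, 1)
```

If objects follow the same size-based rule, a scalar object (1-by-1) would give 1 for a(end), which is consistent with the output in the question and would explain why overriding length made no difference. I haven't checked that this is exactly what the built-in does for objects.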
You can overload end for your own classes (in the same way as you can overload colon, size, subsref, etc.).
To extend your example:
classdef IndexDisplayer
    methods
        function ind = end(self,k,n)
            disp(k)
            disp(n)
            ind = builtin('end', self, k, n);
        end
    end
end
>> a = IndexDisplayer;
>> a(1,end,1)
2
3
See here for more information.
I find this a curiosity too. Nevertheless, I often use (exploit?) this behavior to shorten statements. For example, in this answer, to get all but the kth element(s) of a vector, a clean solution that occurred to me was
vector(setdiff(1:end,k))
This end replaces a call to numel(vector). For a scalar k, this is an alternative to vector(1:end ~= k) or vector([1:k-1 k+1:end]). It seemed perfectly reasonable at the time, although I drew attention to the oddity of this usage. Is this really bad practice? Perhaps, but I've accepted it for what it's worth and moved on.
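A quick check that the three forms agree for a scalar k; the sample values for vector and k are just placeholders I picked for the comparison.

```matlab
vector = [10 20 30 40 50];
k = 3;

% end inside the setdiff call still refers to vector
viaSetdiff = vector(setdiff(1:end, k));

% logical mask: the colon binds tighter than ~=, so this is (1:end) ~= k
viaMask = vector(1:end ~= k);

% explicit concatenation of the index ranges before and after k
viaRanges = vector([1:k-1 k+1:end]);

sameResult = isequal(viaSetdiff, viaMask, viaRanges);
```

One difference worth keeping in mind: setdiff returns sorted, unique indices, so the setdiff form also works when k is a vector of indices, whereas the other two are written for a scalar k.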
I don't offer any insight into how this works or what the rules are, as Sam Roberts does in his answer, but conceptually, I see this as a matter of context. That is, when end occurs, I would assume it evaluates to an index (or dimension subscript) for the array with the most immediate scope, looking "up" through nested statements to make the determination. Not sure if that is the right wording, but it seems to be a useful way to interpret the operation of end.
I haven't been bitten by this interpretation yet.
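To make the "most immediate scope" reading concrete, here is a small sketch contrasting end inside a nested indexing expression with end inside a function call. The variables x and idx are made up for the illustration.

```matlab
x = [10 20 30 40];
idx = [3 1 2];

% idx(...) is an indexing expression, so end refers to idx (3 elements):
% idx(end) is 2, and the result is x(2)
viaInner = x(idx(end));

% min(...) is a function call, not indexing, so end refers to x (4 elements):
% min(5, end) is 4, and the result is x(4)
viaCall = x(min(5, end));
```

So the thing to watch is whether the name in front of the innermost parentheses is a variable or a function: a variable named min in the workspace would change what end refers to in the second line.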