I've been programming for quite a bit and recently started learning more pure Computer Science topics (for a job interview).
I know the difference between an Array and a LinkedList data structure, but now that I have started using Java I'm seeing this ArrayList, which I'm having trouble conceptualizing.
Web searches have only really shown me HOW to use them and WHEN to use them (benefits of each), but nothing can answer my question of:
What is an ArrayList? My assumption is that it is a list that maintains memory references to each element, making it also able to act like an array.
I also have a feeling since Java is open, that I should be able to look at the Class definition, but haven't figured out how to do that yet either.
Thanks!
I like to think of it as a data-structure that lets you enjoy both worlds, the quick-access to an index like with an array and the infinite growth of a list. Of course, there are always trade-offs.
ArrayList
is actually a wrapper to an array. Every time the size of the array ends, a new array, twice the size, is created and all the data from the original array is copied to the new one.
From the java doc:
Resizable-array implementation of the List interface. Implements all
optional list operations, and permits all elements, including null. In
addition to implementing the List interface, this class provides
methods to manipulate the size of the array that is used internally to
store the list. (This class is roughly equivalent to Vector, except
that it is unsynchronized.) The size, isEmpty, get, set, iterator, and
listIterator operations run in constant time. The add operation runs
in amortized constant time, that is, adding n elements requires O(n)
time. All of the other operations run in linear time (roughly
speaking). The constant factor is low compared to that for the
LinkedList implementation.
Each ArrayList instance has a capacity. The capacity is the size of
the array used to store the elements in the list. It is always at
least as large as the list size. As elements are added to an
ArrayList, its capacity grows automatically. The details of the growth
policy are not specified beyond the fact that adding an element has
constant amortized time cost.
An application can increase the capacity of an ArrayList instance
before adding a large number of elements using the ensureCapacity
operation. This may reduce the amount of incremental reallocation.
This allows O(1) access for most of the operations like it would take with an array. Once in a while you need to pay for this performance with an insert operation that takes much longer though.
This is called amortized complexity. Each operation takes only O(1) aside for those times you need to double the size of the array. In those time you would pay O(n) but if you average it over n operations, the average time taken is only O(1) and not O(n).
Let's take an example:
We have an array of size 100 (n=100). You make 100 insert operations (to different indices) and each of them takes only O(1), of course that all get-by-index operations also take O(1) (as this is an array). On the 101 insertion, there's no more more capacity in the array so the ArrayList will create a new array, the size of 200, copy all the values to it (O(n) operations) and then insert the 101st item. Until you fill out the array to 200 items, all of the operations would take O(1).
An ArrayList
is a list that is directly backed by an array. More specifically, it's backed by an array that is dynamically resized. You can read a bit more about it in its source code; there are some pretty good comments to it.
The reason that this is significant is due to how a LinkedList
is implemented - as a traditional collection of nodes and references to other nodes. This has performance impacts in indexing and traversal, whereas with an ArrayList
, since it's backed by an array, all one needs to do is index into the specific array to retrieve the value.