I was wondering how stream().toArray(x -> new Integer[x]) knows what size of array to create. I wrote a snippet in which I created a list of four integers and filtered the values, and it created an array whose length matched the filtered stream. I could not see any method on Stream to get the size of the stream.
List<Integer> intList = new ArrayList<Integer>();
intList.add(1);
intList.add(2);
intList.add(3);
intList.add(4);
Integer[] array = intList.stream()
.filter(x -> x > 2)
.toArray(x -> {
System.out.println("x --> " + x);
return new Integer[x];
});
System.out.println("array length: " + array.length);
Output of above code:
x --> 2
array length: 2
Initially, the snippet looked like this:
Integer[] array = intList.stream()
.filter(x -> x > 2)
.toArray(x -> new Integer[x]);
To see what value of x gets passed, I changed the lambda to print x.
Of course, this is implementation dependent. For some streams, the size is predictable, if the source has a known size and no size-changing intermediate operation is involved. Since you are using a filter operation, this doesn’t apply; however, there is an estimated size, based on the unfiltered count.
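To make the difference between a known size and an estimate concrete, here is a minimal sketch that inspects the stream's spliterator. The class and method names are made up for illustration, and the estimate reported for the filtered stream depends on the JRE, so no particular value is promised for it.

```java
import java.util.List;

class SizeEstimateDemo {
    // Exact element count if the stream is known to be sized, otherwise -1.
    static long exactSize(List<Integer> list) {
        return list.stream().spliterator().getExactSizeIfKnown();
    }

    // Same query after a filter: the exact size can no longer be known up front.
    static long exactSizeAfterFilter(List<Integer> list) {
        return list.stream().filter(x -> x > 2).spliterator().getExactSizeIfKnown();
    }

    // Only an estimate remains; it may simply reflect the unfiltered source.
    static long estimateAfterFilter(List<Integer> list) {
        return list.stream().filter(x -> x > 2).spliterator().estimateSize();
    }

    public static void main(String[] args) {
        List<Integer> intList = List.of(1, 2, 3, 4);
        System.out.println("exact, unfiltered: " + exactSize(intList));
        System.out.println("exact, filtered:   " + exactSizeAfterFilter(intList));
        System.out.println("estimate, filtered: " + estimateAfterFilter(intList));
    }
}
```

The unfiltered list reports its exact size, while the filtered stream reports -1 for the exact size and can only offer an estimate.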
Now, the Stream implementation simply allocates a temporary buffer, either using the estimated size or a default size with support for increasing the capacity, if necessary, and copies the data into the destination array, created by your function, in a final step.
The intermediate buffers could be created via the supplied function, which is the reason why the documentation states “…using the provided generator function to allocate the returned array, as well as any additional arrays that might be required for a partitioned execution or for resizing” and I vaguely remember seeing such a behavior in early versions. However, the current implementation just uses Object[] arrays (or Object[][] in a “spined buffer”) for intermediate storage and uses the supplied function only for creating the final array. Therefore, you can’t observe intermediate array creation with the function, given this specific JRE implementation.
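One way to check this on your own JRE is to record every size the generator is asked for. The following is a rough sketch with hypothetical names (GeneratorCallsDemo, evens); how many calls you see is implementation dependent, so the code only relies on the final array having the filtered length.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

class GeneratorCallsDemo {
    // Filters the even numbers in [0, n) and records each size passed to the generator.
    static Integer[] evens(int n, List<Integer> requestedSizes) {
        return IntStream.range(0, n)
                .boxed()
                .filter(i -> i % 2 == 0)
                .toArray(size -> {
                    requestedSizes.add(size); // observe what the stream asks for
                    return new Integer[size];
                });
    }

    public static void main(String[] args) {
        List<Integer> requestedSizes = new ArrayList<>();
        Integer[] result = evens(10_000, requestedSizes);
        System.out.println("generator was asked for sizes: " + requestedSizes);
        System.out.println("result length: " + result.length);
    }
}
```

If the implementation behaves as described above, the recorded list should contain only the final size, even though far more elements than any default buffer capacity passed through the filter.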
The thing is: this is a terminal operation. It runs at the end, once the stream has been processed, so the final count is known by then; there are no further operations that could add or remove values.
Simply look at Java's Stream documentation of toArray.
<A> A[] toArray(IntFunction<A[]> generator)
Returns an array containing the elements of this stream, using the provided generator function to allocate the returned array, as well as any additional arrays that might be required for a partitioned execution or for resizing.
This is a terminal operation.
API Note:
The generator function takes an integer, which is the size of the desired array, and produces an array of the desired size. This can be concisely expressed with an array constructor reference
Therefore toArray does give you the desired array size as a parameter, and you are responsible for allocating a correctly sized array, at least when using this method. This method is a terminal operation, so the size calculation is done within the internals of the Stream API.
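The array constructor reference mentioned in the API note can be sketched as follows. The class and method names are only for illustration; Integer[]::new is equivalent to a lambda like size -> new Integer[size].

```java
import java.util.Arrays;
import java.util.List;

class ArrayConstructorRefDemo {
    // Integer[]::new receives the desired size from the stream and allocates the array.
    static Integer[] greaterThanTwo(List<Integer> values) {
        return values.stream()
                .filter(value -> value > 2)
                .toArray(Integer[]::new);
    }

    public static void main(String[] args) {
        Integer[] array = greaterThanTwo(List.of(1, 2, 3, 4));
        System.out.println(Arrays.toString(array)); // prints [3, 4]
    }
}
```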
IMHO it is easier to grasp if you name your lambda parameters differently for filter and toArray.
Integer[] array = intList.stream()
.filter(myint -> myint > 2)
.toArray(desiredArraySize -> new Integer[desiredArraySize]);