I would like to get an answer pointing out the reasons why the following idea described below on a very simple example is commonly considered bad and know its weaknesses.
I have a sentence of words and my goal is to make every second one to uppercase. My starting point for both of the cases is exactly the same:
String sentence = "Hi, this is just a simple short sentence";
String[] split = sentence.split(" ");
The traditional and procedural approach is:
StringBuilder stringBuilder = new StringBuilder();
for (int i=0; i<split.length; i++) {
if (i%2==0) {
stringBuilder.append(split[i]);
} else {
stringBuilder.append(split[i].toUpperCase());
}
if (i<split.length-1) { stringBuilder.append(" "); }
}
When want to use java-stream the use is limited due the effectively-final or final variable constraint used in the lambda expression. I have to use the workaround using the array and its first and only index, which was suggested in the first comment of my question How to increment a value in Java Stream. Here is the example:
int index[] = {0};
String result = Arrays.stream(split)
.map(i -> index[0]++%2==0 ? i : i.toUpperCase())
.collect(Collectors.joining(" "));
Yeah, it's a bad solution and I have heard few good reasons somewhere hidden in comments of a question I am unable to find (if you remind me some of them, I'd upvote twice if possible). But what if I use AtomicInteger
- does it make any difference and is it a good and safe way with no side effects compared to the previous one?
AtomicInteger atom = new AtomicInteger(0);
String result = Arrays.stream(split)
.map(i -> atom.getAndIncrement()%2==0 ? i : i.toUpperCase())
.collect(Collectors.joining(" "));
Regardless of how ugly it might look for anyone, I ask for the description of possible weaknesses and their reasons. I don't care the performance but the design and possible weaknesses of the 2nd solution.
Please, don't match AtomicInteger with multi-threading issue. I used this class since it receives, increments and stores the value in the way I need for this example.
As I often say in my answers that "Java Stream-API" is not the bullet for everything. My goal is to explore and find the edge where is this sentence applicable since I find the last snippet quite clear, readable and brief compared to StringBuilder
's snippet.
Edit: Does exist any alternative way applicable for the snippets above and all the issues when it’s needed to work with both item and index while iteration using Stream-API?
The documentation of the java.util.stream
package states that:
Side-effects in behavioral parameters to stream operations are, in general, discouraged, as they can often lead to unwitting violations of the statelessness requirement, as well as other thread-safety hazards.
[...]
The ordering of side-effects may be surprising. Even when a pipeline is constrained to produce a result that is consistent with the encounter order of the stream source (for example, IntStream.range(0,5).parallel().map(x -> x*2).toArray()
must produce [0, 2, 4, 6, 8]
), no guarantees are made as to the order in which the mapper function is applied to individual elements, or in what thread any behavioral parameter is executed for a given element.
This means that the elements may be processed out of order, and thus your Stream
-solutions may produce wrong results.
This is (at least for me) a killer argument against your two Stream
-solutions.
By the process of elimination, we have only the "traditional solution" left. And honestly, I do not see anything wrong with this solution. If you want to get rid of the for
-loop, you could re-write this code using a foreach
-loop:
boolean toUpper = false; // 1st String is not capitalized
for (String word : splits) {
stringBuilder.append(toUpper ? word.toUpperCase() : word);
toUpper = !toUpper;
}
For a streamified (and as far as I know) correct solution, take a look at Octavian R.'s answer.
Your question wrt. the "limits of streams" is opinion-based.
The answer to the question(s) ends here. The rest is my opinion and should be regarded as such.
In Octavian R.'s solution, we create an artificial index-set through IntStream
, which is then used to access the String[]
. For me, this has a higher cognitive complexity than a simple for
- or foreach
-loop and I do not see any benefit in using streams instead of loops in this situation.
In Java, comparing with Scala, you must be inventive. One solution without mutation is this one:
String sentence = "Hi, this is just a simple short sentence";
String[] split = sentence.split(" ");
String result = IntStream.range(0, split.length)
.mapToObj(i -> i%2==0 ? split[i].toUpperCase():split[i])
.collect(Collectors.joining(" "));
System.out.println(result);
In Java streams you should avoid the mutation. Your solution with AtomicInteger it's ugly and it's a bad practice.
Kind regards!
As explained in Turing85’s answer, your stream solutions are not correct, as they rely on the processing order, which is not guaranteed. This can lead to incorrect results with parallel execution today, but even if it happens to produce the desired result with a sequential stream, that’s only an implementation detail. It’s not guaranteed to work.
Besides that, there is no advantage in rewriting code to use the Stream API with a logic that basically still is a loop, but obfuscated with a different API. The best way to describe the idea of the new APIs, is to say that you should express what to do but not how.
Starting with Java 9, you could implement the same thing as
String result = Pattern.compile("( ?+[^ ]* )([^ ]*)").matcher(sentence)
.replaceAll(m -> m.group(1)+m.group(2).toUpperCase());
which expresses the wish to replace every second word with its upper case form, but doesn’t express how to do it. That’s up to the library, which likely uses a single StringBuilder
instead of splitting into an array of strings, but that’s irrelevant to the application logic.
As long as you’re using Java 8, I’d stay with the loop and even when switching to a newer Java version, I would consider replacing the loop as not being an urgent change.
The pattern in the above example has been written in a way to do exactly the same as your original code splitting at single space characters. Usually, I’d encode “replace every second word” more like
String result = Pattern.compile("(\\w+\\W+)(\\w+)").matcher(sentence)
.replaceAll(m -> m.group(1)+m.group(2).toUpperCase());
which would behave differently when encountering multiple spaces or other separators, but usually is closer to the actual intention.