Java 8 streams and maps worth it?

Posted 2019-03-17 11:05

Question:

It feels like Java 8 streams and mapping functions are so verbose that they aren't really an improvement. For example, I wrote some code that uses a collection to generate another, modified collection:

private List<DartField> getDartFields(Class<?> model) {
    List<DartField> fields = new ArrayList<>();
    for (Field field : model.getDeclaredFields()) {
        if (!Modifier.isStatic(field.getModifiers())) {
            fields.add(DartField.getDartField(field));
        }
    }
    return fields;
}

This seems like the ideal use case for Java 8 streams and their functions, so I rewrote it like this:

private List<DartField> getDartFields(Class<?> model) {
    return Arrays.asList(model.getDeclaredFields())
            .stream()
            .filter(field -> !Modifier.isStatic(field.getModifiers()))
            .map(field -> DartField.getDartField(field))
            .collect(Collectors.toList());
}

But I'm not sure I like that more. It's 236 characters as compared to 239 in normal-style Java. It doesn't seem more or less readable. It's nice that you don't have to declare an ArrayList, but needing to call .collect(Collectors.toList()) and Arrays.asList (depending on the data type) isn't any better.

Is there some practical improvement to using .stream() like this that I just don't get, or is this just a fun way to throw any coworkers for a loop who don't know functional programming?

I suppose if I were dynamically passing around filter or map lambdas it would be useful, but if you don't need to do that ...

Answer 1:

The problem is that you are not using the Stream API consistently. You are restricting the use case to something which can be best described as “actually not using the Stream API” as you are insisting on returning a Collection. That’s especially absurd as it’s a private method so you are entirely able to adapt the callers as well.

Consider changing the method to

private Stream<DartField> getDartFields(Class<?> model) {
    return Stream.of(model.getDeclaredFields())
            .filter(field -> !Modifier.isStatic(field.getModifiers()))
            .map(field -> DartField.getDartField(field));
}

and look at what the caller(s) actually want to do. Usually they don’t need a Collection as an end in itself, but want to perform an action or even more operations which could be chained, e.g. print them:

getDartFields(Foo.class).forEach(System.out::println);

The most interesting feature is the lazy nature of the stream: when getDartFields returns, no action has been performed yet, and if you use operations like findFirst, there is no need to process all elements. You’ll lose this feature if you return a Collection containing all elements.

This also applies to multi-step processing where processing ordinary lists implies that for each step a new list has to be created and populated with results.
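The laziness can be observed directly. The following is only a sketch: the class name, the upperCased helper and the visited list are made up for illustration, and plain strings stand in for the DartField objects of the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

class LazyStreamSketch {
    // Records every element the map stage actually touches (illustration only).
    static final List<String> visited = new ArrayList<>();

    // Returns a lazy pipeline; nothing runs until a terminal operation is invoked.
    static Stream<String> upperCased(String... names) {
        return Stream.of(names)
                .filter(name -> !name.isEmpty())
                .map(name -> {
                    visited.add(name);
                    return name.toUpperCase();
                });
    }

    public static void main(String[] args) {
        Stream<String> pipeline = upperCased("id", "", "name", "owner");
        System.out.println(visited.size());          // 0: nothing processed yet

        Optional<String> first = pipeline.findFirst();
        System.out.println(first.get());             // ID
        System.out.println(visited);                 // [id]: later elements were never mapped
    }
}
```

Had the method returned a List, all three non-empty names would have been mapped before the caller could ask for the first one.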



Answer 2:

You can write it differently (not necessarily better)

private List<DartField> getDartFields(Class<?> model) {
    return Stream.of(model.getDeclaredFields())
            .filter(field -> !Modifier.isStatic(field.getModifiers()))
            .map(DartField::getDartField)
            .collect(Collectors.toList());
}

Using static imports this looks like

private static List<DartField> getDartFields(Class<?> model) {
    return of(model.getDeclaredFields())
            .filter(field -> !isStatic(field.getModifiers()))
            .map(DartField::getDartField)
            .collect(toList());
}

"It doesn't seem more or less readable."

This is often the case IMHO. However, I would say that in >10% of cases it is significantly better. Like any new feature, you will probably overuse it to start with, until you get familiar with it and settle on using it as much as you feel comfortable with.

"Is there some practical improvement to using .stream() like this that I just don't get, or is this just a fun way to throw any coworkers for a loop who don't know functional programming?"

I suspect both. If you don't know functional programming, it tends to be read-only code, i.e. you can still understand what it does; the problem comes when you have to maintain it.

IMHO, it is worth encouraging developers to learn functional programming as it has some very useful ideas about how to structure your code and you would benefit from it even if you didn't use FP syntax.

Where the Streams API is useful is in constructs you previously wouldn't have bothered implementing.

E.g. say you want to index the fields by name.

private static Map<String, List<DartField>> getDartFields(Class<?> model) {
    return of(model.getDeclaredFields())
            .filter(field -> !isStatic(field.getModifiers()))
            .map(DartField::getDartField)
            // groupingBy collects the values for each key into a List
            .collect(groupingBy(f -> f.getName()));
}

In the past you might have used a List instead of a Map, but because assembling a Map is now easier, you might more often use the data structure you really should be using.
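One detail worth knowing here: Collectors.groupingBy produces a Map from key to a List of values, while Collectors.toMap produces one value per key (and throws IllegalStateException on a duplicate key). A minimal sketch of the one-value-per-key index, using java.lang.reflect.Field directly because DartField is not shown in the question; the Person class and the fieldsByName name are hypothetical:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FieldIndexSketch {
    // Hypothetical model class, used only for this illustration.
    static class Person {
        static int instances;
        String name;
        int age;
    }

    // One entry per non-static field, keyed by the field's name.
    static Map<String, Field> fieldsByName(Class<?> model) {
        return Stream.of(model.getDeclaredFields())
                .filter(field -> !Modifier.isStatic(field.getModifiers()))
                .collect(Collectors.toMap(Field::getName, Function.identity()));
    }

    public static void main(String[] args) {
        Map<String, Field> index = fieldsByName(Person.class);
        System.out.println(index.keySet());
        System.out.println(index.get("age").getType());
    }
}
```

Field names are unique within a class, so the duplicate-key case cannot arise in this particular sketch.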

Now let's see if it would be faster if we used more threads.

private static Map<String, List<DartField>> getDartFields(Class<?> model) {
    return of(model.getDeclaredFields()).parallel()
            .filter(field -> !isStatic(field.getModifiers()))
            .map(DartField::getDartField)
            // groupingByConcurrent returns a ConcurrentMap of key to List
            .collect(groupingByConcurrent(f -> f.getName()));
}

See how hard that was? And changing it back, when you find it probably does more harm than good, is pretty easy too.
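To make the "easy to switch" point concrete, here is a small sketch that toggles between the sequential and the parallel collector. It groups plain strings by length rather than DartField objects, and the class and method names are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ParallelGroupingSketch {
    // Same grouping, sequential or parallel, selected by a flag.
    static Map<Integer, List<String>> byLength(List<String> words, boolean parallel) {
        if (parallel) {
            return words.parallelStream()
                    .collect(Collectors.groupingByConcurrent(String::length));
        }
        return words.stream()
                .collect(Collectors.groupingBy(String::length));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("id", "of", "map", "name");
        System.out.println(byLength(words, false).get(2)); // [id, of]
        // In the parallel variant the order inside each list is not guaranteed.
        System.out.println(byLength(words, true).keySet());
    }
}
```

For a handful of elements the parallel variant is very unlikely to be faster; it is shown only to illustrate how little code changes.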



Answer 3:

Java 8 streams are particularly verbose, mostly due to converting to a stream and then back to another structure. In FunctionalJava, the equivalent is:

private List<DartField> getDartFields(Class<?> model) {
    return List.list(model.getDeclaredFields())
        .filter(field -> !Modifier.isStatic(field.getModifiers()))
        .map(field -> DartField.getDartField(field))
        .toJavaList();
}

I warn against just counting characters as a measure of complexity. This barely matters.

Functional programming allows you to reason about your code using a simple substitution model, rather than having to trace through your entire program. This makes your program more predictable and easier to reason about, because you need fewer pieces of information in your head at once.

I also warn against returning streams. Streams are not arbitrarily composable: they are mutable data, and callers have no way of knowing whether a terminal operation has already been called on the stream. This means we need to know the state of the program to reason about what is happening. Streams were introduced to help eliminate mutable state, but are implemented using mutable state, which is far from ideal.
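The single-use behaviour is easy to demonstrate with the standard java.util.stream API. This is a sketch with made-up names; handing out a Supplier of streams is one common workaround, not the only one.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Stream;

class SingleUseStreamSketch {
    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondUseFails(Stream<String> stream) {
        stream.count();                 // first terminal operation consumes the stream
        try {
            stream.count();             // reusing the stream is illegal
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("id", "name", "owner");
        System.out.println(secondUseFails(names.stream()));     // true

        // A Supplier hands out a fresh stream for every use.
        Supplier<Stream<String>> fresh = names::stream;
        System.out.println(fresh.get().count());                // 3
        System.out.println(fresh.get().anyMatch("id"::equals)); // true
    }
}
```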

If you want an immutable stream, I recommend Functional Java's stream, https://functionaljava.ci.cloudbees.com/job/master/javadoc/fj/data/Stream.html.



Answer 4:

If you specifically constrain your use case to just what you have posted, then a Stream-based idiom is not substantially better. However, if you are interested to find out where the Streams API is a true benefit, here are some points:

  • the Stream-based idiom can be parallelized with virtually no effort on your part (this is actually the strongest reason why Java got lambdas in the first place);
  • Streams are composable: you can pass them around and add pipeline stages. This can greatly benefit code reuse;
  • as you already noted, you can also pass around lambdas: it is easy to write template methods where you plug in just one aspect of processing;
  • once you are comfortable with the idiom, FP code is actually more readable as it is more closely related to the what instead of the how. This advantage increases with the complexity of the processing logic.
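The points about composability and passing lambdas around can be sketched as follows. The select method is a hypothetical "template": the caller plugs in which elements to keep and how to convert them, then adds further stages to the returned stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PipelineTemplateSketch {
    // Template: filtering and conversion are supplied by the caller.
    static <T, R> Stream<R> select(List<T> source,
                                   Predicate<? super T> keep,
                                   Function<? super T, ? extends R> convert) {
        return source.stream().filter(keep).map(convert);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("map", "filter", "collect", "of");

        // Reuse the template, then append another stage to the returned stream.
        List<Integer> lengths = select(words, w -> w.length() > 2, String::length)
                .sorted()
                .collect(Collectors.toList());
        System.out.println(lengths); // [3, 6, 7]
    }
}
```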

I would additionally note that the difference in readability is more a historical artifact than intrinsic to the idioms: if developers were taught FP from the start and worked with it day-to-day, then it would be the imperative idiom which was odd and hard to follow.



Answer 5:

Not worth it if you insist on getting back to collections. However, you have missed an opportunity - consider the following and you should see where using streams adds a level of flexibility and composability to your code:

// Matches instance (non-static) fields, as in the original loop.
private static final Predicate<Field> isNotStatic
        = field -> !Modifier.isStatic(field.getModifiers());

private Stream<Field> getDeclaredFields(Class<?> model) {
    return Stream.of(model.getDeclaredFields());
}

private Stream<Field> getNonStaticFields(Class<?> model) {
    return getDeclaredFields(model).filter(isNotStatic);
}

private Stream<DartField> getDartFields(Class<?> model) {
    return getNonStaticFields(model)
            .map(field -> DartField.getDartField(field));
}

The point is that you can use streams as collections instead of mechanisms to build new collections.

By allowing all of the natural methods to just fall out of the algorithm you end up with patently obvious code that is almost inevitably reusable and each component naturally does its one thing.
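As a rough, self-contained illustration of that reuse, the sketch below works on java.lang.reflect.Field only (DartField is not shown in the question) and uses a hypothetical Account class. Each caller finishes the same pipeline with a different terminal operation, and no intermediate list is built.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FieldStreamsSketch {
    // Hypothetical model class, used only for this illustration.
    static class Account {
        static final String TABLE = "accounts";
        long id;
        String owner;
    }

    static Stream<Field> declaredFields(Class<?> model) {
        return Stream.of(model.getDeclaredFields());
    }

    static Stream<Field> instanceFields(Class<?> model) {
        return declaredFields(model)
                .filter(field -> !Modifier.isStatic(field.getModifiers()));
    }

    public static void main(String[] args) {
        long count = instanceFields(Account.class).count();
        boolean hasId = instanceFields(Account.class)
                .anyMatch(field -> field.getName().equals("id"));
        List<String> names = instanceFields(Account.class)
                .map(Field::getName)
                .sorted()
                .collect(Collectors.toList());
        System.out.println(count + " " + hasId + " " + names); // 2 true [id, owner]
    }
}
```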



Answer 6:

With Java 8 the team took an object-oriented programming language and applied "objectification" to produce functional object-oriented programming (lol... FOOP). It will take some time to get used to this, but I argue that any and all hierarchical object manipulation should remain in its functional state. From this perspective Java feels like it bridges the PHP gap: allow the data to exist in its natural state, and mold it into the application GUI.

This is the true philosophy behind API creation from a software engineering perspective.



Answer 7:

Here is a shorter solution using StreamEx

StreamEx.of(model.getDeclaredFields())
        .filter(field -> !Modifier.isStatic(field.getModifiers()))
        .map(DartField::getDartField)
        .toList();

I think it's shorter and simpler compared to the original for loop.

List<DartField> fields = new ArrayList<>();
for (Field field : model.getDeclaredFields()) {
    if (!Modifier.isStatic(field.getModifiers())) {
        fields.add(DartField.getDartField(field));
    }
}
return fields;

More importantly, it is more flexible. Just think about what happens if you want to add more filter/map steps, or sort/limit/groupBy/...: you just add more stream API calls and the code stays concise, whereas nested for loops and if/else blocks become more and more complicated.
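For instance, with the plain JDK stream API (not StreamEx), extra steps such as distinct, sorted and limit are each one more line in the chain. The method below is an invented example on strings, not code from the question.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class ExtendedPipelineSketch {
    // Keeps public-looking names, normalizes them, and returns the shortest few.
    static List<String> shortestNames(List<String> names, int limit) {
        return names.stream()
                .filter(name -> !name.startsWith("_"))
                .map(String::toLowerCase)
                .distinct()
                .sorted(Comparator.comparingInt(String::length)
                        .thenComparing(Comparator.<String>naturalOrder()))
                .limit(limit)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Owner", "id", "_tmp", "ID", "name");
        System.out.println(shortestNames(names, 2)); // [id, name]
    }
}
```

Writing the same logic with loops would need a set for the duplicates, an explicit sort and a counter for the limit.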



Answer 8:

From my perspective, the Java stream APIs (map, filter, forEach, groupBy...) actually facilitate data handling in daily development. Instead of getting your hands dirty, you just tell the stream APIs what you want, not how to do it.

However, I'm not comfortable reading Java code that is full of chained stream API calls. Sometimes the formatting and layout of code that uses stream APIs look very weird, especially in combination with functional programming. In short, it degrades readability.