Recursive use of Stream.flatMap()

2019-01-14 15:29发布

问题:

Consider the following class:

public class Order {

    private String id;

    private List<Order> orders = new ArrayList<>();

    @Override
    public String toString() {
        return this.id;
    }

    // getters & setters
}

NOTE: It is important to note that I cannot modify this class, because I'm consuming it from an external API.

Also consider the following hierarchy of orders:

Order o1 = new Order();
o1.setId("1");
Order o11 = new Order();
o11.setId("1.1");
Order o111 = new Order();
o111.setId("1.1.1");
List<Order> o11Children = new ArrayList<>(Arrays.asList(o111));
o11.setOrders(o11Children);

Order o12 = new Order();
o12.setId("1.2");
List<Order> o1Children = new ArrayList<>(Arrays.asList(o11, o12));
o1.setOrders(o1Children);

Order o2 = new Order();
o2.setId("2");
Order o21 = new Order();
o21.setId("2.1");
Order o22 = new Order();
o22.setId("2.2");
Order o23 = new Order();
o23.setId("2.3");
List<Order> o2Children = new ArrayList<>(Arrays.asList(o21, o22, o23));
o2.setOrders(o2Children);

List<Order> orders = new ArrayList<>(Arrays.asList(o1, o2));

Which could be visually represented this way:

1
1.1
1.1.1
1.2
2
2.1
2.2
2.3

Now, I want to flatten this hierarchy of orders into a List, so that I get the following:

[1, 1.1, 1.1.1, 1.2, 2, 2.1, 2.2, 2.3]

I've managed to do it by recursively using flatMap() (along with a helper class), as follows:

List<Order> flattened = orders.stream()
    .flatMap(Helper::flatten)
    .collect(Collectors.toList());

This is the helper class:

public final class Helper {

    private Helper() {
    }

    public static Stream<Order> flatten(Order order) {
        return Stream.concat(
            Stream.of(order), 
            order.getOrders().stream().flatMap(Helper::flatten)); // recursion here
    }
}

The following line:

System.out.println(flattened);

Produces the following output:

[1, 1.1, 1.1.1, 1.2, 2, 2.1, 2.2, 2.3]

So far so good. The result is absolutely correct.

However, after reading this question, I had some concerns regarding the usage of flatMap() within a recursive method. Particularly, I wanted to know how the stream was being expanded (if that's the term). So I modified the Helper class and used peek(System.out::println) to check this:

public static final class Helper {

    private Helper() {
    }

    public static Stream<Order> flatten(Order order) {
        return Stream.concat(
            Stream.of(order), 
            order.getOrders().stream().flatMap(Helper::flatten))
        .peek(System.out::println);
    }
}

And the output was:

1
1.1
1.1
1.1.1
1.1.1
1.1.1
1.2
1.2
2
2.1
2.1
2.2
2.2
2.3
2.3

I'm not sure if this is the output that should be printed.

So, I wonder if it's OK to let intermediate streams contain repeated elements. Furthermore, what are the pros and cons of this approach? Is it correct, after all, to use flatMap() this way? Is there a better way to achieve the same?

回答1:

Well, I used the same pattern with a generic Tree class and didn’t have a wrong feeling with it. The only difference is, that the Tree class itself offered a children() and allDescendants() methods, both returning a Stream and the latter building on the former. This is related to “Should I return a Collection or a Stream?” and “Naming java methods that return streams”.

From a Streams perspective, there is no difference between a flatMap to children of a different type (i.e. when traversing a property) and a flatMap to children of the same type. There is also no problem if the returned stream contains the same element again, as there is no relationship between the elements of the streams. In principle, you can use flatMap as a filter operation, using the pattern flatMap(x -> condition? Stream.of(x): Stream.empty()). It’s also possible to use it to duplicate elements like in this answer.



回答2:

There is really no problem with you using flatMap in this way. Each of the intermediate steps in a stream are quite independent (by design) so there's no risk in your recursion. The main thing you need to watch out for is anything that might alter the underlying list while you are streaming it. In your case that does not seem to be a risk.

Ideally you would make this recursion part of the Order class itself:

class Order {
    private final List<Order> subOrders = new ArrayList<>();

    public Stream<Order> streamOrders() {
        return Stream.concat(
            Stream.of(this), 
            subOrders.stream().flatMap(Order::streamOrders));
    }
}

Then you can use orders.stream().flatMap(Order::streamOrders) which seems a bit more natural to me than using a helper class.

As a matter of interest, I tend to use these types of stream methods for allowing use of collection fields rather than a getter for the field. If the user of the method does not need to know anything about the underlying collection or need to be able to change it then returning a stream is convenient and safe.

I will note that there is one risk in your data structure you should be aware of: an order could be part of several other orders and can even be part of itself. This means that it's pretty trivial to cause infinite recursion and a stack overflow:

Order o1 = new Order();
o1.setOrders(Arrays.asList(o1));
o1.streamOrders();

There are lots of good patterns available to avoid these sorts of issues so please ask if you want some help in that area.

You point out that you can't alter the Order class. In that case I suggest you extend it to create your own safer version:

class SafeOrder extends Order {
    public SafeOrder(String id) {
        setId(id);
    }

    public void addOrder(SafeOrder subOrder) {
        getOrders().add(subOrder);
    }

    public Stream<SafeOrder> streamOrders() {
        return Stream.concat(Stream.of(this), subOrders().flatMap(SafeOrder::streamOrders));
    }

    private Stream<SafeOrder> subOrders() {
        return getOrders().stream().map(o -> (SafeOrder)o);
    }
}

This is a fairly safe cast because you expect users to use addOrder. Not foolproof as they could still call getOrders and add an Order rather than a SafeOrder. Again there are patterns to prevent that if you are interested.