I have a class like this:
class MultiDataPoint {
    private DateTime timestamp;
    private Map<String, Number> keyToData;
}
and I want to produce, for each MultiDataPoint:
class DataSet {
    public String key;
    List<DataPoint> dataPoints;
}
class DataPoint {
    DateTime timeStamp;
    Number data;
}
Of course, a 'key' can be the same across multiple MultiDataPoints.
So given a List<MultiDataPoint>, how do I use Java 8 streams to convert it to a List<DataSet>?
This is how I am currently doing the conversion without streams:
Collection<DataSet> convertMultiDataPointToDataSet(List<MultiDataPoint> multiDataPoints)
{
    Map<String, DataSet> setMap = new HashMap<>();
    multiDataPoints.forEach(pt -> {
        Map<String, Number> data = pt.getData();
        data.entrySet().forEach(e -> {
            String seriesKey = e.getKey();
            DataSet dataSet = setMap.get(seriesKey);
            if (dataSet == null)
            {
                dataSet = new DataSet(seriesKey);
                setMap.put(seriesKey, dataSet);
            }
            dataSet.dataPoints.add(new DataPoint(pt.getTimestamp(), e.getValue()));
        });
    });
    return setMap.values();
}
It's an interesting question, because it shows that there are a lot of different approaches to achieve the same result. Below I show three different implementations.
Default methods in the Collections Framework: Java 8 added some methods to the collection classes that are not directly related to the Stream API. Using these methods, you can significantly simplify the non-stream implementation:
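A minimal sketch of this idea, using Map.forEach and Map.computeIfAbsent. It assumes java.time.Instant as a stand-in for DateTime, and the constructors, field access, and the class name DefaultMethodsSketch are hypothetical:

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DefaultMethodsSketch {

    // Instant is a stand-in for the question's DateTime type (assumption).
    static class MultiDataPoint {
        final Instant timestamp;
        final Map<String, Number> keyToData;

        MultiDataPoint(Instant timestamp, Map<String, Number> keyToData) {
            this.timestamp = timestamp;
            this.keyToData = keyToData;
        }
    }

    static class DataPoint {
        final Instant timestamp;
        final Number data;

        DataPoint(Instant timestamp, Number data) {
            this.timestamp = timestamp;
            this.data = data;
        }
    }

    static class DataSet {
        final String key;
        final List<DataPoint> dataPoints = new ArrayList<>();

        DataSet(String key) {
            this.key = key;
        }
    }

    static Collection<DataSet> convert(List<MultiDataPoint> multiDataPoints) {
        Map<String, DataSet> result = new HashMap<>();
        multiDataPoints.forEach(pt ->
            pt.keyToData.forEach((key, value) ->
                // computeIfAbsent replaces the explicit null check and put.
                result.computeIfAbsent(key, DataSet::new)
                      .dataPoints.add(new DataPoint(pt.timestamp, value))));
        return result.values();
    }
}
```

Compared with the loop in the question, the null check and the put collapse into a single computeIfAbsent call.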
Stream API with flattening and an intermediate data structure: The following implementation is almost identical to the solution provided by Stuart Marks. In contrast to his solution, the following implementation uses an anonymous inner class as the intermediate data structure.
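A sketch of how this could look, relying on the compiler inferring the anonymous class type through the stream pipeline. Instant again stands in for DateTime, and the constructors and the class name AnonymousTupleSketch are assumptions:

```java
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class AnonymousTupleSketch {

    // Instant is a stand-in for the question's DateTime type (assumption).
    static class MultiDataPoint {
        final Instant timestamp;
        final Map<String, Number> keyToData;

        MultiDataPoint(Instant timestamp, Map<String, Number> keyToData) {
            this.timestamp = timestamp;
            this.keyToData = keyToData;
        }
    }

    static class DataPoint {
        final Instant timestamp;
        final Number data;

        DataPoint(Instant timestamp, Number data) {
            this.timestamp = timestamp;
            this.data = data;
        }
    }

    static class DataSet {
        final String key;
        final List<DataPoint> dataPoints;

        DataSet(String key, List<DataPoint> dataPoints) {
            this.key = key;
            this.dataPoints = dataPoints;
        }
    }

    static List<DataSet> convert(List<MultiDataPoint> multiDataPoints) {
        Map<String, List<DataPoint>> grouped = multiDataPoints.stream()
            // Flatten every entry into an anonymous (key, dataPoint) pair.
            .flatMap(mdp -> mdp.keyToData.entrySet().stream().map(e ->
                new Object() {
                    final String key = e.getKey();
                    final DataPoint dataPoint = new DataPoint(mdp.timestamp, e.getValue());
                }))
            // Group the pairs by key, keeping only the DataPoint of each pair.
            .collect(Collectors.groupingBy(t -> t.key,
                     Collectors.mapping(t -> t.dataPoint, Collectors.toList())));

        return grouped.entrySet().stream()
            .map(e -> new DataSet(e.getKey(), e.getValue()))
            .collect(Collectors.toList());
    }
}
```

The anonymous class saves declaring a named helper type, at the cost of a type that cannot be written down outside the pipeline.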
Stream API with map merging: Instead of flattening the original data structures, you can also create a Map for each MultiDataPoint, and then merge all maps into a single map with a reduce operation. The code is a bit simpler than the above solution:
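A sketch of the map-merging variant. Each MultiDataPoint becomes a map from key to a single-element list, and reduce combines those maps pairwise. The helper names toListMap and merge, the class name MapMergingSketch, the constructors, and the use of Instant instead of DateTime are all assumptions:

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MapMergingSketch {

    // Instant is a stand-in for the question's DateTime type (assumption).
    static class MultiDataPoint {
        final Instant timestamp;
        final Map<String, Number> keyToData;

        MultiDataPoint(Instant timestamp, Map<String, Number> keyToData) {
            this.timestamp = timestamp;
            this.keyToData = keyToData;
        }
    }

    static class DataPoint {
        final Instant timestamp;
        final Number data;

        DataPoint(Instant timestamp, Number data) {
            this.timestamp = timestamp;
            this.data = data;
        }
    }

    static class DataSet {
        final String key;
        final List<DataPoint> dataPoints;

        DataSet(String key, List<DataPoint> dataPoints) {
            this.key = key;
            this.dataPoints = dataPoints;
        }
    }

    // One map per MultiDataPoint: key -> single-element list with its DataPoint.
    static Map<String, List<DataPoint>> toListMap(MultiDataPoint mdp) {
        return mdp.keyToData.entrySet().stream()
            .collect(Collectors.toMap(
                e -> e.getKey(),
                e -> Collections.singletonList(new DataPoint(mdp.timestamp, e.getValue()))));
    }

    // Combines two maps into a new one; neither input is mutated,
    // which keeps the empty map a valid identity for reduce.
    static Map<String, List<DataPoint>> merge(Map<String, List<DataPoint>> lhs,
                                              Map<String, List<DataPoint>> rhs) {
        Map<String, List<DataPoint>> merged = new HashMap<>();
        lhs.forEach((k, v) -> merged.computeIfAbsent(k, x -> new ArrayList<>()).addAll(v));
        rhs.forEach((k, v) -> merged.computeIfAbsent(k, x -> new ArrayList<>()).addAll(v));
        return merged;
    }

    static List<DataSet> convert(List<MultiDataPoint> multiDataPoints) {
        return multiDataPoints.stream()
            .map(MapMergingSketch::toListMap)
            .reduce(new HashMap<>(), MapMergingSketch::merge)
            .entrySet().stream()
            .map(e -> new DataSet(e.getKey(), e.getValue()))
            .collect(Collectors.toList());
    }
}
```

Note that every reduction step copies the accumulated map, so this version trades some efficiency for brevity.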
You can find an implementation of the map merger within the Collectors class. Unfortunately, it is a bit tricky to access it from the outside. The following is an alternative implementation of the map merger:
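A generic sketch of such a merger for maps whose values are lists. The method name mapMerger and the class name MapMergerSketch are assumptions, not part of the JDK:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

class MapMergerSketch {

    // Returns an operator that merges two maps of lists into a fresh map,
    // concatenating the lists of keys that occur in both inputs.
    static <K, V> BinaryOperator<Map<K, List<V>>> mapMerger() {
        return (lhs, rhs) -> {
            Map<K, List<V>> result = new HashMap<>();
            lhs.forEach((key, value) ->
                result.computeIfAbsent(key, k -> new ArrayList<>()).addAll(value));
            rhs.forEach((key, value) ->
                result.computeIfAbsent(key, k -> new ArrayList<>()).addAll(value));
            return result;
        };
    }
}
```

Because the operator never modifies its arguments, it can be passed to Stream.reduce together with an empty map as the identity.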
To do this, I had to come up with an intermediate data structure:
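A minimal sketch of such a structure, holding one (key, timestamp, data) triple. The constructor and getter names are assumptions, and java.time.Instant stands in for DateTime:

```java
import java.time.Instant;

// Intermediate triple produced when a MultiDataPoint is flattened.
// Instant is a stand-in for the question's DateTime type (assumption).
class KeyDataPoint {
    private final String key;
    private final Instant timestamp;
    private final Number data;

    KeyDataPoint(String key, Instant timestamp, Number data) {
        this.key = key;
        this.timestamp = timestamp;
        this.data = data;
    }

    String getKey() {
        return key;
    }

    Instant getTimestamp() {
        return timestamp;
    }

    Number getData() {
        return data;
    }
}
```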
With this in place, the approach is to "flatten" each MultiDataPoint into a list of (timestamp, key, data) triples and stream together all such triples from the list of MultiDataPoint.
Then, we apply a groupingBy operation on the string key in order to gather the data for each key together. Note that a simple groupingBy would result in a map from each string key to a list of the corresponding KeyDataPoint triples. We don't want the triples; we want DataPoint instances, which are (timestamp, data) pairs. To do this we apply a "downstream" collector of the groupingBy which is a mapping operation that constructs a new DataPoint by getting the right values from the KeyDataPoint triple. The downstream collector of the mapping operation is simply toList which collects the DataPoint objects of the same group into a list.
Now we have a Map<String, List<DataPoint>> and we want to convert it to a collection of DataSet objects. We simply stream out the map entries and construct DataSet objects, collect them into a list, and return it.
The code ends up looking like this:
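A sketch of the steps described above (flatten, group by key, map to DataPoint, then build DataSet objects). It assumes Instant in place of DateTime, hypothetical constructors and field access, and a wrapper class named FlattenGroupSketch so that the example is self-contained:

```java
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FlattenGroupSketch {

    // Instant is a stand-in for the question's DateTime type (assumption).
    static class MultiDataPoint {
        final Instant timestamp;
        final Map<String, Number> keyToData;

        MultiDataPoint(Instant timestamp, Map<String, Number> keyToData) {
            this.timestamp = timestamp;
            this.keyToData = keyToData;
        }
    }

    // Intermediate (key, timestamp, data) triple.
    static class KeyDataPoint {
        final String key;
        final Instant timestamp;
        final Number data;

        KeyDataPoint(String key, Instant timestamp, Number data) {
            this.key = key;
            this.timestamp = timestamp;
            this.data = data;
        }
    }

    static class DataPoint {
        final Instant timestamp;
        final Number data;

        DataPoint(Instant timestamp, Number data) {
            this.timestamp = timestamp;
            this.data = data;
        }
    }

    static class DataSet {
        final String key;
        final List<DataPoint> dataPoints;

        DataSet(String key, List<DataPoint> dataPoints) {
            this.key = key;
            this.dataPoints = dataPoints;
        }
    }

    static List<DataSet> convert(List<MultiDataPoint> multiDataPoints) {
        Map<String, List<DataPoint>> resultMap = multiDataPoints.stream()
            // Flatten each MultiDataPoint into its (key, timestamp, data) triples.
            .flatMap(mdp -> mdp.keyToData.entrySet().stream()
                .map(e -> new KeyDataPoint(e.getKey(), mdp.timestamp, e.getValue())))
            // Group by key; the downstream mapping turns triples into DataPoints.
            .collect(Collectors.groupingBy(kdp -> kdp.key,
                     Collectors.mapping(kdp -> new DataPoint(kdp.timestamp, kdp.data),
                                        Collectors.toList())));

        // Turn each map entry into a DataSet.
        return resultMap.entrySet().stream()
            .map(e -> new DataSet(e.getKey(), e.getValue()))
            .collect(Collectors.toList());
    }
}
```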
I took some liberties with constructors and getters, but I think they should be obvious.