Populate a map conditionally using streams - Java

Posted 2019-04-10 02:44

Question:

I'm trying to translate this (simplified) code to use Java-8 streams:

Map<String, String> files = new ConcurrentHashMap<String, String>();

while ((line = reader.readLine()) != null) {
    if (content != null)
        files.put("not null" + line, "not null" + line);
    else
        files.put("its null" + line, "its null" + line);
}
reader.close();

Here is what I've tried:

files = reader.lines().parallel().collect((content != null)?
                (Collectors.toConcurrentMap(line->"notnull"+line, line->line+"notnull")) :                                              
                (Collectors.toConcurrentMap(line->line+"null", line->line+"null")));

But the above gives a "cyclic inference" message for every line->line+"..." lambda in IntelliJ. What is cyclic inference? Is there an error in this logic?

I noticed some similar questions on SO, but they suggest using the interface (Map) instead of its implementations. Here, files is already declared as a Map.

Update: Adding more context: content is a String that holds the name of a directory, and files is a map that holds multiple file paths. Which file paths go into the files map depends on whether the content directory name is populated.

Answer 1:

Another way to fix this is to introduce an intermediate variable for the collector:

Collector<String, ?, ConcurrentMap<String, String>> collector = (content != null) ?
        (Collectors.toConcurrentMap(line->"notnull"+line, line->line+"notnull")) :
        (Collectors.toConcurrentMap(line->line+"null", line->line+"null"));
Map<String, String> files = reader.lines().parallel().collect(collector);       

This solution (unlike the one presented by @JanXMarek) does not allocate intermediate arrays and does not check content for every input line.

Cyclic inference is a situation in type inference where determining the type of an inner subexpression requires the type of the outer subexpression, which in turn cannot be determined without knowing the type of the inner one. Type inference in Java 8 can infer that in the case of Stream<String>.collect(Collectors.toConcurrentMap(line->line+"null", line->line+"null")) the type of the Collector is Collector<String, ?, ConcurrentMap<String, String>>. Normally, when the type of a subexpression (here, the toConcurrentMap(...) call) cannot be determined on its own, it can be deduced from the outer context if that context is a method invocation, a cast or an assignment. Here, however, the outer context is the ?: operator, which has its own complex type inference rules. That becomes too much, so you should help the type inference system by specifying an explicit type somewhere.
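As one more illustration of "specifying an explicit type somewhere", the declared return type of a small helper method can play the same role as the intermediate variable. This is only a sketch: the class name CollectorChoice, the methods collectorFor and load, and the sample input are made up for the example and are not part of the original code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CollectorChoice {

    // Hypothetical helper: the declared return type is the explicit type, so each
    // return statement gives toConcurrentMap(...) a concrete target to infer against.
    static Collector<String, ?, ConcurrentMap<String, String>> collectorFor(String content) {
        if (content != null) {
            return Collectors.toConcurrentMap(line -> "notnull" + line, line -> line + "notnull");
        }
        return Collectors.toConcurrentMap(line -> line + "null", line -> line + "null");
    }

    // The condition on content is evaluated once, before the stream is processed.
    static Map<String, String> load(BufferedReader reader, String content) {
        return reader.lines().parallel().collect(collectorFor(content));
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the real file; "someDir" is a made-up value.
        try (BufferedReader reader = new BufferedReader(new StringReader("a.txt\nb.txt"))) {
            System.out.println(load(reader, "someDir"));
        }
    }
}
```

Note that toConcurrentMap throws an IllegalStateException on duplicate keys, so this assumes every input line is unique.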



Answer 2:

You can do it like this

reader.lines().parallel()
    .map(line -> content != null ?
            new String[]{"notnull"+line, line+"notnull"} :
            new String[]{line+"null", line+"null"})
    .collect(Collectors.toConcurrentMap(pair -> pair[0], pair -> pair[1]));

First, you map the line to a (key,value) pair stored in an array (or in some kind of a Pair object), and then, in the collector, you split it again into a key and a value.
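For the "some kind of a Pair object" variant, a minimal sketch could use the JDK's AbstractMap.SimpleImmutableEntry as the pair. The class name PairMapping and the methods toEntry and load are hypothetical, introduced only for this example. Like the array version, it still checks content once per line.

```java
import java.util.AbstractMap;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PairMapping {

    // Hypothetical helper: builds the (key, value) pair for a single line.
    static Map.Entry<String, String> toEntry(String line, String content) {
        if (content != null) {
            return new AbstractMap.SimpleImmutableEntry<>("notnull" + line, line + "notnull");
        }
        return new AbstractMap.SimpleImmutableEntry<>(line + "null", line + "null");
    }

    // Maps each line to an entry, then splits the entry back into key and value.
    static Map<String, String> load(Stream<String> lines, String content) {
        return lines.parallel()
                .map(line -> toEntry(line, content))
                .collect(Collectors.toConcurrentMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        // Made-up sample lines; in the question they would come from reader.lines().
        System.out.println(load(Stream.of("a.txt", "b.txt"), null));
    }
}
```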



Answer 3:

Just a side note: I doubt that .parallel() does any good in this context. If you are using the standard Java API for reading files, the iterator underneath will still read the file sequentially. The only thing that will be executed in parallel is the transformation of the lines. I just tried it on my PC, out of curiosity, and it was about 10% faster without .parallel().

Parallelisation makes sense if the processing is an order of magnitude slower than reading the input of the stream, which is not the case here.
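If you want to check this on your own machine, a rough sketch along these lines may help. It assumes a synthetic in-memory input instead of a real file, and the class name ParallelCheck and the method load are made up for the example. It times a single unwarmed run of each variant, so the numbers are only indicative. A harness such as JMH would be needed for a trustworthy measurement.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ParallelCheck {

    // Builds the map from the reader's lines, optionally as a parallel stream.
    static Map<String, String> load(BufferedReader reader, boolean parallel) {
        Stream<String> lines = reader.lines();
        if (parallel) {
            lines = lines.parallel();
        }
        return lines.collect(
                Collectors.toConcurrentMap(line -> "notnull" + line, line -> line + "notnull"));
    }

    public static void main(String[] args) throws IOException {
        // Synthetic input with unique lines; a stand-in for a real file.
        String input = IntStream.range(0, 200_000)
                .mapToObj(i -> "line" + i)
                .collect(Collectors.joining("\n"));
        for (boolean parallel : new boolean[] {false, true}) {
            try (BufferedReader reader = new BufferedReader(new StringReader(input))) {
                long start = System.nanoTime();
                Map<String, String> files = load(reader, parallel);
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                System.out.println((parallel ? "parallel" : "sequential") + ": "
                        + files.size() + " entries in " + elapsedMs + " ms");
            }
        }
    }
}
```

Both variants produce the same map. Only the elapsed time differs, and it will vary with hardware, input size and JIT warm-up.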