Java generics - implementing higher order function

Posted 2019-03-27 05:37

Question:

I decided to write some common higher-order functions in Java (map, filter, reduce, etc.) that are type-safe via generics, and I'm having problems with wildcard matching in one particular function.

Just to be complete, the functor interface is this:

/**
 * The interface containing the method used to map a sequence into another.
 * @param <S> The type of the elements in the source sequence.
 * @param <R> The type of the elements in the destination sequence.
 */
public interface Transformation<S, R> {

    /**
     * The method that will be used in map.
     * @param sourceObject An element from the source sequence.
     * @return The element in the destination sequence.
     */
    public R apply(S sourceObject);
}

The troubling function is like a map, but instead of transforming a Collection it transforms a Map (at first I thought it should be called mapMap, but it sounded so stupid that I ended up calling it remapEntries).

My first version was (and take a seat, because the signature is quite a monster):

    /**
     * <p>
     * Fills a map with the results of applying a mapping function to
     * a source map.
     * </p>
     * Considerations:
     * <ul>
     * <li>The result map must be non-null, and it is the same object that is returned
     * (to allow passing an unnamed new Map as argument).</li>
     * <li>If the result map already contained some elements, those won't
     * be cleared first.</li>
     * <li>If various elements have the same key, only the last entry given the
     * source iteration order will be present in the resulting map (it will
     * overwrite the previous ones).</li>
     * </ul>
     *
     * @param <SK> Type of the source keys.
     * @param <SV> Type of the source values.
     * @param <RK> Type of the result keys.
     * @param <RV> Type of the result values.
     * @param <MapRes> Type of the result map.
     * @param f The object that will be used to remapEntries.
     * @param source The map with the source entries.
     * @param result The map where the resulting entries will be put.
     * @return the result map, containing the transformed entries.
     */
    public static <SK, SV, RK, RV, MapRes extends Map<RK, RV>> MapRes remapEntries(final Transformation<Map.Entry<SK, SV>, Map.Entry<RK,RV>> f, final Map<SK, SV> source, MapRes result) {
        for (Map.Entry<SK, SV> entry : source.entrySet()) {
            Map.Entry<RK, RV> res = f.apply(entry);
            result.put(res.getKey(), res.getValue());
        }
        return result;
    }

And it seems to be quite correct, but the problem is that the transformation used must match the type parameters exactly, making it difficult to reuse map functions for types that are compatible. So I decided to add wildcards to the signature, and it ended up like this:

public static <SK, SV, RK, RV, MapRes extends Map<RK, RV>> MapRes remapEntries(final Transformation<? super Map.Entry<? super SK, ? super SV>, ? extends Map.Entry<? extends RK, ? extends RV>> f, final Map<SK, SV> source, MapRes result) {
    for (Map.Entry<SK, SV> entry : source.entrySet()) {
        Map.Entry<? extends RK, ? extends RV> res = f.apply(entry);
        result.put(res.getKey(), res.getValue());
    }
    return result;
}

But when I try to test it, wildcard matching fails:

@Test
public void testRemapEntries() {
    Map<String, Integer> things = new HashMap<String, Integer>();
    things.put("1", 1);
    things.put("2", 2);
    things.put("3", 3);

    Transformation<Map.Entry<String, Number>, Map.Entry<Integer, String>> swap = new Transformation<Entry<String, Number>, Entry<Integer, String>>() {
        public Entry<Integer, String> apply(Entry<String, Number> sourceObject) {
            return new Pair<Integer, String>(sourceObject.getValue().intValue(), sourceObject.getKey()); //this is just a default implementation of a Map.Entry
        }
    };

    Map<Integer, String> expected = new HashMap<Integer, String>();
    expected.put(1, "1");
    expected.put(2, "2");
    expected.put(3, "3");

    Map<Integer, String> result = IterUtil.remapEntries(swap, things, new HashMap<Integer, String>());
    assertEquals(expected, result);
}

The error is:

method remapEntries in class IterUtil cannot be applied to given types
  required: Transformation<? super java.util.Map.Entry<? super SK,? super SV>,? extends java.util.Map.Entry<? extends RK,? extends RV>>,java.util.Map<SK,SV>,MapRes
  found: Transformation<java.util.Map.Entry<java.lang.String,java.lang.Number>,java.util.Map.Entry<java.lang.Integer,java.lang.String>>,java.util.Map<java.lang.String,java.lang.Integer>,java.util.HashMap<java.lang.Integer,java.lang.String>

So, any hints on how to fix this? Or should I give up and write explicit loops for this? ^_^

Answer 1:

I think you should take a look at the Google Guava API.

There you can find a Function interface similar to your Transformation one. There is also a class Maps with utility methods to create or transform map instances.

You should also keep PECS (producer extends, consumer super) in mind when writing generic methods.
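As a rough illustration of PECS applied to the question's Transformation interface, here is a minimal sketch of a plain collection map. The class name PecsSketch and the helper map are hypothetical; they come from neither Guava nor the question's IterUtil.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Same shape as the Transformation interface from the question.
interface Transformation<S, R> {
    R apply(S sourceObject);
}

final class PecsSketch {
    // Hypothetical helper. The source produces S values (? extends S),
    // the function consumes S (? super S) and produces R (? extends R),
    // and the result collection consumes R (? super R).
    public static <S, R, C extends Collection<? super R>> C map(
            Transformation<? super S, ? extends R> f,
            Iterable<? extends S> source,
            C result) {
        for (S element : source) {
            result.add(f.apply(element));
        }
        return result;
    }

    public static List<CharSequence> demo() {
        // Written against Object, yet reusable for a List<Integer>.
        Transformation<Object, String> describe = new Transformation<Object, String>() {
            @Override
            public String apply(Object o) {
                return "<" + o + ">";
            }
        };
        List<Integer> numbers = Arrays.asList(1, 2, 3);
        // The String results are collected into a wider List<CharSequence>.
        return map(describe, numbers, new ArrayList<CharSequence>());
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [<1>, <2>, <3>]
    }
}
```

This works for a flat collection because the wildcards sit at the top level of each parameter type. The question's difficulty comes from wildcards nested inside Map.Entry, which are not inferred as flexibly.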



Answer 2:

This is a difficult one. The following knowledge is totally useless and nobody should care to possess it:

The first thing to fix is the type of swap. Its input type should not be Entry<String, Number>, because then it cannot accept an Entry<String, Integer>, which is not a subtype of Entry<String, Number>. However, Entry<String, Integer> is a subtype of Entry<? extends String, ? extends Number>, so our transformer should take that as its input. For the output, no wildcard is needed, because the transformer can only instantiate a concrete type anyway. We just want to be honest and accurate about what can be consumed and what will be produced:

    Transformation<
            Entry<? extends String, ? extends Number>,
            Entry<Integer, String>
        > swap
        = new Transformation<
            Entry<? extends String, ? extends Number>,
            Entry<Integer, String>>()
    {
        public Entry<Integer, String> apply(
            Entry<? extends String, ? extends Number> sourceObject) 
        {
            return new Pair<Integer, String>(
                sourceObject.getValue().intValue(), 
                sourceObject.getKey()
            );
        }
    };

Note that String is final, so nothing can extend it, but I'm afraid the generics system isn't smart enough to know that. So, as a matter of principle, I wrote ? extends String anyway, which will be useful later.
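The subtyping claim above can be checked with a small sketch. The class and method names (EntryVarianceSketch, exact, flexible) are made up for illustration, and AbstractMap.SimpleEntry stands in for the Pair class from the question.

```java
import java.util.AbstractMap;
import java.util.Map;

final class EntryVarianceSketch {
    // Accepts exactly Map.Entry<String, Number> and nothing else.
    public static int exact(Map.Entry<String, Number> e) {
        return e.getValue().intValue();
    }

    // Accepts any entry whose key is a String and whose value is some kind of Number.
    public static int flexible(Map.Entry<? extends String, ? extends Number> e) {
        return e.getValue().intValue();
    }

    public static void main(String[] args) {
        Map.Entry<String, Integer> entry =
                new AbstractMap.SimpleEntry<String, Integer>("one", 1);
        Map.Entry<String, Number> sameTypes =
                new AbstractMap.SimpleEntry<String, Number>("two", 2);

        // exact(entry);  // does not compile: Entry<String, Integer> is not an Entry<String, Number>
        System.out.println(exact(sameTypes)); // prints 2
        System.out.println(flexible(entry));  // prints 1
    }
}
```

The compiler rejects the commented-out call because exact could otherwise call e.setValue(1.5) on an entry that is only allowed to hold Integer values.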

Then, let's think about remapEntries(). We suspect that most transformers passed to it will have a type declaration similar to that of swap, for the reasons laid out above. So we had better have

remapEntries(
    Transformation<
        Entry<? extends SK, ? extends SV>,
        Entry<RK,RV>
        > f,
    ...

to properly match that argument. From there, we work out the types of source and result, which we want to be as general as possible:

public static <SK, SV, RK, RV, RM extends Map<? super RK, ? super RV>>
RM remapEntries(
    Transformation<
        Entry<? extends SK, ? extends SV>,
        Entry<RK,RV>
        > f,
    Map<? extends SK, ? extends SV> source,
    RM result
)
{
    for(Entry<? extends SK, ? extends SV> entry : source.entrySet()) {
        Entry<RK,RV> res = f.apply(entry);
        result.put(res.getKey(), res.getValue());
    }
    return result;
}
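For reference, the signature above can be put together with the wildcard version of swap into one compilable file. This is only a sketch: the class name RemapDemo is hypothetical, and AbstractMap.SimpleEntry is used in place of the question's Pair class, whose source is not shown.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Map.Entry;

interface Transformation<S, R> {
    R apply(S sourceObject);
}

final class RemapDemo {
    // Same signature and body as in the answer above.
    public static <SK, SV, RK, RV, RM extends Map<? super RK, ? super RV>> RM remapEntries(
            Transformation<Entry<? extends SK, ? extends SV>, Entry<RK, RV>> f,
            Map<? extends SK, ? extends SV> source,
            RM result) {
        for (Entry<? extends SK, ? extends SV> entry : source.entrySet()) {
            Entry<RK, RV> res = f.apply(entry);
            result.put(res.getKey(), res.getValue());
        }
        return result;
    }

    public static Map<Integer, String> demo() {
        Map<String, Integer> things = new HashMap<String, Integer>();
        things.put("1", 1);
        things.put("2", 2);
        things.put("3", 3);

        Transformation<Entry<? extends String, ? extends Number>, Entry<Integer, String>> swap =
                new Transformation<Entry<? extends String, ? extends Number>, Entry<Integer, String>>() {
                    @Override
                    public Entry<Integer, String> apply(
                            Entry<? extends String, ? extends Number> sourceObject) {
                        // SimpleEntry is a stand-in for the question's Pair class.
                        return new AbstractMap.SimpleEntry<Integer, String>(
                                sourceObject.getValue().intValue(),
                                sourceObject.getKey());
                    }
                };

        // SK = String and SV = Number are taken from swap; Map<String, Integer>
        // then fits the parameter type Map<? extends String, ? extends Number>.
        return remapEntries(swap, things, new HashMap<Integer, String>());
    }

    public static void main(String[] args) {
        Map<Integer, String> expected = new HashMap<Integer, String>();
        expected.put(1, "1");
        expected.put(2, "2");
        expected.put(3, "3");
        System.out.println(expected.equals(demo())); // prints true
    }
}
```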

RM isn't necessary; it's fine to use Map<? super RK, ? super RV> directly. But it seems that you want the return type to be identical to the type of result in the caller's context. I would simply have made the return type void; there is enough trouble already.

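This will fail if swap does not use ? extends. For example, if the input type is Entry<String, Integer>, it's ridiculous to add ? extends to both type arguments. But you can provide a second method with a different parameter type declaration to match this case (under a different name, since two overloads that differ only in their type arguments have the same erasure and will not compile side by side).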

OK, that worked, out of sheer luck. But it is totally not worth it. Your life is much better if you just forget about it, use raw types, document the parameters in English, and do the type checks at runtime. Ask yourself, does the generic version buy you anything? Very little, at the huge price of rendering your code completely incomprehensible. Nobody, including yourself and myself, could make sense of it if we read the method signature tomorrow morning. It is much, much worse than regex.



Answer 3:

Something just popped into my head: if the wildcards in nested generic parameters are not captured, because they are literally part of the type, then I could put the reverse bounds on the maps instead of using them in the Transformation.

public static <SK, SV, RK, RV, MapRes extends Map<? super RK, ? super RV>>
  MapRes remapEntries(final Transformation<Map.Entry<SK, SV>,
                                           Map.Entry<RK, RV>> f, 
                      final Map<? extends SK, ? extends SV> source,
                      MapRes result) {
    for (Map.Entry<? extends SK, ? extends SV> entry : source.entrySet()) {
        Map.Entry<? extends RK, ? extends RV> res = f.apply((Map.Entry<SK, SV>)entry);
        result.put(res.getKey(), res.getValue());
    }
    return result;
}

The only problem is that we have to do an unchecked cast in the call to Transformation.apply. It would be totally safe if the Map.Entry interface were read-only, so we can just cross our fingers and hope that the transformation does not try to call Map.Entry.setValue.

To ensure at least runtime type safety, we could instead pass an immutable wrapper around the Map.Entry that throws an exception if setValue is called.
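A minimal sketch of that wrapper idea follows. The names ReadOnlyEntry and ReadOnlyRemapSketch are hypothetical, and AbstractMap.SimpleEntry again stands in for Pair. Because the wrapper is itself declared as a Map.Entry<K, V> over a wildcard-typed delegate, the unchecked cast is no longer needed in this version.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;

interface Transformation<S, R> {
    R apply(S sourceObject);
}

// Hypothetical read-only view of an entry; equals/hashCode are omitted for brevity.
final class ReadOnlyEntry<K, V> implements Map.Entry<K, V> {
    private final Map.Entry<? extends K, ? extends V> delegate;

    ReadOnlyEntry(Map.Entry<? extends K, ? extends V> delegate) {
        this.delegate = delegate;
    }

    @Override public K getKey() { return delegate.getKey(); }
    @Override public V getValue() { return delegate.getValue(); }

    @Override public V setValue(V value) {
        // Writing through the wider view is what would break type safety.
        throw new UnsupportedOperationException("read-only entry");
    }
}

final class ReadOnlyRemapSketch {
    public static <SK, SV, RK, RV, RM extends Map<? super RK, ? super RV>> RM remapEntries(
            Transformation<Map.Entry<SK, SV>, Map.Entry<RK, RV>> f,
            Map<? extends SK, ? extends SV> source,
            RM result) {
        for (Map.Entry<? extends SK, ? extends SV> entry : source.entrySet()) {
            // The wrapper is a genuine Map.Entry<SK, SV>, so no cast is required.
            Map.Entry<RK, RV> res = f.apply(new ReadOnlyEntry<SK, SV>(entry));
            result.put(res.getKey(), res.getValue());
        }
        return result;
    }

    public static Map<Integer, String> demo() {
        Map<String, Integer> things = new HashMap<String, Integer>();
        things.put("1", 1);
        things.put("2", 2);
        things.put("3", 3);

        // The transformation is declared for Entry<String, Number>, as in the question's test.
        Transformation<Map.Entry<String, Number>, Map.Entry<Integer, String>> swap =
                new Transformation<Map.Entry<String, Number>, Map.Entry<Integer, String>>() {
                    @Override
                    public Map.Entry<Integer, String> apply(Map.Entry<String, Number> e) {
                        return new AbstractMap.SimpleEntry<Integer, String>(
                                e.getValue().intValue(), e.getKey());
                    }
                };
        return remapEntries(swap, things, new HashMap<Integer, String>());
    }

    public static void main(String[] args) {
        System.out.println(demo().get(2)); // prints 2
    }
}
```

The cost is one small allocation per entry, and a transformation that does call setValue fails at runtime rather than at compile time.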

Or just define an explicit immutable entry interface and use it, but that's a little bit like cheating (like having two different Transformations):

public interface ImmutableEntry<K, V> {
    public K getKey();
    public V getValue();
}

public static <SK, SV, RK, RV, RM extends Map<? super RK, ? super RV>> RM remapEntries(final Transformation<ImmutableEntry<SK, SV>, Map.Entry<RK, RV>> f,
        final Map<? extends SK, ? extends SV> source,
        RM result) {
    for (final Map.Entry<? extends SK, ? extends SV> entry : source.entrySet()) {
        Map.Entry<? extends RK, ? extends RV> res = f.apply(new ImmutableEntry<SK, SV>() {
            public SK getKey() {return entry.getKey();}
            public SV getValue() {return entry.getValue();}
        });
        result.put(res.getKey(), res.getValue());
    }
    return result;
}