Nifty way to iterate over parallel arrays in Java

2020-02-07 17:42发布

I've inherited a bunch of code that makes extensive use of parallel arrays to store key/value pairs. It actually made sense to do it this way, but it's sort of awkward to write loops that iterate over these values. I really like the new Java foreach construct, but it does not seem like there is a way to iterate over parallel lists using this.

With a normal for loop, I can do this easily:

for (int i = 0; i < list1.length; ++i) {
    doStuff(list1[i]);
    doStuff(list2[i]);
}

But in my opinion this is not semantically pure, since we are not checking the bounds of list2 during iteration. Is there some clever syntax similar to the for-each that I can use with parallel lists?

8条回答
Anthone
2楼-- · 2020-02-07 18:02
//Do you think I'm sexy?
if(list1.length == list2.length){
    for (int i = 0; i < list1.length; ++i) {
        doStuff(list1[i]);
        doStuff(list2[i]);
    }
}
查看更多
我只想做你的唯一
3楼-- · 2020-02-07 18:06

From the official Oracle page on the enhanced for loop:

Finally, it is not usable for loops that must iterate over multiple collections in parallel. These shortcomings were known by the designers, who made a conscious decision to go with a clean, simple construct that would cover the great majority of cases.

Basically, you're best off using the normal for loop.

If you're using these pairs of arrays to simulate a Map, you could always write a class that implements the Map interface with the two arrays; this could let you abstract away much of the looping.

Without looking at your code, I cannot tell you whether this option is the best way forward, but it is something you could consider.

查看更多
男人必须洒脱
4楼-- · 2020-02-07 18:08

With Java 8, I use these to loop in the sexy way:

//parallel loop
public static <A, B> void loop(Collection<A> a, Collection<B> b, IntPredicate intPredicate, BiConsumer<A, B> biConsumer) {
    Iterator<A> ait = a.iterator();
    Iterator<B> bit = b.iterator();
    if (ait.hasNext() && bit.hasNext()) {
        for (int i = 0; intPredicate.test(i); i++) {
            if (!ait.hasNext()) {
                ait = a.iterator();
            }
            if (!bit.hasNext()) {
                bit = b.iterator();
            }
            biConsumer.accept(ait.next(), bit.next());
        }
    }
}

//nest loop
public static <A, B> void loopNest(Collection<A> a, Collection<B> b, BiConsumer<A, B> biConsumer) {
    for (A ai : a) {
        for (B bi : b) {
            biConsumer.accept(ai, bi);
        }
    }
}

Some example, with these 2 lists:

List<Integer> a = Arrays.asList(1, 2, 3);
List<String> b = Arrays.asList("a", "b", "c", "d");

Loop within min size of a and b:

loop(a, b, i -> i < Math.min(a.size(), b.size()), (x, y) -> {
    System.out.println(x +  " -> " + y);
});

Output:

1 -> a
2 -> b
3 -> c

Loop within max size of a and b (elements in shorter list will be cycled):

loop(a, b, i -> i < Math.max(a.size(), b.size()), (x, y) -> {
    System.out.println(x +  " -> " + y);
});

Output:

1 -> a
2 -> b
3 -> c
1 -> d

Loop n times ((elements will be cycled if n is bigger than sizes of lists)):

loop(a, b, i -> i < 5, (x, y) -> {
    System.out.println(x +  " -> " + y);
});

Output:

1 -> a
2 -> b
3 -> c
1 -> d
2 -> a

Loop forever:

loop(a, b, i -> true, (x, y) -> {
    System.out.println(x +  " -> " + y);
});

Apply to your situation:

loop(list1, list2, i -> i < Math.min(a.size(), b.size()), (e1, e2) -> {
    doStuff(e1);
    doStuff(e2);
});
查看更多
我想做一个坏孩纸
5楼-- · 2020-02-07 18:12

This was a fun exercise. I created an object called ParallelList that takes a variable number of typed lists, and can iterate over the values at each index (returned as a list of values):

public class ParallelList<T> implements Iterable<List<T>> {

    private final List<List<T>> lists;

    public ParallelList(List<T>... lists) {
        this.lists = new ArrayList<List<T>>(lists.length);
        this.lists.addAll(Arrays.asList(lists));
    }

    public Iterator<List<T>> iterator() {
        return new Iterator<List<T>>() {
            private int loc = 0;

            public boolean hasNext() {
                boolean hasNext = false;
                for (List<T> list : lists) {
                    hasNext |= (loc < list.size());
                }
                return hasNext;
            }

            public List<T> next() {
                List<T> vals = new ArrayList<T>(lists.size());
                for (int i=0; i<lists.size(); i++) {
                    vals.add(loc < lists.get(i).size() ? lists.get(i).get(loc) : null);
                }
                loc++;
                return vals;
            }

            public void remove() {
                for (List<T> list : lists) {
                    if (loc < list.size()) {
                        list.remove(loc);
                    }
                }
            }
        };
    }
}

Example usage:

List<Integer> list1 = Arrays.asList(new Integer[] {1, 2, 3, 4, 5});
List<Integer> list2 = Arrays.asList(new Integer[] {6, 7, 8});
ParallelList<Integer> list = new ParallelList<Integer>(list1, list2);
for (List<Integer> ints : list) {
    System.out.println(String.format("%s, %s", ints.get(0), ints.get(1)));
}

Which would print out:

1, 6
2, 7
3, 8
4, null
5, null

This object supports lists of variable lengths, but clearly it could be modified to be more strict.

Unfortunately I couldn't get rid of one compiler warning on the ParallelList constructor: A generic array of List<Integer> is created for varargs parameters, so if anyone knows how to get rid of that, let me know :)

查看更多
可以哭但决不认输i
6楼-- · 2020-02-07 18:12

ArrayIterator lets you avoid indexing, but you can’t use a for-each loop without writing a separate class or at least function. As @Alexei Blue remarks, official recommendation (at The Collection Interface) is: “Use Iterator instead of the for-each construct when you need to: … Iterate over multiple collections in parallel.”:

import static com.google.common.base.Preconditions.checkArgument;
import org.apache.commons.collections.iterators.ArrayIterator;

// …

  checkArgument(array1.length == array2.length);
  Iterator it1 = ArrayIterator(array1);
  Iterator it2 = ArrayIterator(array2);
  while (it1.hasNext()) {
      doStuff(it1.next());
      doOtherStuff(it2.next());
  }

However:

  • Indexing is natural for arrays – an array is by definition something you index, and a numerical for loop, as in your original code, is perfectly natural and more direct.
  • Key-value pairs naturally form a Map, as @Isaac Truett remarks, so cleanest would be to create maps for all your parallel arrays (so this loop would only be in the factory function that creates the maps), though this would be inefficient if you just want to iterate over them. (Use Multimap if you need to support duplicates.)
  • If you have a lot of these, you could (partially) implement ParallelArrayMap<> (i.e., a map backed by parallel arrays), or maybe ParallelArrayHashMap<> (to add a HashMap if you want efficient lookup by key), and use that, which allows iteration in the original order. This is probably overkill though, but allows a sexy answer.

That is:

Map<T, U> map = new ParallelArrayMap<>(array1, array2);
for (Map.Entry<T, U> entry : map.entrySet()) {
  doStuff(entry.getKey());
  doOtherStuff(entry.getValue());
}

Philosophically, Java style is to have explicit, named types, implemented by classes. So when you say “[I have] parallel arrays [that] store key/value pairs.”, Java replies “Write a ParallelArrayMap class that implements Map (key/value pairs) and that has a constructor that takes parallel arrays, and then you can use entrySet to return a Set that you can iterate over, since Set implements Collection.” – make the structure explicit in a type, implemented by a class.

For iterating over two parallel collections or arrays, you want to iterate over a Iterable<Pair<T, U>>, which less explicit languages allow you to create with zip (which @Isaac Truett calls wrap). This is not idiomatic Java, however – what are the elements of the pair? See Java: How to write a zip function? What should be the return type? for an extensive discussion of how to write this in Java and why it’s discouraged.

This is exactly the stylistic tradeoff Java makes: you know exactly what type everything is, and you have to specify and implement it.

查看更多
劫难
7楼-- · 2020-02-07 18:18

You can use a second constraint in your for loop:

    for (int i = 0; i < list1.length && i < list2.length; ++i) 
    {
      doStuff(list1[i]);
      doStuff(list2[i]);
    }//for

One of my preferred methods for traversing collections is the for-each loop, but as the oracle tutorial mentions, when dealing with parallel collections to use the iterator rather than the for-each.

The following was an answer by Martin v. Löwis in a similar post:

it1 = list1.iterator();
it2 = list2.iterator();
while(it1.hasNext() && it2.hasNext()) 
{
   value1 = it1.next();
   value2 = it2.next();

   doStuff(value1);
   doStuff(value2);
}//while

The advantage to the iterator is that it's generic so if you don't know what the collections are being used, use the iterator, otherwise if you know what your collections are then you know the length/size functions and so the regular for-loop with the additional constraint can be used here. (Note I'm being very plural in this post as an interesting possibility would be where the collections used are different e.g. one could be a List and the other an array for instance)

Hope this helped.

查看更多
登录 后发表回答