C# split a list into all combinations of n groups

2019-07-18 04:05发布

问题:

There is a great implementation of the algorithm I am after here (by @lazy dog). However, I need this in c# and the conversion is not trivial due to C#'s lack of yield from and perhaps my own thickheadedness.

Here is what I currently have:

public static IEnumerable<ArrayList> sorted_k_partitions(int[] seq, int k) {
  var n = seq.Length;
  var groups = new ArrayList(); //a list of lists, currently empty

  IEnumerable<ArrayList> generate_partitions(int i) {
    if (i >= n) {
      // this line was the bug, was not creating a 
      //   deep clone of the list of lists
      // yield return new ArrayList(groups); 
      yield return new ArrayList(groups.ToArray().Select(g => ((List<int>)g).ToList())); 
      // Ugly but that is because we are using ArrayList
      // Using proper List<List<int>> cleans this up significantly
    }
    else {
      if (n - i > k - groups.Count)
        foreach (List<int> group in new ArrayList(groups)) {
          group.Add(seq[i]);
          // yield from generate_partitions(i + 1);
          foreach (var d in generate_partitions(i + 1)) {
            yield return d;
          }
          group.RemoveAt(group.Count - 1);
        }

      if (groups.Count < k) {
        groups.Add(new List<int> {seq[i]});
        foreach (var d in generate_partitions(i + 1)) {
          // things start breaking down here, as this yield return
          //  appears to release flow control and we then get the 
          //  yield return above.  I have debuged this and the python
          //  version and the python version does not do this.  Very hard
          //  to explain considering I don't fully understand it myself
          yield return d;
        }
        groups.RemoveAt(groups.Count - 1);
      }
    }
  }
  return generate_partitions(0);
  // don't worry about the sorting methods in the python 
  //   version, not needed
}

Can anyone see any obvious mistakes, I'm sure my lack of understanding of Python's yield from and coroutines is hurting me here.

Edit: Found the bug, added comments above

回答1:

Ok so a good working solution I have come up with is here:

public static IEnumerable<List<List<int>>> CreatePartitions(int[] seq, int k) {
  var n = seq.Length;
  var groups = new List<List<int>>(); 

  IEnumerable<List<List<int>>> generate_partitions(int i) {
    if (i >= n) {
      yield return new List<List<int>>(groups.Select(g => g.ToList()));
    }
    else {
      if (n - i > k - groups.Count)
        foreach (var group in new List<List<int>>(groups)) {
          group.Add(seq[i]);
          foreach (var d in generate_partitions(i + 1)) {
            yield return d;
          }
          group.RemoveAt(group.Count - 1);
        }

      if (groups.Count < k) {
        groups.Add(new List<int> {seq[i]});
        foreach (var d in generate_partitions(i + 1)) {
          yield return d;
        }
        groups.RemoveAt(groups.Count - 1);
      }
    }
  }
  return generate_partitions(0);
}

This is a bit faster than python as you would expect but still not great. I experimented with parallelisation but did not get too far. I also experimented by trying to remove some of the object creations and using Array.Copy instead. The mess that created was not worth the negligible performance improvements. I guess this is just slow because as numbers get bigger (say seq of 15-20 items) then the number of combinations is just enormous and no optimisations can help to make that into a more tractable problem.



回答2:

What behaviour do you get here?

In my opinion yield return generate_partitions(i + 1); instead of the foreach loop should work fine. It will just call the function recursive with the new value i+1