How to create custom list accumulator, i.e. List[(

2019-07-13 08:48发布

问题:

I am trying to use custom accumulator in Apache Spark to accumulate pairs in a list. The result should have List[(Int, Int)] type. For this I creat custom accumulator:

import org.apache.spark.AccumulatorParam

class AccumPairs extends AccumulatorParam[List[(Int,Int)]] {

    def zero(initialValue: List[(Int,Int)]): List[(Int,Int)] = {
      List()
    }

    def addInPlace(l1: List[(Int,Int)], l2: List[(Int,Int)]): List[(Int,Int)] = {
      l1 ++ l2
    }

 }

Yet I can not instantiate variable of this type.

val pairAccum = sc.accumulator(new List():List[(Int,Int)])(AccumPairs)

results in error. Please help.

回答1:

This one works:

val pairAccum = sc.accumulator(List[(Int,Int)]())( new AccumPairs)


回答2:

A class without parameters doesn't make much sense (if at all) as you "implicitly" create a single value anyway1. Change the keyword class to object and your example will work.

Change

class AccumPairs extends AccumulatorParam[List[(Int,Int)]] {

to

object AccumPairs extends AccumulatorParam[List[(Int,Int)]] {

[1] You still could create multiple instances of the class but they effectively be alike.