Using 'case' in PairRDDFunctions.reduceByK

Posted 2019-01-18 16:08

Question:

This is the signature of the reduceByKey method:

def reduceByKey(func: (V, V) ⇒ V): RDD[(K, V)] 

In a word count program I am practicing with, I see this code:

val counts = words.map(word => (word, 1)).reduceByKey{case (x, y) => x + y}

The application also works with (x, y) instead of case (x, y). What is the use of case here? I also checked the answer from @ghik here, but I was not able to understand it.

Answer 1:

Scala supports multiple ways of defining anonymous functions. The "case" version is a so-called pattern matching anonymous function, which is more or less equivalent to:

(x: Int, y: Int) => (x, y) match { case (x, y) => x + y }

while the version without case is pretty much what it looks like:

(x: Int, y: Int) => x + y

In this case, though, a simple _ + _ would be enough:

val counts = words.map(word => (word, 1)).reduceByKey(_ + _)
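To see that these forms behave the same, here is a minimal sketch that runs without Spark. The helper reduceByKeyLocal and the object name CaseForms are hypothetical stand-ins, not part of the Spark API; the helper only mimics the shape of reduceByKey on a plain Seq of pairs, and the sample words are made up for illustration.

```scala
object CaseForms {
  // Hypothetical local stand-in for reduceByKey so the sketch needs no Spark.
  // It groups the pairs by key and combines each key's values with func.
  def reduceByKeyLocal[K, V](pairs: Seq[(K, V)])(func: (V, V) => V): Map[K, V] =
    pairs.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2).reduce(func) }

  // Made-up sample input.
  val words: Seq[String] = Seq("a", "b", "a", "c", "a", "b")
  val pairs: Seq[(String, Int)] = words.map(word => (word, 1))

  // Pattern matching anonymous function, as in the question.
  val withCase: Map[String, Int] =
    reduceByKeyLocal(pairs) { case (x, y) => x + y }

  // Roughly what the "case" form expands to.
  val expanded: Map[String, Int] =
    reduceByKeyLocal(pairs)((x, y) => (x, y) match { case (a, b) => a + b })

  // Plain two-argument anonymous function.
  val withoutCase: Map[String, Int] =
    reduceByKeyLocal(pairs)((x, y) => x + y)

  // Placeholder syntax.
  val withPlaceholders: Map[String, Int] =
    reduceByKeyLocal(pairs)(_ + _)
}
```

All four values hold the same counts, so for a simple sum the choice between them is a matter of style.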

Probably the simplest case where you can benefit from pattern matching is when you deal with Scala Options:

(x: Option[Int], y: Option[Int]) => (x, y) match {
    case (Some(xv), Some(yv)) => xv + yv
    case (Some(xv), _) => xv
    case (_, Some(yv)) => yv
    case _ => 0
}
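The same function can be written directly in the "case" form when the expected type is known. The sketch below assumes hypothetical names (OptionForms, sumOptions, mergeOptions). Note that reduceByKey needs a function of type (V, V) => V, so for values of type Option[Int] the result would also have to be an Option[Int]; mergeOptions is one possible variant of that shape, shown here with reduce on a plain Seq rather than on an RDD.

```scala
object OptionForms {
  // The function above, written as a pattern matching anonymous function.
  val sumOptions: (Option[Int], Option[Int]) => Int = {
    case (Some(xv), Some(yv)) => xv + yv
    case (Some(xv), _)        => xv
    case (_, Some(yv))        => yv
    case _                    => 0
  }

  // Hypothetical variant with the (V, V) => V shape that reduceByKey expects
  // when V is Option[Int]: a missing value is ignored, two present values are added.
  val mergeOptions: (Option[Int], Option[Int]) => Option[Int] = {
    case (Some(xv), Some(yv)) => Some(xv + yv)
    case (Some(xv), None)     => Some(xv)
    case (None, y)            => y
  }

  // Made-up sample values, reduced locally instead of per key on an RDD.
  val values: Seq[Option[Int]] = Seq(Some(1), None, Some(4))
  val merged: Option[Int] = values.reduce(mergeOptions)
}
```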