How to make an RDD of tuples in Spark?

Posted 2019-09-09 05:54

Question:

I want to create an RDD of tuples like the following:

List(<1, a0>,<2, a0>, <3, a0>, ..., <100000, a0>)

The key of each tuple runs from 1 to 100000,

and the value of each tuple is a constant number a0.

How can I achieve this? I only know how to use

val list = sc.makeRDD(List(1 to 100000))

to create a list of numbers. But how do I create the tuple list described above?

Answer 1:

To have Spark add the constant and build the tuples on the executors:

val list = sc.makeRDD((1 to 100000)).map((_, a0))

To create the tuples on the driver machine before sending the data to Spark:

val list = sc.makeRDD((1 to 100000).map((_, a0)))
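For anyone who wants to try both variants end to end, here is a minimal sketch. It assumes Spark is on the classpath and runs in local mode; the object name TupleRddSketch, the app name, and the value 42 for a0 are made up for illustration, since the question does not say what a0 is.

```scala
import org.apache.spark.sql.SparkSession

object TupleRddSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption); a real job would
    // normally take its master from spark-submit.
    val spark = SparkSession.builder()
      .appName("tuple-rdd-sketch")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Hypothetical constant; the question does not give its value.
    val a0 = 42

    // Variant 1: distribute the keys, then pair each key with a0 on the executors.
    val onExecutors = sc.makeRDD(1 to 100000).map(k => (k, a0))

    // Variant 2: build the tuples on the driver, then distribute them.
    val onDriver = sc.makeRDD((1 to 100000).map(k => (k, a0)))

    // Both RDDs should hold one tuple per key.
    assert(onExecutors.count() == 100000L)
    assert(onDriver.count() == 100000L)

    // Print a few tuples to inspect the result.
    onExecutors.take(3).foreach(println)

    spark.stop()
  }
}
```

Both variants pass the range itself to makeRDD. Writing List(1 to 100000), as in the question, wraps the whole range as a single list element, so the resulting RDD would contain one Range rather than 100000 numbers.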