Spark RDD mapping one row of data into multiple rows

Posted 2019-05-11 19:39

I have a text file with data that look like this:

Type1 1 3 5 9
Type2 4 6 7 8
Type3 3 6 9 10 11 25

I'd like to transform it into an RDD with rows like this:

1 Type1
3 Type1
3 Type3
......

I started with a case class:

case class MyData(uid: Int, gid: String)

I'm new to Spark and Scala, and I can't seem to find an example that does this.

1 answer
欢心
#2 · 2019-05-11 20:21

It seems you want something like this?

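rdd.flatMap { line =>
  // The first token is the group id; every remaining token is a uid.
  line.split(' ').toList match {
    // Emit one MyData per uid, so one input line becomes many rows.
    case gid :: rest => rest.map(x => MyData(x.toInt, gid))
    // Only here to keep the match exhaustive.
    case Nil => Nil
  }
}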
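
To try the idea without a cluster, here is a minimal sketch that runs the same per-line function over the sample data from the question, using a plain Scala `Seq` in place of the RDD. The names `ExplodeRows`, `explode`, `lines` and `rows` are made up for illustration, and the whitespace regex and blank-line handling are assumptions about the input file rather than anything stated in the question.

```scala
// Same shape as the case class in the question.
final case class MyData(uid: Int, gid: String)

object ExplodeRows {
  // Turn one line such as "Type1 1 3 5 9" into one MyData per number.
  def explode(line: String): List[MyData] =
    line.trim.split("\\s+").toList match {
      // First token is the group id, the remaining tokens are uids.
      case gid :: rest if gid.nonEmpty => rest.map(x => MyData(x.toInt, gid))
      // Blank line: emit nothing.
      case _ => Nil
    }

  // A plain Seq stands in for the RDD here. With Spark, the equivalent
  // call would be sc.textFile(path).flatMap(explode), where path is a
  // placeholder for the real input location.
  val lines: Seq[String] =
    Seq("Type1 1 3 5 9", "Type2 4 6 7 8", "Type3 3 6 9 10 11 25")

  val rows: Seq[MyData] = lines.flatMap(explode)
}
```

Because `flatMap` flattens the list returned for each line, the three input lines become 14 `MyData` rows. To get the uid ordering shown in the question, a sort by `uid` could follow, for example `rows.sortBy(_.uid)`; RDDs offer a `sortBy` method as well.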