Convert a simple one-line string to an RDD in Spark

Posted 2020-05-30 04:11

I have a simple line:

line = "Hello, world"

I would like to convert it to an RDD with only one element. I have tried

sc.parallelize(line)

But I get:

sc.parallelize(line).collect()
['H', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']

Any ideas?

2 answers
chillily
#2 · 2020-05-30 04:22

Try passing a List as the parameter (Scala):

sc.parallelize(List(line)).collect()

It returns:

res1: Array[String] = Array(Hello, world)
萌系小妹纸
#3 · 2020-05-30 04:26

The code below works in Python:

sc.parallelize([line]).collect()

['Hello, world']

Here the string "line" is wrapped in a list, so the whole string is the list's single element and therefore the RDD's single element.
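To see why the original call split the string, here is a small sketch in plain Python (no Spark required). It assumes that sc.parallelize builds the RDD by iterating over whatever collection it is given, which matches the output shown in the question; the variable names as_chars and as_single are only for illustration.

```python
line = "Hello, world"

# A str is itself an iterable of one-character strings, so iterating over it
# (as sc.parallelize(line) appears to do) yields one element per character.
as_chars = list(line)

# Wrapping the string in a list gives an iterable with exactly one element,
# which is what sc.parallelize([line]) receives.
as_single = list([line])

print(as_chars)    # ['H', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']
print(as_single)   # ['Hello, world']
print(len(as_chars), len(as_single))  # 12 1
```

The same reasoning applies to any iterable passed directly: to get it as one element, wrap it in a list first.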
