I am writing a Spark application. Each element of my RDD contains a good amount of data, and I want to save each element to its own HDFS file. I tried rdd.saveAsTextFile("foo.txt"), but that writes the whole RDD to a single output. My RDD has 10 elements, so I want 10 files in HDFS. How can I achieve this?
If I understand your question correctly, you can create a custom output format like this:
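A minimal sketch of such a format, built on Hadoop's `MultipleTextOutputFormat` from the old `org.apache.hadoop.mapred` API (the class name `RDDMultipleTextOutputFormat` is just an illustrative choice):

```scala
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat

// Routes each (key, value) pair to a file named after its key.
class RDDMultipleTextOutputFormat extends MultipleTextOutputFormat[Any, Any] {
  // Use the key as the output file name, relative to the output directory.
  override def generateFileNameForKeyValue(key: Any, value: Any, name: String): String =
    key.asInstanceOf[String]

  // Suppress the key in the file contents, so only the value is written.
  override def generateActualKey(key: Any, value: Any): Any =
    NullWritable.get()
}
```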
Then convert your RDD into a key/value one where the key is the file name, and use the saveAsHadoopFile function instead of saveAsTextFile, like this:
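For example, keying each element by its index (the output directory `/user/foo/out` and the custom format class name `RDDMultipleTextOutputFormat` are placeholder names for this sketch):

```scala
// Assumes `sc` is an existing SparkContext and `rdd` is the original RDD
// of 10 large elements. Each element's key becomes its file name.
val keyed = rdd.zipWithIndex().map { case (element, i) =>
  (s"part-$i", element.toString)
}

keyed.saveAsHadoopFile(
  "/user/foo/out",                      // HDFS output directory
  classOf[String],                      // key class
  classOf[String],                      // value class
  classOf[RDDMultipleTextOutputFormat]  // the custom output format
)
// Produces /user/foo/out/part-0 through /user/foo/out/part-9,
// one file per RDD element.
```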