I am using Spark 1.6.1. I have a requirement to run a DataFrame query in a loop:
for ( i <- List ('a','b')){
val i = sqlContext.sql("SELECT i, col1, col2 FROM DF1")}
I want this DataFrame to be executed twice (once with i = a and once with i = b).
Your code is almost correct, except for two things:
1. i is already used as the loop variable in your for loop, so don't reuse it in val i = ...
2. To use the value of i inside a string, use string interpolation (the s"..." prefix with $i).
So your code should look like:
for (i <- List ('a','b')) {
val df = sqlContext.sql(s"SELECT $i, col1, col2 FROM DF1")
df.show()
}
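To see what the interpolated query text looks like without needing a Spark context, here is a minimal sketch in plain Scala. The object name InterpolationSketch and the helper queryFor are hypothetical and only serve the illustration; the query string itself is the one from the loop above.

```scala
object InterpolationSketch {
  // Hypothetical helper: builds the query text for one loop value.
  // The s prefix makes Scala substitute the value of i for $i.
  def queryFor(i: Char): String = s"SELECT $i, col1, col2 FROM DF1"

  def main(args: Array[String]): Unit = {
    // Print the query that would be passed to sqlContext.sql on each iteration.
    for (i <- List('a', 'b')) {
      println(queryFor(i))
    }
  }
}
```

Note that the substituted value ends up as a bare identifier in the SQL, so it is read as a column name. If the intent were a literal value instead (an assumption, the question does not say), the string would need quotes around it, for example s"SELECT '$i' AS i, col1, col2 FROM DF1".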
EDIT after author comment:
You can do this with a .map and then a .reduceLeft:
// All your dataframes
val dfs = Seq('a','b').map { i =>
sqlContext.sql(s"SELECT $i, col1, col2 FROM DF1")
}
// Then you can reduce your dataframes into one
val unionDF = dfs.reduceLeft((dfa, dfb) =>
dfa.unionAll(dfb)
)
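The map-then-reduce pattern can be tried out without Spark. The sketch below is only an illustration under stated assumptions: a plain Seq[String] stands in for a DataFrame, ++ stands in for unionAll, and the names ReduceSketch, rowsFor and combine are hypothetical. It uses reduceLeftOption rather than reduceLeft because reduceLeft throws an exception on an empty collection.

```scala
object ReduceSketch {
  // Stand-in for one query result: each "row" is just a String here.
  def rowsFor(i: Char): Seq[String] = Seq(s"$i-row1", s"$i-row2")

  // Build one result per key, then combine them pairwise from left to right.
  // Returns None when there are no keys, instead of throwing.
  def combine(keys: Seq[Char]): Option[Seq[String]] =
    keys
      .map(k => rowsFor(k))
      .reduceLeftOption((acc: Seq[String], next: Seq[String]) => acc ++ next)

  def main(args: Array[String]): Unit = {
    println(combine(Seq('a', 'b')))
    println(combine(Seq.empty[Char]))
  }
}
```

With real DataFrames the combining function would be dfa.unionAll(dfb) as shown above. unionAll is the Spark 1.x name; from Spark 2.0 onward it is deprecated in favor of union.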