val columnName = Seq("col1", "col2", ....."coln")
Is there a way to use dataframe.select to get a DataFrame containing only the specified columns?
I know I can call dataframe.select("col1","col2"...), but columnName is generated at runtime. I could call dataframe.select() repeatedly for each column name in a loop. Would that have any performance overhead? Is there a simpler way to accomplish this?
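import org.apache.spark.sql.functions.col // needed for the Column-based variant
val columnNames = Seq("col1","col2",....."coln")
// using the string column names (select takes a first name plus varargs, hence head/tail):
val result = dataframe.select(columnNames.head, columnNames.tail: _*)
// or, equivalently, using Column objects:
val result2 = dataframe.select(columnNames.map(c => col(c)): _*)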
Since dataFrame.select() accepts a variable number of Column arguments and we have a sequence of strings, we first map each name to a Column with col, then expand the resulting sequence into varargs with : _*. The expression columnName.map(name => col(name)): _* does both steps, so it can be passed directly as the argument to select():
import org.apache.spark.sql.functions.col
val columnName = Seq("col1", "col2")
val DFFiltered = DF.select(columnName.map(name => col(name)): _*)
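For anyone who wants to try this end to end, here is a minimal sketch that puts both variants side by side. It assumes a local SparkSession; the object name SelectByNames, the helper names selectByStrings and selectByColumns, and the sample data are made up for illustration and are not part of Spark's API.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object SelectByNames {
  // Hypothetical helper: uses select(col: String, cols: String*).
  // names must be non-empty, otherwise names.head throws.
  def selectByStrings(df: DataFrame, names: Seq[String]): DataFrame =
    df.select(names.head, names.tail: _*)

  // Hypothetical helper: uses select(cols: Column*).
  // Each name is mapped to a Column, then the Seq is expanded into varargs.
  def selectByColumns(df: DataFrame, names: Seq[String]): DataFrame =
    df.select(names.map(name => col(name)): _*)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("select-by-names")
      .getOrCreate()
    import spark.implicits._

    // Sample data (assumed); col2 should be dropped by the select.
    val df = Seq((1, "a", true), (2, "b", false)).toDF("col1", "col2", "col3")
    val columnNames = Seq("col1", "col3") // stands in for names built at runtime

    selectByStrings(df, columnNames).show()
    selectByColumns(df, columnNames).show()

    spark.stop()
  }
}
```

Either helper issues a single select call, so there is no need to loop over the column names.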