I have a DataFrame in PySpark with more than 100 columns. For every column name, I would like to add backticks (`) at the start and at the end of the name.
For example:
The column name is testing user; I want `testing user`.
Is there a method to do this in PySpark/Python? When the code is applied, it should return a DataFrame.
Use a list comprehension in Python.
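For example, something along these lines (a minimal sketch; the SparkSession and the sample DataFrame are assumptions, not part of the original answer):

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical stand-in for the 100+ column DataFrame in the question
df = spark.createDataFrame([(1, "x"), (2, "y")], ["id", "testing user"])

# Wrap every column name in backticks via alias()
df_backticked = df.select([F.col(c).alias("`" + c + "`") for c in df.columns])

print(df_backticked.columns)  # each name is now wrapped in backticks, e.g. '`testing user`'
```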
This method also gives you the option to add custom Python logic within the alias() function, for example:
"prefix_"+c+"_suffix" if c in list_of_cols_to_change else c
You can use the withColumnRenamed method of the DataFrame to create a new DataFrame with the renamed columns.

Edit: suppose you have a list of columns; you can do it like this:
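A minimal sketch (the DataFrame, the column list, and the "prefix_" string are assumptions for illustration):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical DataFrame and list of columns to rename
df = spark.createDataFrame([(1, "a", "b")], ["id", "col_a", "col_b"])
columns_to_rename = ["col_a", "col_b"]

# Rename each listed column, producing a new DataFrame each time
for c in columns_to_rename:
    df = df.withColumnRenamed(c, "prefix_" + c)

print(df.columns)
```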
Output (with the hypothetical columns above):
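```
['id', 'prefix_col_a', 'prefix_col_b']
```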
If you would like to add a prefix or suffix to multiple columns in a pyspark dataframe, you could use a for loop and .withColumnRenamed().
As an example, you might do something like the following:
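The following is a sketch; sdf stands in for your Spark DataFrame, and the "_total" suffix is just an illustration:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Stand-in DataFrame; replace with your own
sdf = spark.createDataFrame([(1, 2)], ["apples", "oranges"])

# Rename every column by appending a suffix
for c in sdf.columns:
    sdf = sdf.withColumnRenamed(c, c + "_total")

print(sdf.columns)  # ['apples_total', 'oranges_total']
```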
You can amend sdf.columns as you see fit.
I had a dataframe that I duplicated twice and then joined together. Since both copies had the same column names, I used:
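Presumably something along these lines (a sketch, not the original code; select() with alias() is one way to add the suffix, and the example DataFrame is made up):

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Stand-in for one of the two duplicated DataFrames
df = spark.createDataFrame([(1, 10.0)], ["id", "price"])

# Append '_prec' to every column name of this copy so the join keeps the two sides distinct
df_prec = df.select([F.col(c).alias(c + "_prec") for c in df.columns])

print(df_prec.columns)  # ['id_prec', 'price_prec']
```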
Every column in my dataframe then had the '_prec' suffix, which allowed me to do sweet stuff.