How to create a dataframe from two other dataframes

Posted 2019-09-26 08:16

Question:

I have these two dataframe objects, with a single column each:

a = predictons_lr.select('prediction')
b = predictions_nb.select('prediction')

I would like to create a single resulting dataframe having a and b as columns. I have tried:

df_result = spark.createDataFrame([a, b])

but I get this error:

AssertionError: dataType py4j.java_gateway.JavaMember object at 0x000002260F3D4240 should be an instance of class 'pyspark.sql.types.DataType'

Is there an efficient way to create a dataframe of this kind?

Answer 1:

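If the two columns have the same data type, you can simply union them. Note that union appends the rows of b below the rows of a, so the result still has a single column: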

a = predictons_lr.select('prediction')
b = predictions_nb.select('prediction')

new_df = a.union(b)