How to remove NULL and empty for a particular colu

Posted 2020-08-03 12:57

Question:

I would like to remove records from a DataFrame where demo_name is NULL or empty.

demo_name is a column of String type in that DataFrame.

I am trying the code below. I want to apply trim because some records have a demo_name that consists only of spaces.

   val filterDF = demoDF.filter($"demo_name".isNotNull && $"demo_name".trim != "" )

But I get the error "cannot resolve symbol trim".

Could someone help me fix this issue?

Answer 1:

You are calling trim as if you were acting on a String, but $ uses an implicit conversion to turn the column name into a Column instance. The problem is that Column doesn't have a trim method.

You need to import the library functions and apply them to your column:

import org.apache.spark.sql.functions._

demoDF.filter($"demo_name".isNotNull && length(trim($"demo_name")) > 0)

Here I use the library functions trim and length: trim strips the leading and trailing spaces, and length then verifies that anything is left in the result.
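
To see the filter in action, here is a minimal, self-contained sketch. The sample rows, the id column, the object name FilterBlankDemo, and the helper dropBlank are all hypothetical and only serve the illustration; the local SparkSession setup is an assumption about how you might run it. It also shows a variant that compares the trimmed column against "" using Column's =!= operator, which should keep the same rows.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, length, trim}

object FilterBlankDemo {

  // Hypothetical helper: keep rows whose column is neither NULL nor blank.
  def dropBlank(df: DataFrame, column: String): DataFrame =
    df.filter(col(column).isNotNull && length(trim(col(column))) > 0)

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough for this illustration.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("filter-blank-demo")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data covering NULL, empty, spaces-only, and padded values.
    val demoDF = Seq(
      (1, "alice"),
      (2, null),
      (3, ""),
      (4, "   "),
      (5, " bob ")
    ).toDF("id", "demo_name")

    // Same filter as in the answer: rows 1 and 5 remain.
    // The stored value is not modified, so " bob " keeps its padding.
    val filterDF = dropBlank(demoDF, "demo_name")
    filterDF.show()

    // Variant: compare the trimmed column with "" using =!= (Column inequality).
    // A plain != would be ordinary Scala inequality, not a Column expression.
    val altDF = demoDF.filter($"demo_name".isNotNull && trim($"demo_name") =!= "")
    altDF.show()

    spark.stop()
  }
}
```

Note that the filter only decides which rows to keep. If you also want the stored values trimmed, you would need an extra step such as withColumn("demo_name", trim($"demo_name")).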