I am trying to use the length function inside a substring function in a DataFrame, but it gives an error:
val substrDF = testDF.withColumn("newcol", substring($"col", 1, length($"col")-1))
Below is the error:
error: type mismatch;
found : org.apache.spark.sql.Column
required: Int
I am using Spark 2.1.
If all you want is to remove the last character of the string, you can do that without a UDF, by using regexp_replace. You could also use $"COLUMN".substr, which has an overload that accepts Column arguments for both the start position and the length.
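The two options above can be sketched as follows. This is a minimal illustration rather than the answerer's original code: the object name DropLastChar, the helper names, and the sample rows are assumptions, and col("col") is used in place of $"col" only so the helpers do not need spark.implicits. The pattern ".$" is meant for single-line strings.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, length, lit, regexp_replace}

object DropLastChar {
  // Option 1: replace the final character (".$") with an empty string.
  def viaRegex(df: DataFrame): DataFrame =
    df.withColumn("newcol", regexp_replace(col("col"), ".$", ""))

  // Option 2: Column.substr has an overload taking two Columns,
  // so the start position is wrapped in lit(...).
  def viaSubstr(df: DataFrame): DataFrame =
    df.withColumn("newcol", col("col").substr(lit(1), length(col("col")) - 1))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("drop-last-char")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for the question's testDF.
    val testDF = Seq("abcd", "xyz").toDF("col")

    viaRegex(testDF).show()
    viaSubstr(testDF).show()
    spark.stop()
  }
}
```

Both variants stay inside Spark's built-in column expressions, so no UDF is involved.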
You get that error because the signature of substring is substring(str: Column, pos: Int, len: Int): Column. The len argument that you are passing is a Column, and it should be an Int. You may want to implement a simple UDF to solve that problem.
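A simple UDF along those lines might look like the sketch below. The names DropLastCharUdf and dropLast, the null and empty-string handling, and the sample rows are assumptions added for illustration, not part of the original answer.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.{col, udf}

object DropLastCharUdf {
  // Hypothetical UDF: returns the input without its last character.
  // Null and empty inputs are passed through unchanged.
  val dropLast: UserDefinedFunction = udf { (s: String) =>
    if (s == null || s.isEmpty) s else s.substring(0, s.length - 1)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("drop-last-char-udf")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for the question's testDF.
    val testDF = Seq("abcd", "xyz").toDF("col")

    val substrDF = testDF.withColumn("newcol", dropLast(col("col")))
    substrDF.show()
    spark.stop()
  }
}
```

A UDF is opaque to Spark's optimizer, so the built-in alternatives are usually preferable when they are sufficient.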
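The function expr can be used, because the length argument of substring inside a SQL expression string may itself be an expression.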
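A minimal sketch of the expr approach, assuming the same column name as in the question. The object name, the helper name viaExpr, and the sample rows are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.expr

object DropLastCharExpr {
  // The whole call is written as a SQL expression string, where
  // length(col) - 1 is evaluated per row.
  def viaExpr(df: DataFrame): DataFrame =
    df.withColumn("newcol", expr("substring(col, 1, length(col) - 1)"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("drop-last-char-expr")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for the question's testDF.
    val testDF = Seq("abcd", "xyz").toDF("col")

    viaExpr(testDF).show()
    spark.stop()
  }
}
```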