I have a DataFrame with an array of bytes in Spark (Python):
DF.select(DF.myfield).show(1, False)
+----------------+
|myfield |
+----------------+
|[00 8F 2B 9C 80]|
+----------------+
I'm trying to convert this array to a string:
'008F2B9C80'
and then to the numeric value:
int('008F2B9C80',16)/1000000
> 2402.0
I have found some udf samples, so I can already extract part of the array like this:
from pyspark.sql import functions as f
# hex-format one byte of the array (here the byte at index 1)
u = f.udf(lambda a: format(a[1],'x'))
DF.select(u(DF['myfield'])).show()
+------------------+
|<lambda>(myfield) |
+------------------+
|                8f|
+------------------+
Now, how do I iterate over the whole array? Is it possible to do all the operations I need inside the udf function?
Maybe there is a better way to do the cast?
Thanks for your help.
Here is the Scala df solution. You need to import BigInteger (java.math.BigInteger; Scala's own wrapper is scala.math.BigInt).
There is no Spark SQL type equivalent to BigInteger, so I'm converting the udf() result to a string.
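A minimal sketch of that approach (the udf name toNum, the output column numeric, and a df with a binary column myfield are illustrative assumptions):

import java.math.BigInteger
import org.apache.spark.sql.functions.{col, udf}

// hex-encode the byte array, parse it as a big integer, scale it,
// and return a string, since Spark has no BigInteger column type
val toNum = udf { (bytes: Array[Byte]) =>
  val hex = bytes.map(b => f"${b & 0xFF}%02X").mkString
  (new BigInteger(hex, 16).doubleValue / 1000000).toString
}

df.select(toNum(col("myfield")).alias("numeric")).show(false)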
I came across this question while answering your newest one.
Suppose you have the df as follows.
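One way to build it (a sketch: the sample value comes from the question, and an active SparkSession named spark is assumed):

df = spark.createDataFrame([(bytearray(b'\x00\x8f\x2b\x9c\x80'),)], ['myfield'])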
Now you can use the following lambda function:
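For example (a sketch that packs the whole conversion into one udf; the return type is declared so Spark yields a numeric column):

from pyspark.sql import functions as f
from pyspark.sql.types import DoubleType

# hex-encode every byte, parse the hex string as an integer, then scale
u = f.udf(lambda a: int(''.join(format(x, '02x') for x in a), 16) / 1000000.0, DoubleType())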
And to create the output, use:
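For instance (the alias numeric is an arbitrary choice):

df.select(u(df['myfield']).alias('numeric')).show()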
This yields:
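+-------+
|numeric|
+-------+
| 2402.0|
+-------+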
I have found a Python solution too.
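Something along these lines (a sketch, reusing DF and myfield from the question; bytes(a).hex() needs Python 3.5+):

from pyspark.sql import functions as f

# hex-encode the whole array at once, then parse and scale
u = f.udf(lambda a: int(bytes(a).hex(), 16) / 1000000.0)
DF.select(u(DF['myfield'])).show()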
I'm now able to benchmark the two solutions.
Thank you for your valuable help.