I have two Hive UDFs in Java which work perfectly well in Hive.
The two functions are complementary (each one reverses the other):
String myUDF(BigInt)
BigInt myUDFReverso(String)
myUDF("myInput") gives some output, and myUDFReverso(myUDF("myInput")) should give back myInput.
This works in Hive, but when I use the same UDFs in Impala (version 1.2.4), myUDF(BigInt) gives the expected answer (the printed result is correct), while passing that result to myUDFReverso(String) does not give back the original input.
I have noticed that length(myUDF("myInput")) is wrong in Impala 1.2.4: it is one greater than expected for every row. It is correct in Hive and also in Impala 2.1.
So I assume some extra (special) character is being appended to the output of myUDF in Impala 1.2.4 (precisely, at the end of the Text value returned from the UDF).
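To check this assumption, the string that myUDFReverso actually receives could be dumped code unit by code unit, so that an invisible trailing character becomes visible. The following is only a sketch: the class name StringTailInspector and the method toCodeUnits are hypothetical helpers, not part of Hive or Impala, and would have to be called from a temporary debugging UDF.

```java
// Hypothetical debugging helper; the class and method names are illustrative only.
final class StringTailInspector {
    private StringTailInspector() {}

    /** Renders every UTF-16 code unit of s as U+XXXX so invisible characters show up. */
    static String toCodeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }
}
```

If the last entry in the output is something like U+0000, that would support the idea that a trailing NUL (or another control character) is being appended.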
I have built a similar UDF for Impala 1.2.4 in C++ and it works correctly.
All these issues are resolved in Impala 2.1 but I cannot upgrade my cluster to it.
So how do I work around this bug?
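One possible workaround, sketched under two assumptions that are not confirmed here: the extra character is a trailing control character (for example NUL), and valid output of myUDF never ends in one. Under those assumptions, myUDFReverso could strip trailing control characters from its input before decoding it. UdfInputSanitizer and stripTrailingControlChars are hypothetical names.

```java
// Hypothetical defensive helper for the reverse UDF; names are illustrative only.
final class UdfInputSanitizer {
    private UdfInputSanitizer() {}

    /**
     * Removes trailing control characters (code units below U+0020) and nothing else.
     * Assumption: legitimate myUDF output never ends in such a character.
     */
    static String stripTrailingControlChars(String s) {
        if (s == null) {
            return null;
        }
        int end = s.length();
        while (end > 0 && s.charAt(end - 1) < 0x20) {
            end--;
        }
        return s.substring(0, end);
    }
}
```

Inside the evaluate method of myUDFReverso, the incoming value would be converted to a String and passed through stripTrailingControlChars before the existing decoding logic runs. In Hive and Impala 2.1, where nothing extra is appended, the call leaves the input unchanged.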
Reference: http://www.cloudera.com/content/cloudera/en/documentation/cloudera-impala/v1/v1-2-4/Installing-and-Using-Impala/ciiu_udf.html