I have the following tuple H1 and I want to STRSPLIT its $0 into a tuple. However, I always get an error message:
DUMP H1:
(item32;item31;,1)
m = FOREACH H1 GENERATE STRSPLIT($0, ";", 50);
ERROR 1000: Error during parsing. Lexical error at line 1, column 40. Encountered: after : "\";"
Does anyone know what's wrong with the script?
There is an escaping problem in the Pig parsing routines when they encounter this semicolon.
You can use a Unicode escape sequence for the semicolon: \u003B. However, the backslash must itself be escaped and the delimiter put in a single-quoted string, i.e. '\\u003B'. Alternatively, you can rewrite the command over multiple lines, as per Neil's answer. In all cases, the delimiter must be a single-quoted string.

STRSPLIT on a semicolon is tricky. I got it to work by putting it inside of a block.
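A minimal sketch of the block form, assuming the relation H1 and the limit of 50 from the question. The alias name parts is only illustrative, and whether this sidesteps the parsing problem may depend on your Pig version:

```pig
-- H1 is the relation from the question; $0 is the chararray field to split.
m = FOREACH H1 {
    -- The delimiter is a single-quoted string, as noted above.
    parts = STRSPLIT($0, ';', 50);
    GENERATE parts;
}
DUMP m;
```

If the block form still fails in your setup, replacing ';' with the escaped form '\\u003B' in the same statement is the other workaround mentioned above.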
Funny enough, this is how I originally implemented my STRSPLIT() command. Only after trying to get it to split on a semicolon did I run into the same issue.