I have a table TabA in BigQuery which has a column ColA and the column ColA has the following structure
1038627|21514184
and the table TabA has more than a million records. I used this to split into multiple columns
SELECT ColA,FIRST(SPLIT(ColA, '/')) part1,
NTH(2, SPLIT(ColA, '/')) part2
FROM TabA
But for some reason, after certain rows the split seems to be not working properly.
We are getting records like this,
ColA part1 part2
1038627|21507470 1038627 21507470
1038627|21534857 1038627 21507470
1038627|21546455 1038627 21507470
1038627|21577167 1038627 21507470
It his happening on a random basis. Not sure where is there error.
SELECT COUNT(*) FROM TabA - returns say 1.7M records
SELECT ColA,FIRST(SPLIT(ColA, '|')) part1, NTH(2, SPLIT(ColA, '|')) part2 FROM TabA - returns 1.7M records with the wrong split
SELECT FIRST(SPLIT(ColA, '|')) part1, NTH(2, SPLIT(ColA, '|')) part2 FROM TabA - returns just 1.4L records with correct split
Don't know what exactly is happening...is it the problem with the data or the problem with the split ??
Any help would be greatly appreciated. Thanks in advance!!
To help in troubleshooting - I would recommend running same logic in BigQuery Standard SQL