Passing a list to a JavaScript UDF in Apache Pig

Posted 2019-09-02 08:55

Question:

If I have a list of items loaded in Pig, like so:

datas   = load './data.txt' using PigStorage('\t');
list    = load './frobdata.txt' using PigStorage();

And I want to pass these on to a UDF, like so:

register './enfrobinate.js' using javascript as frob;
frobbed = foreach datas generate flatten( frob.enfrobinate( list, $0 ) );

I cannot find a function prototype that works for passing a list to JavaScript, and the Pig documentation is not very clear about the datatypes available to JavaScript UDFs.

I am aware of cross in Pig, but it is not what I need. It gives me a Cartesian product, which is fine for small inputs but not once the lists get very large: the 'list' in this case has a few thousand items, and datas has many millions.
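
For reference, here is a rough sketch of the kind of prototype I am after in enfrobinate.js. It is not confirmed to work against Pig. It assumes that Pig hands a bag to JavaScript as an array of tuple objects keyed by field name, that 'list' can be made to arrive as a single bag (for example by collapsing it with GROUP ... ALL first), and that its one field is declared as item:chararray. The field name, the matching logic, and the output schema are all made up for illustration.

```javascript
// Hypothetical enfrobinate.js: an illustrative sketch, not a verified UDF.
// Assumption: `list` arrives as an array of objects such as { item: "x" },
// and `value` arrives as a plain scalar taken from $0.
function enfrobinate(list, value) {
    var out = [];
    for (var i = 0; i < list.length; i++) {
        // "item" is an assumed field name for the list relation.
        if (list[i].item === value) {
            out.push({ value: value, item: list[i].item });
        }
    }
    // Returned as an array of objects, intended to map back to a bag of tuples.
    return out;
}

// Pig JavaScript UDFs declare their return type through an outputSchema
// property on the function; this particular schema string is an assumption.
enfrobinate.outputSchema = "matches:{t:(value:chararray,item:chararray)}";
```

Called as plain JavaScript, enfrobinate([{ item: "a" }, { item: "b" }], "a") returns a one-element array. Whether Pig actually delivers the bag in this shape is exactly the part I cannot confirm.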