I am using hive and a python udf. I defined a sql file in which I added the python udf and I call it. So far so good and I can process on my query results using my python function. However, at this point of time, I have to use an external .txt file in my python udf. I uploaded that file into my cluster (the same directory as .sql and .py file) and I also added that in my .sql file using this command:
ADD FILE /home/ra/stopWords.txt;
When I call this file in my python udf as this:
file = open("/home/ra/stopWords.txt", "r")
I got several errors. I cannot figure out how to add nested files and using them in hive.
any idea?