I am getting this error but I do not know why. The error comes from this code:
a = data.mapPartitions(helper(locations))
where data is an RDD and my helper is defined as:
def helper(iterator, locations):
    for x in iterator:
        c = locations[x]
        yield c
(locations is just an array of data points.) I do not see what the problem is, but I am also not very experienced with PySpark. Can someone please tell me why I am getting 'PipelinedRDD' object is not iterable from this code?
An RDD can be iterated by using map and lambda functions. I have iterated through a PipelinedRDD using the below method
Instead of this, I replaced the iteration with the map function
Hope this helps you modify your code accordingly.
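For what it's worth, here is a minimal sketch of one way the question's call could be restructured. It is not the code from the answer above. It assumes locations is a small, plain Python list on the driver and that data holds integer indices into it. mapPartitions expects a callable that takes a single iterator, so the sketch binds locations with functools.partial instead of calling helper(locations) directly. The names per_partition and lookup are made up for illustration. If locations is actually an RDD itself, it cannot be indexed or looped over on the Python side, and it would probably need something like locations.collect() first.

```python
from functools import partial


def helper(iterator, locations):
    # Runs once per partition; `iterator` yields that partition's elements.
    for x in iterator:
        yield locations[x]


# Assumption: a small, plain Python list living on the driver (not an RDD).
locations = [10.0, 20.0, 30.0]

# mapPartitions wants a callable taking one iterator, so bind `locations` first.
per_partition = partial(helper, locations=locations)


def lookup(data):
    # Hypothetical wrapper: `data` is assumed to be an RDD of integer indices.
    return data.mapPartitions(per_partition)


# Local check without Spark: feed the callable an ordinary iterator.
local_result = list(per_partition(iter([2, 0, 1])))
print(local_result)  # [30.0, 10.0, 20.0]
```

With a real RDD this would be used as a = lookup(data), followed by a.collect() to bring the results back to the driver. A lambda such as lambda it: helper(it, locations) should work the same way as the partial.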