Folks,
We have following data and we need following output.
CUSTOMER_NAME PRODUCT_NAME PRICE OCCURANCE ID
customer1, product1, 20, 1
customer1, product2, 30, 2
customer1, product1, 25, 3
customer1, product1, 20, 1
customer1, product2, 20, 2
customer1, product2, 30, 2
First we need to average the price by occurance id.
customer1,product1,20 (AVG is 20 for occurance 1), 1
customer1,product1,25 (AVG is 25 for occurance 3) , 3
Now once again we have to average it by customername,product name (Occurance is ignored in group by)
Final Output custoemr1,product1,avg price of all occurances.
customer1,product1, 20 + 25/2 = 22.5
Basically how to do average of average in HIVE ? We are not able to write anything for this.
Hi this can be achieved using nested queries as follows :
First step : to calculate initial averages of price by occurrence_id
Second Step : calculate the avg of avgs returned in first step