I am trying to do count comparision between two table . As minus operator does not work in hive , it is not happening. Could you please give some simple way to do count comparision between two tables.
select 'Call Detail - Hive T1 to HDFS Staging - Data Compare',
case when cnt>0 then 'Fail' Else 'Pass' end
from
(select count(*) cnt from (
(select
count(*) from students1 s1)-
(select count(*) from students2 s2)
) as tbl1
) as tbl2;
It's throwing error:
FAILED: ParseException line 81:0 cannot recognize input near '(' '(' 'select' in from source
Use cross join
if you have no group by columns. In this case it will produce one row with both counts:
select s.cnt-s1.cnt diff, case when abs(s.cnt-s1.cnt) > 0 then 'Fail' Else 'Pass' end result
from
(select count(*) cnt from students1 s1) s
cross join
(select count(*) cnt from students2 s2) s1
If you will add some group by columns to compare on more detailed grain, then use FULL JOIN
on group by columns:
select s.col1 s_col1, s1.col1 s1_col1, s.cnt-s1.cnt diff, case when abs(s.cnt-s1.cnt) > 0 then 'Fail' Else 'Pass' end result
from
(select count(*) cnt, col1 from students1 s1 group by col1) s
full join
(select count(*) cnt, col1 from students2 s2 group by col1) s1
on s.col1 = s1.col1
This query will return joined rows with difference calculated and also not joined rows from both tables.
check out below query .. it's working fine locally at my system.
Let me know if it helps.
select 'Call Detail - Hive T1 to HDFS Staging - Data Compare',
case
when (sum(cnt1) - sum(cnt2)) > 0
then 'PASS'
else 'FAIL'
end as count_records
from (select count(*) as cnt1, 0 as cnt2 from students1
union all
select 0 as cnt1, count(*) as cnt2 from students1 ) tbl;