Hive command line Select query time taken incorrect

Posted 2019-07-09 11:59

Question:

I am running a Hive query as below:

Select count(*),group_name from table_name group by group_name;

Status: Running (Executing on YARN cluster with App id XXXX)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED     54         54        0        0       0       0
Reducer 2 ......   SUCCEEDED     13         13        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 24.93 s
--------------------------------------------------------------------------------
OK
Result
Time taken: 26.786 seconds, Fetched: 10 row(s)

The timings above look accurate when map reduce is involved. But when I run a simple query like the one below:

select group_name from table_name

Time taken: 0.771 seconds, Fetched: 14 row(s)

The time reported above does not look correct.

Also, any idea on how to measure query time more accurately would be greatly appreciated.

Thanks in advance.

Answer 1:

Measure the time from a shell script, using the time command.

Call your hive command like this:

time hive -e 'select group_name from table_name;'

The time command outputs three times: real, user and sys:

real        0m0.007s
user        0m0.000s
sys         0m0.005s 

real is probably the value you want. It is wall-clock time: the time from start to finish of the call. It covers all elapsed time, including time slices used by other processes and time the process spends blocked (for example, while waiting for I/O to complete).

See also this question: How do I get just real time value from 'time' command?
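
If only the real value is needed, here is a minimal sketch, assuming bash (TIMEFORMAT and the time keyword are bash features, not plain sh). The function name real_time is hypothetical and not part of Hive or any tool. Keep in mind that the figure covers the whole hive invocation, including client startup, so it will be larger than the "Time taken" line that Hive prints.

```shell
#!/usr/bin/env bash
# Sketch only: real_time is a hypothetical helper name.
# Prints the wall-clock ("real") seconds a command takes, and nothing else.
real_time() {
    # TIMEFORMAT controls what bash's `time` keyword prints;
    # %R is the elapsed real time in seconds.
    local TIMEFORMAT='%R'
    # The command's own stdout and stderr are discarded. `time` writes its
    # report to the stderr of the enclosing group, which is sent to stdout.
    { time "$@" >/dev/null 2>&1; } 2>&1
}

# Hypothetical usage with the query from the question:
# real_time hive -e 'select group_name from table_name;'
```

Because the query output is discarded, this is only suitable for timing. To keep the results as well, redirect them to a file inside the function instead of /dev/null.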



Tags: hive