OOM in tez/hive

Posted 2019-08-17 08:55

Question:

[After a few answers and comments, I asked a new question based on the knowledge gained here: Out of memory in Hive/tez with LATERAL VIEW json_tuple]

One of my queries consistently fails with the error:

ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1516602562532_3606_2_03, diagnostics=[Task failed, taskId=task_1516602562532_3606_2_03_000001, diagnostics=[TaskAttempt 0 failed, info=[Container container_e113_1516602562532_3606_01_000008 finished with diagnostics set to [Container failed, exitCode=255. Exception from container-launch.
Container id: container_e113_1516602562532_3606_01_000008
Exit code: 255
Stack trace: ExitCodeException exitCode=255: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:933)
    at org.apache.hadoop.util.Shell.run(Shell.java:844)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1123)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:237)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:317)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Container exited with a non-zero exit code 255
]], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)

The keyword here seems to be java.lang.OutOfMemoryError: Java heap space.

I looked around, but nothing I thought I understood about Tez helps:

  • yarn-site/yarn.nodemanager.resource.memory-mb is maxed out => I use all the memory I can
  • yarn-site/yarn.scheduler.maximum-allocation-mb: same as yarn.nodemanager.resource.memory-mb
  • yarn-site/yarn.scheduler.minimum-allocation-mb = 1024
  • hive-site/hive.tez.container.size = 4096 (multiple of yarn.scheduler.minimum-allocation-mb)
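
The "multiple of yarn.scheduler.minimum-allocation-mb" remark above can be checked with a little arithmetic. The sketch below assumes the scheduler rounds each request up to the next multiple of the minimum allocation, which is the usual behaviour but depends on the scheduler in use. The function name and the 28672 MB maximum are hypothetical, since the question does not state the actual maximum.

```python
def normalize_request_mb(requested_mb: int, min_alloc_mb: int, max_alloc_mb: int) -> int:
    """Round a container request up to the next multiple of min_alloc_mb.

    Assumption: the scheduler normalizes requests this way. Requests above
    max_alloc_mb are rejected here instead of being granted.
    """
    if requested_mb > max_alloc_mb:
        raise ValueError("request exceeds yarn.scheduler.maximum-allocation-mb")
    multiples = -(-requested_mb // min_alloc_mb)  # ceiling division
    return max(multiples, 1) * min_alloc_mb


# hive.tez.container.size = 4096 with a 1024 MB minimum is already aligned.
print(normalize_request_mb(4096, 1024, 28672))
# A hypothetical 4500 MB request would occupy the next multiple instead.
print(normalize_request_mb(4500, 1024, 28672))
```

Under that assumption, a container size that is not a multiple of the minimum allocation reserves more memory than requested, without the task necessarily being configured to use it.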

My query has 4 mappers: 3 finish very quickly, the 4th dies every time. Here is the Tez graphical view of the query:

From this image:

  • table contact has 150M rows, 283GB of ORC compressed data (there is one large json field, LATERAL VIEW'ed)
  • table m has 1M rows, 20MB of ORC compressed data
  • table c has 2k rows, < 1MB ORC compressed
  • table e has 800k rows, 7GB of ORC compressed
  • e is LEFT JOIN'ed with all the other tables

e and contact are partitioned, and only one partition is selected in the WHERE clause.

I thus tried to increase the number of maps:

  • tez.grouping.max-size: 650MB by default; even lowering it to tez.grouping.min-size (16MB) makes no difference
  • tez.grouping.split-count even increased to 1000 does not make a difference
  • tez.grouping.split-waves: 1.7 by default; even increased to 5 it makes no difference

If it's relevant, here are some other memory settings:

  • mapred-site/mapreduce.map.memory.mb = 1024 (Min container size)
  • mapred-site/mapreduce.reduce.memory.mb = 2048 (2 * min container size)
  • mapred-site/mapreduce.map.java.opts = 819 (0.8 * min container size)
  • mapred-site/mapreduce.reduce.java.opts = 1638 (0.8 * mapreduce.reduce.memory.mb)
  • mapred-site/yarn.app.mapreduce.am.resource.mb = 2048 (2 * min container size)
  • mapred-site/yarn.app.mapreduce.am.command-opts = 1638 (0.8 * yarn.app.mapreduce.am.resource.mb)
  • mapred-site/mapreduce.task.io.sort.mb = 409 (0.4 * min container size)
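
The ratios in the list above can be reproduced from the minimum container size alone. This is a minimal sketch of that arithmetic, not an official sizing tool: the function name is hypothetical, and the values are plain MB figures as listed in the question (the heap figures would presumably be passed to the JVM as -Xmx values).

```python
def derived_memory_settings(min_container_mb: int) -> dict:
    """Recompute the mapred-site values listed above from the min container size.

    Ratios (1x, 2x, 0.8, 0.4) are taken from the question; results are
    truncated to whole MB using integer arithmetic.
    """
    map_mb = min_container_mb
    reduce_mb = 2 * min_container_mb
    am_mb = 2 * min_container_mb
    return {
        "mapreduce.map.memory.mb": map_mb,
        "mapreduce.reduce.memory.mb": reduce_mb,
        "mapreduce.map.java.opts": map_mb * 8 // 10,        # 0.8 * map container
        "mapreduce.reduce.java.opts": reduce_mb * 8 // 10,  # 0.8 * reduce container
        "yarn.app.mapreduce.am.resource.mb": am_mb,
        "yarn.app.mapreduce.am.command-opts": am_mb * 8 // 10,
        "mapreduce.task.io.sort.mb": min_container_mb * 4 // 10,
    }


for name, value in derived_memory_settings(1024).items():
    print(f"{name} = {value}")
```

With a minimum container size of 1024 this yields the same figures as the list: 819, 1638 and 409 for the heap and sort-buffer entries.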

My understanding was that Tez can split the work into many smaller loads, taking longer but eventually completing. Am I wrong, or is there a way I have not found?

Context: HDP 2.6, 8 datanodes with 32GB RAM, query using a chunky lateral view based on JSON, run via beeline.

Answer 1:

The issue is clearly due to skewed data. I would recommend adding DISTRIBUTE BY COL to the SELECT query that reads from the source, so that the reducers receive evenly distributed data. In the example below, COL3 is a more evenly distributed column, such as an ID column:

ORIGINAL QUERY : insert overwrite table X SELECT COL1,COL2,COL3 from Y
NEW QUERY      : insert overwrite table X SELECT COL1,COL2,COL3 from Y distribute by COL3
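
To illustrate why the choice of DISTRIBUTE BY column matters, here is a small simulation of rows being assigned to reducers by key. It is only a sketch with made-up data: the key modulo the reducer count stands in for Hive's real hash function, and the function name, row counts and reducer count are hypothetical.

```python
from collections import Counter


def rows_per_reducer(keys, num_reducers):
    """Count how many rows land on each reducer.

    Simplification: an integer key modulo num_reducers replaces the
    hash that Hive actually applies to the DISTRIBUTE BY column.
    """
    load = Counter(key % num_reducers for key in keys)
    return [load.get(r, 0) for r in range(num_reducers)]


# Hypothetical skewed column: 900 of 1000 rows share the value 7.
skewed_keys = [7] * 900 + list(range(100))
# Hypothetical ID-like column: every row has a distinct value.
even_keys = list(range(1000))

print(rows_per_reducer(skewed_keys, 4))  # [25, 25, 25, 925]
print(rows_per_reducer(even_keys, 4))    # [250, 250, 250, 250]
```

In the skewed case one reducer receives almost all of the rows, while an ID-like column spreads them evenly, which is the effect the suggestion above relies on.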