Apache Phoenix + Pentaho Mondrian wrong join order

2019-06-08 17:27发布

问题:

I am using Apache Phoenix 4.5.2 from Cloudera labs distribution, which is installed over CDH 5.4 cluster. Now I'm trying to use it from Pentaho BA 5.4 server with embedded Mondrian and SAIKU Plugin installed.

I'm planning to use is as aggregator for Pentaho Mondrian ROLAP engine. So I have imported about 65 millions facts into fact table via slightly customized Pentaho Data Integration(if someone's interested, I added UPSERT to Table Output step, set Commit size to -1, set thin driver phoenix-<version>-query-server-thin-client.jar url to point to Apache Query Server and enabled its autocommit in hbase-site.xml via phoenix.connection.autoCommit), and now I have about 400 rows in time dimension table.

The problem is that Mondrian generates queries assuming that the order of table does not matter. It generates Cartesian join with FROM statement where dimension table come first and fact table comes last. If I change the order of tables, query works successfully.

This ends with Phoenix trying to cache 65 M rows table into memory, so I get org.apache.phoenix.join.MaxServerCacheSizeExceededException: Size of hash cache (104857626 bytes) exceeds the maximum allowed size (104857600 bytes).

Aside from building custom Mondrian which will place fact table first, is there any hint or index trick to force Phoenix iterate over facts table first, because for me it's no-brainer that it should iterate over 65M row table and hash join it with much smaller dimension table?

Exception stack trace:

Caused by: mondrian.olap.MondrianException: Mondrian Error:Internal error: Error while loading segment; sql=[select "DAYS"."DAY" as "c0", sum("account_transactions"."AMOUNT") as "m0" from "DAYS" as "DAYS", "account_transactions" as "account_transactions" where "account_transactions"."DATE" = "DAYS"."DATE" group by "DAYS"."DAY"]
        at mondrian.resource.MondrianResource$_Def0.ex(MondrianResource.java:972)
        at mondrian.olap.Util.newInternal(Util.java:2404)
        at mondrian.olap.Util.newError(Util.java:2420)
        at mondrian.rolap.SqlStatement.handle(SqlStatement.java:353)
        at mondrian.rolap.SqlStatement.execute(SqlStatement.java:253)
        at mondrian.rolap.RolapUtil.executeQuery(RolapUtil.java:350)
        at mondrian.rolap.agg.SegmentLoader.createExecuteSql(SegmentLoader.java:625)
        ... 8 more
Caused by: java.sql.SQLException: Encountered exception in sub plan [0] execution.
        at org.apache.phoenix.execute.HashJoinPlan.iterator(HashJoinPlan.java:171)
        at org.apache.phoenix.execute.HashJoinPlan.iterator(HashJoinPlan.java:121)
        at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:266)
        at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:256)
        at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
        at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:255)
        at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:1409)
        at org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208)
        at org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208)
        at mondrian.rolap.SqlStatement.execute(SqlStatement.java:200)
        ... 10 more
Caused by: org.apache.phoenix.join.MaxServerCacheSizeExceededException: Size of hash cache (104857626 bytes) exceeds the maximum allowed size (104857600 bytes)
        at org.apache.phoenix.join.HashCacheClient.serialize(HashCacheClient.java:109)
        at org.apache.phoenix.join.HashCacheClient.addHashCache(HashCacheClient.java:82)
        at org.apache.phoenix.execute.HashJoinPlan$HashSubPlan.execute(HashJoinPlan.java:353)
        at org.apache.phoenix.execute.HashJoinPlan$1.call(HashJoinPlan.java:145)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:183)
        ... 3 more

回答1:

Hash Join vs. Sort-Merge Join

Basic hash join usually outperforms other types of join algorithms, but it has its limitations too, the most significant of which is the assumption that one of the relations must be small enough to fit into memory. Thus Phoenix now has both hash join and sort-merge join implemented to facilitate fast join operations as well as join between two large tables.

Phoenix currently uses the hash join algorithm whenever possible since it is usually much faster. However we have the hint “USE_SORT_MERGE_JOIN” for forcing the usage of sort-merge join in a query. The choice between these two join algorithms, together with detecting the smaller relation for hash join, will be done automatically in future under the guidance provided by table statistics.

You can add the USE_SORT_MERGE_JOIN hint in the query so that Phoenix does not try to fit the relation in memory.

ie. SELECT /*+ USE_SORT_MERGE_JOIN*/ ...


Alternatively, you can configure a larger max cache size if you are confident that your relation will fit in memory.

https://phoenix.apache.org/tuning.html

phoenix.query.maxServerCacheBytes Default 100MB. 104857600

Maximum size (in bytes) of a single sub-query result (usually the filtered result of a table) before compression and conversion to a hash map. Attempting to hash an intermediate sub-query result of a size bigger than this setting will result in a MaxServerCacheSizeExceededException.