I am using Apache Calcite to implement a distributed OLAP system whose data source is an RDBMS. I want to push down the project/filter/aggregation nodes in the RelNode tree into MyTableScan extends TableScan. Inside MyTableScan, a RelBuilder collects the pushed-down RelNode. Finally, the RelBuilder is used to generate the query sent to the source database. At the same time, the project/filter/aggregation nodes in the original RelNode tree should be removed or modified.
As far as I know, Calcite does not support this feature:
Current limitations: The JDBC adapter currently only pushes down table scan operations; all other processing (filtering, joins, aggregations and so forth) occurs within Calcite. Our goal is to push down as much processing as possible to the source system, translating syntax, data types and built-in functions as we go. If a Calcite query is based on tables from a single JDBC database, in principle the whole query should go to that database. If tables are from multiple JDBC sources, or a mixture of JDBC and non-JDBC, Calcite will use the most efficient distributed query approach that it can.
In my opinion, a RelOptRule may be a good choice. Unfortunately, when I create a new RelOptRule, I cannot easily find the parent node in order to remove a node.
Is RelOptRule a good choice? Does anyone have a good idea for implementing this feature?
Thanks.
Creating a new RelOptRule is the way to go. Note that you shouldn't try to directly remove any nodes inside a rule. Instead, you match a subtree that contains the nodes you want to replace (for example, a Filter on top of a TableScan), and then replace that entire subtree with an equivalent node which pushes down the filter.
This is normally handled by creating a subclass of the relevant operation which conforms to the calling convention of the particular adapter. For example, in the Cassandra adapter, there is a CassandraFilterRule which matches a LogicalFilter on top of a CassandraTableScan. The convert function then constructs a CassandraFilter instance. The CassandraFilter instance sets up the necessary information so that when the query is actually issued, the filter is available.
Browsing some of the code for the Cassandra, MongoDB, or Elasticsearch adapters may be helpful, as they are on the simpler side. I would also suggest bringing this to the mailing list, as you'll probably get more detailed advice there.
I have created some RelOptRule implementations to push down the Project/Filter/Aggregate RelNodes above a TableScan. They may be helpful to others.
A RelOptRule defines a pattern that matches subtrees in the whole RelNode tree. When a subtree matches, the planner calls the rule's onMatch method.
In the onMatch method, we can create a new RelNode and call the transformTo method to replace the matched subtree.
For example:
The PushDownFilter rule is as follows:
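A minimal sketch of such a rule, written in plain Java without the Calcite dependency so that it can be run on its own. All types here (Node, TableScanNode, FilterNode, ProjectNode, RuleCall, Rule, MiniPlanner) are hypothetical stand-ins for RelNode, RelOptRuleCall, RelOptRule and the planner; only the names onMatch and transformTo mirror the Calcite methods mentioned above. It is restricted to single-input plans and string conditions for brevity.

```java
// Hypothetical stand-in for RelNode; single-input plans only.
abstract class Node {
    final Node input;                       // null for leaf nodes
    Node(Node input) { this.input = input; }
    abstract Node withInput(Node newInput); // copy of this node with a new child
}

final class TableScanNode extends Node {
    final String table;
    final String pushedFilter;              // null when nothing has been pushed down
    TableScanNode(String table, String pushedFilter) {
        super(null);
        this.table = table;
        this.pushedFilter = pushedFilter;
    }
    Node withInput(Node newInput) { return this; }
    @Override public String toString() {
        return pushedFilter == null
            ? "TableScan(" + table + ")"
            : "TableScan(" + table + ", filter=" + pushedFilter + ")";
    }
}

final class FilterNode extends Node {
    final String condition;
    FilterNode(String condition, Node input) { super(input); this.condition = condition; }
    Node withInput(Node newInput) { return new FilterNode(condition, newInput); }
    @Override public String toString() { return "Filter(" + condition + ") -> " + input; }
}

final class ProjectNode extends Node {
    final String fields;
    ProjectNode(String fields, Node input) { super(input); this.fields = fields; }
    Node withInput(Node newInput) { return new ProjectNode(fields, newInput); }
    @Override public String toString() { return "Project(" + fields + ") -> " + input; }
}

// Holds the matched subtree and collects its replacement.
final class RuleCall {
    final Node matched;
    Node replacement;
    RuleCall(Node matched) { this.matched = matched; }
    void transformTo(Node node) { replacement = node; }
}

interface Rule {
    boolean matches(Node node);
    void onMatch(RuleCall call);
}

// Matches Filter -> TableScan and replaces it with a TableScan that carries the condition.
final class PushDownFilter implements Rule {
    public boolean matches(Node node) {
        return node instanceof FilterNode && node.input instanceof TableScanNode;
    }
    public void onMatch(RuleCall call) {
        FilterNode filter = (FilterNode) call.matched;
        TableScanNode scan = (TableScanNode) filter.input;
        // Keep any condition that was already pushed down instead of overwriting it.
        String condition = scan.pushedFilter == null
            ? filter.condition
            : "(" + scan.pushedFilter + ") AND (" + filter.condition + ")";
        call.transformTo(new TableScanNode(scan.table, condition));
    }
}

final class MiniPlanner {
    // Rewrites bottom-up. The planner, not the rule, re-attaches the replacement to its parent.
    static Node apply(Rule rule, Node node) {
        if (node.input != null) {
            node = node.withInput(apply(rule, node.input));
        }
        if (rule.matches(node)) {
            RuleCall call = new RuleCall(node);
            rule.onMatch(call);
            if (call.replacement != null) {
                return call.replacement;
            }
        }
        return node;
    }
}
```

In this model the rule never needs to know the parent of the matched subtree: it only hands a replacement to transformTo, and the planner wires it back into the tree. Applying the rule to Project -> Filter -> TableScan leaves the Project in place on top of a scan that now holds the filter.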
This rule matches the Filter->TableScan subtree and then calls the onMatch method. The method only calls transformTo with the tableScan. As a result, Filter->TableScan is replaced by TableScan. The whole RelNode is as follows:
Note that the RelDataType of the new RelNode must be equal to that of the matched subtree.
Calcite provides some rules that can be used directly, for example FilterJoinRule, FilterTableScanRule, and so on.