I'm trying to use JOOQ for quering Hive.
Hive SQL dialect is pretty clode to MySQL dialect.
Right now I've met these problems:
- Hive supports LIMIT N, it doesn't support LIMIT N OFFSET K. Dummy
solution - override select.limit(limit);
What are best practices resolving such problems in JOOQ?
Unfortunately, extending jOOQ to thoroughly support a new SQL dialect isn't very straightforward. jOOQ's API has grown extensive over time, supporting a great set of standard and vendor-specific SQL syntax variants. While the Apache Hive dialect may appear similar to MySQL, there are probably lots of subtle differences that would need to be implemented in jOOQ's internals. The different implementation of the LIMIT .. OFFSET clause just being one issue. That said, it is generally not a good idea to use jOOQ with an "unknown" or "unsupported" dialect.
Solution: In the short run
In the short run, you will probably have to patch jOOQ's rendered SQL. The best technique for this is to use an ExecuteListener as documented here:
- http://www.jooq.org/doc/latest/manual/sql-execution/execute-listeners/
- http://www.jooq.org/javadoc/latest/org/jooq/ExecuteListener.html
Upon receiving a "renderEnd()" event, you will be able to access the rendered SQL and modify it using regular expressions or whatever technique you may prefer.
Solution: In the long run
In the long run, there may be a better solution if / when #2337 is implemented (but we probably won't implement that)
Here is the dirtiest solution :) JOOQ user group didn't answer unfortunately :(
public class CountRatingQueryBuilder {
private static final String SCORING_TABLE_NAME = "web_resource_rating";
private final Connection connection;
private final ScoringMetadata scoringMetadata;
private final SelectSelectStep select;
private final Factory create;
public CountRatingQueryBuilder(Connection connection, ScoringMetadata scoringMetadata){
this.connection = connection;
this.scoringMetadata = scoringMetadata;
create = new Factory(this.connection, SQLDialect.MYSQL);
select = create.select();
withSelectFieldsClause();
}
public CountRatingQueryBuilder withLimit(int limit){
select.limit(limit);
return this;
}
public CountRatingQueryBuilder withRegionId(Integer regionId){
select.where(REGION_ID.field().equal(regionId));
return this;
}
public CountRatingQueryBuilder withResourceTypeId(int resourceTypeId){
select.where(RESOURCE_TYPE_ID.field().equal(resourceTypeId));
return this;
}
public CountRatingQueryBuilder withRequestTimeBetween(long beginTimestamp, long endTimestamp){
select.where(REQUEST_TIME.field().between(beginTimestamp, endTimestamp));
return this;
}
public CountRatingQueryBuilder withResourceId(int resourceId){
select.where(RESOURCE_ID.field().equal(resourceId));
return this;
}
protected void withGroupByClause(){
select.groupBy(REGION_ID.field());
select.groupBy(RESOURCE_TYPE_ID.field());
select.groupBy(RESOURCE_ID.field());
select.groupBy(CONTENT_ID.field());
}
protected void withSelectFieldsClause(){
select.select(REGION_ID.field());
select.select(RESOURCE_TYPE_ID.field());
select.select(CONTENT_ID.field());
select.select(RESOURCE_ID.field());
select.select(Factory.count(HIT_COUNT.field()).as(SUM_HIT_COUNT.fieldName()));
}
protected void withFromClause(){
select.from(SCORING_TABLE_NAME);
}
protected void withOrderByClause(){
select.orderBy(SUM_HIT_COUNT.field().desc());
}
public String build(){
withGroupByClause();
withOrderByClause();
withFromClause();
return select.getSQL().replace("offset ?","");//dirty hack for MySQL dialect. TODO: we can try to implement our own SQL dialect for Hive :)
}
public List<ResultRow> buildAndFetch(){
String sqlWithPlaceholders = build();
List<ResultRow> scoringResults = new ArrayList<ResultRow>(100);
List<Record> recordResults = create.fetch(sqlWithPlaceholders, ArrayUtils.subarray(select.getBindValues().toArray(new Object[select.getBindValues().size()]),0, select.getBindValues().size()-1));//select.fetch();
for(Record record : recordResults){
ResultRowBuilder resultRowBuilder = ResultRowBuilder.create();
resultRowBuilder.withContentType(scoringMetadata.getResourceType(record.getValue(RESOURCE_TYPE_ID.fieldName(), Integer.class)));
resultRowBuilder.withHitCount(record.getValue(SUM_HIT_COUNT.fieldName(), Long.class));
resultRowBuilder.withUrl(record.getValue(CONTENT_ID.fieldName(), String.class));
scoringResults.add(resultRowBuilder.build());
}
return scoringResults;
}
}