Cassandra CQL3 composite key not written by Hadoop

Posted 2020-07-29 23:25

I'm using Cassandra 1.2.8 and have several Hadoop MapReduce jobs that read rows from some CQL3 tables and write the results back to other CQL3 tables.

If an output CQL3 table has a composite partition key, the reducer does not write the values of the partition key columns: a select query in cqlsh shows those columns as empty. If the primary key is not composite, everything works correctly.

Example of the output CQL3 table with composite key:

CREATE TABLE events_by_type_with_source (
    event_type_id ASCII,
    period ASCII,
    date TIMESTAMP,
    source_name ASCII,
    events_number COUNTER,
    PRIMARY KEY((event_type_id, period), date, source_name)
);

My output query is: UPDATE events_by_type_with_source SET events_number = events_number + ?

My Reducer function looks like this:

public void reduce(BytesWritable key, Iterable<BytesWritable> values, Context context)
        throws IOException, InterruptedException {
    ...
    // Primary key columns (partition and clustering), keyed by column name
    Map<String, ByteBuffer> keys = new HashMap<>();
    ...
    keys.put(COLUMN_EVENT_TYPE_ID, eventTypeIdByteBuffer);
    keys.put(COLUMN_SOURCE_NAME, sourceNameByteBuffer);
    keys.put(COLUMN_DATE, dateByteBuffer);
    keys.put(COLUMN_PERIOD, periodByteBuffer);
    ...
    // Bound value for the single "?" in the output query
    context.write(keys, Arrays.asList(countByteBuffer));
}
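
For reference, here is a minimal sketch of how the four key buffers and the counter increment passed to context.write might be built using only the JDK. The class and helper names (CompositeKeySketch, ascii, timestamp, counterDelta, buildKeys) are hypothetical, and the encodings are assumptions: ASCII columns as raw ASCII bytes, TIMESTAMP as an 8-byte count of milliseconds since the epoch, and the counter increment as an 8-byte long. The column names come from the table definition above. This only illustrates the shape of the data handed to the output format; it does not work around the missing partition key values.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class CompositeKeySketch {

    // Assumption: a CQL ASCII value is serialized as its raw ASCII bytes.
    static ByteBuffer ascii(String value) {
        return ByteBuffer.wrap(value.getBytes(StandardCharsets.US_ASCII));
    }

    // Assumption: a CQL TIMESTAMP is serialized as an 8-byte big-endian
    // count of milliseconds since the Unix epoch.
    static ByteBuffer timestamp(long epochMillis) {
        ByteBuffer buffer = ByteBuffer.allocate(8);
        buffer.putLong(0, epochMillis); // absolute put: position stays at 0
        return buffer;
    }

    // Assumption: a counter increment is bound as an 8-byte long.
    static ByteBuffer counterDelta(long delta) {
        ByteBuffer buffer = ByteBuffer.allocate(8);
        buffer.putLong(0, delta);
        return buffer;
    }

    // Builds the column-name -> value map for all four primary key columns:
    // the partition key (event_type_id, period) and the clustering
    // columns (date, source_name).
    static Map<String, ByteBuffer> buildKeys(String eventTypeId, String period,
                                             long dateMillis, String sourceName) {
        Map<String, ByteBuffer> keys = new LinkedHashMap<>();
        keys.put("event_type_id", ascii(eventTypeId));
        keys.put("period", ascii(period));
        keys.put("date", timestamp(dateMillis));
        keys.put("source_name", ascii(sourceName));
        return keys;
    }
}
```

In the reducer, the map returned by buildKeys would take the place of keys, and Arrays.asList(counterDelta(count)) would supply the value for the single ? in the UPDATE statement.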

The result is:

cqlsh:keyspace1> select * from dd_events_by_type_with_source ;

 event_type_id | period | date                     | source_name | events_number
---------------+--------+--------------------------+-------------+---------------
               |        | 2013-08-01 00:00:00+0000 |           A |            24
               |        | 2013-08-26 00:00:00+0000 |           A |            24
               |        | 2013-08-27 00:00:00+0000 |           A |            24
               |        | 2013-08-27 08:00:00+0000 |           A |            24

As you can see, the event_type_id and period fields are empty, even though I put valid, non-empty ASCII strings into them in the reducer.

Any idea how to fix this?

1 answer
再贱就再见
#2 · 2020-07-30 00:23

This is a known issue in Cassandra versions before 1.2.10: https://issues.apache.org/jira/browse/CASSANDRA-5949

Based on the previous release schedule, I would expect 1.2.10 to be available near the end of September 2013. This issue does not appear to exist in Cassandra 2.0.
