Pentaho Frame size (17727647) larger than max length (16384000)!

Posted 2019-10-21 03:03

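In Pentaho, when I run a Cassandra Input step that retrieves around 50,000 rows, I get this exception:

Is there a way to control the size of the query result in Pentaho? Or is there a way to stream the query result instead of fetching it all in bulk?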

2014/10/09 15:14:09 - Cassandra Input.0 - ERROR (version 5.1.0.0, build 1 from 2014-06-19_19-02-57 by buildguy) : Unexpected error
2014/10/09 15:14:09 - Cassandra Input.0 - ERROR (version 5.1.0.0, build 1 from 2014-06-19_19-02-57 by buildguy) : org.pentaho.di.core.exception.KettleException: 
2014/10/09 15:14:09 - Cassandra Input.0 - Frame size (17727647) larger than max length (16384000)!
2014/10/09 15:14:09 - Cassandra Input.0 - Frame size (17727647) larger than max length (16384000)!
2014/10/09 15:14:09 - Cassandra Input.0 - 
2014/10/09 15:14:09 - Cassandra Input.0 -   at org.pentaho.di.trans.steps.cassandrainput.CassandraInput.initQuery(CassandraInput.java:355)
2014/10/09 15:14:09 - Cassandra Input.0 -   at org.pentaho.di.trans.steps.cassandrainput.CassandraInput.processRow(CassandraInput.java:234)
2014/10/09 15:14:09 - Cassandra Input.0 -   at org.pentaho.di.trans.step.RunThread.run(RunThread.java:62)
2014/10/09 15:14:09 - Cassandra Input.0 -   at java.lang.Thread.run(Unknown Source)
2014/10/09 15:14:09 - Cassandra Input.0 - Caused by: org.apache.thrift.transport.TTransportException: Frame size (17727647) larger than max length (16384000)!
2014/10/09 15:14:09 - Cassandra Input.0 -   at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:137)
2014/10/09 15:14:09 - Cassandra Input.0 -   at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
2014/10/09 15:14:09 - Cassandra Input.0 -   at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
2014/10/09 15:14:09 - Cassandra Input.0 -   at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:362)
2014/10/09 15:14:09 - Cassandra Input.0 -   at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:284)
2014/10/09 15:14:09 - Cassandra Input.0 -   at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:191)
2014/10/09 15:14:09 - Cassandra Input.0 -   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
2014/10/09 15:14:09 - Cassandra Input.0 -   at org.apache.cassandra.thrift.Cassandra$Client.recv_execute_cql_query(Cassandra.java:1656)
2014/10/09 15:14:09 - Cassandra Input.0 -   at org.apache.cassandra.thrift.Cassandra$Client.execute_cql_query(Cassandra.java:1642)
2014/10/09 15:14:09 - Cassandra Input.0 -   at org.pentaho.cassandra.legacy.LegacyCQLRowHandler.newRowQuery(LegacyCQLRowHandler.java:289)
2014/10/09 15:14:09 - Cassandra Input.0 -   at org.pentaho.di.trans.steps.cassandrainput.CassandraInput.initQuery(CassandraInput.java:333)
2014/10/09 15:14:09 - Cassandra Input.0 -   ... 3 more
2014/10/09 15:14:09 - Cassandra Input.0 - Finished processing (I=0, O=0, R=0, W=0, U=0, E=1)
2014/10/09 15:14:09 - all customer data - Transformation detected one or more steps with errors.
2014/10/09 15:14:09 - all customer data - Transformation is killing the other steps!

Answer 1:

org.apache.thrift.transport.TTransportException: 
  Frame size (17727647) larger than max length (16384000)!

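A limit is enforced on how large frames (Thrift messages) can be, to avoid performance degradation. You can tweak it by modifying a few settings. The important thing to note here is that you need to set the size on both the client side and the server side.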
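
To make the error concrete, here is a minimal sketch of the kind of check a framed transport performs: each frame starts with a 4-byte big-endian length, and the reader refuses any frame longer than its configured maximum. The names `FrameLimitSketch`, `readFrameSize` and `header` are made up for illustration and this is not Thrift's actual code; only the two numbers and the message text are taken from the log above.

```java
import java.nio.ByteBuffer;

// Illustrative only: mimics the length check of a framed transport.
final class FrameLimitSketch {

    // Decodes the 4-byte length prefix and rejects frames above maxLength.
    static int readFrameSize(byte[] header, int maxLength) {
        int size = ByteBuffer.wrap(header).getInt(); // ByteBuffer is big-endian by default
        if (size > maxLength) {
            throw new IllegalStateException(
                "Frame size (" + size + ") larger than max length (" + maxLength + ")!");
        }
        return size;
    }

    // Builds the 4-byte length prefix a sender would put in front of a frame.
    static byte[] header(int size) {
        return ByteBuffer.allocate(4).putInt(size).array();
    }

    public static void main(String[] args) {
        int maxLength = 16_384_000; // client-side limit reported in the log

        // A small frame passes the check.
        System.out.println(readFrameSize(header(1_000), maxLength));

        // The frame from the log is rejected before its payload is read.
        try {
            readFrameSize(header(17_727_647), maxLength);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Since the check runs on whichever side reads the frame, raising the limit only on the server would presumably still leave the client rejecting the response, which is why both sides matter.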

Server side, in cassandra.yaml:

# Frame size for thrift (maximum field length).
# default is 15mb, you'll have to increase this to at-least 18.
thrift_framed_transport_size_in_mb: 18 

# The max length of a thrift message, including all fields and
# internal thrift overhead.
# default is 16, try to keep it to thrift_framed_transport_size_in_mb + 1
thrift_max_message_length_in_mb: 19

How you set the client-side limit depends on the driver you are using.
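
As a quick sanity check on the value 18, the arithmetic can be sketched as follows. This assumes the `_in_mb` setting is interpreted as multiples of 1024 * 1024 bytes, which I have not verified against the Cassandra source; `FrameSizeMath` and `minimumSettingMb` are hypothetical names.

```java
// Illustrative arithmetic only; the MB-to-bytes factor is an assumption.
final class FrameSizeMath {

    static final long BYTES_PER_MB = 1024L * 1024L;

    // Smallest whole number of MB that can hold a frame of the given size.
    static long minimumSettingMb(long frameBytes) {
        return (frameBytes + BYTES_PER_MB - 1) / BYTES_PER_MB; // ceiling division
    }

    public static void main(String[] args) {
        long frameBytes = 17_727_647L; // frame size from the exception
        long clientLimit = 16_384_000L; // max length from the exception

        System.out.println(frameBytes > clientLimit);     // prints true
        System.out.println(minimumSettingMb(frameBytes)); // prints 17
    }
}
```

Under that assumption 17 would be the bare minimum for this particular result, so 18 leaves a little headroom; a larger query result would need a larger value again.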



Answer 2:

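I resolved this by using PDI 5.2, which has a property called MAX_LENGTH on the Cassandra Input step. Setting this property to a higher value, such as 1 GB, resolved the issue.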



Answer 3:

You can try the following approach on the server side:

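// Imports assume the Thrift 0.9.x package layout.
import org.apache.thrift.TProcessor;
import org.apache.thrift.protocol.TCompactProtocol;
import org.apache.thrift.server.TNonblockingServer;
import org.apache.thrift.server.TServer;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TNonblockingServerSocket;

TNonblockingServerSocket tnbSocketTransport = new TNonblockingServerSocket(listenPort);
TNonblockingServer.Args tnbArgs = new TNonblockingServer.Args(tnbSocketTransport);

// The max frame length is configured as 1 GB here; the default is 16 MB.
tnbArgs.transportFactory(new TFramedTransport.Factory(1024 * 1024 * 1024));
tnbArgs.protocolFactory(new TCompactProtocol.Factory());
TProcessor processor = new UcsInterfaceThrift.Processor<UcsInterfaceHandler>(ucsInterfaceHandler);
tnbArgs.processor(processor);
TServer server = new TNonblockingServer(tnbArgs);
server.serve();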


Answer 4:

Well, this is what worked for me.

Cassandra version: [cqlsh 5.0.1 | Cassandra 2.2.1 | CQL spec 3.3.0 | Native protocol v4]

Pentaho PDI version: PDI-CE-5.4.0.1-130

Settings changed in cassandra.yaml:

# Whether to start the thrift rpc server.
start_rpc: true

# Frame size for thrift (maximum message length).
thrift_framed_transport_size_in_mb: 35

Cassandra Output step settings changed to:

Port: 9160
"Use CQL Version 3": checked

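After changing start_rpc and restarting Cassandra, it can help to confirm that something is actually listening on port 9160 before re-running the transformation. Below is a minimal sketch using only the JDK; `ThriftPortCheck` and `isListening` are hypothetical names, and the check only proves that a TCP connection can be opened, not that the listener speaks Thrift or that the frame size setting took effect.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Illustrative only: plain TCP reachability check for the Thrift RPC port.
final class ThriftPortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host not resolvable.
            return false;
        }
    }

    public static void main(String[] args) {
        // The host defaults to localhost; pass the Cassandra node as the first argument.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println("Port 9160 reachable: " + isListening(host, 9160, 2000));
    }
}
```

If this prints false, the step would likely fail to connect at all, so the port and start_rpc settings are worth rechecking before looking at frame sizes.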

Source: Pentaho Frame size (17727647) larger than max length (16384000)!