How to connect to remote HBase in Java?

Posted 2019-01-30 14:55

I have a standalone HBase server. This is my hbase-site.xml:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///hbase_data</value>
  </property>
</configuration>

I am trying to write a Java program to manipulate the data in HBase.

If I run the program on the HBase server itself, it works fine, but I don't know how to configure it for remote access.

Configuration config = HBaseConfiguration.create();
HTable table = new HTable(config, "test");
Scan s = new Scan();

I have tried setting the IP address and port, but it doesn't work:

config.set("hbase.master", "146.169.35.28:60000");

Can anyone tell me how to do it?

Thanks!

Tags: hbase
5 answers
地球回转人心会变
Reply #2 · 2019-01-30 15:07

As far as I know, the normal Java client, where you just declare the configuration and connect as described in the previous answers, does not work against a remote HBase server.

I tried that approach and never got it to work. Instead I used the Thrift API to connect to the remote server.

This link is the best example of a Thrift API Java client, and it works; I am using it myself. Before using it, go through the code carefully and omit the parts you don't need. Here is the sample code that works for me.

import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Map;

import org.apache.hadoop.hbase.thrift.generated.AlreadyExists;
import org.apache.hadoop.hbase.thrift.generated.ColumnDescriptor;
import org.apache.hadoop.hbase.thrift.generated.Hbase;
import org.apache.hadoop.hbase.thrift.generated.Mutation;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class ThriftClient
{
    public static void main(String[] args) throws Exception
    {
        // Placeholder: the host running the HBase Thrift server
        String hostname = "YOUR_THRIFT_SERVER_HOST";
        // Port the HBase Thrift server listens on
        int port = 9090;

        // Connection to HBase
        TTransport transport = new TSocket(hostname, port);
        TProtocol protocol = new TBinaryProtocol(transport, true, true);
        Hbase.Client client = new Hbase.Client(protocol);

        transport.open();
        try
        {
            byte[] tablename = bytes("YOUR TABLE NAME");

            // Create the table with a single column family
            ArrayList<ColumnDescriptor> columns = new ArrayList<ColumnDescriptor>();
            ColumnDescriptor col = new ColumnDescriptor();
            col.name = ByteBuffer.wrap(bytes("YOUR_COLUMN_FAMILY_NAME"));
            col.maxVersions = 10;
            columns.add(col);

            System.out.println("creating table: " + utf8(tablename));
            try
            {
                client.createTable(ByteBuffer.wrap(tablename), columns);
            }
            catch (AlreadyExists ae)
            {
                System.out.println("WARN: " + ae.message);
            }

            Map<ByteBuffer, ByteBuffer> dummyAttributes = null;
            boolean writeToWal = false;

            // Write one cell into one row
            byte[] row = bytes("YOUR ROW NAME");
            ArrayList<Mutation> mutations = new ArrayList<Mutation>();
            mutations.add(new Mutation(false, ByteBuffer.wrap(bytes("YOUR_COLUMN_FAMILY_NAME:YOUR_COLUMN_NAME")), ByteBuffer.wrap(bytes("YOUR_ROW_VALUE")), writeToWal));
            client.mutateRow(ByteBuffer.wrap(tablename), ByteBuffer.wrap(row), mutations, dummyAttributes);
        }
        finally
        {
            // Always release the socket, even if a call above fails
            transport.close();
        }
    }

    // Helper to translate byte[]'s to UTF-8 strings
    private static String utf8(byte[] buf) {
        try {
            return StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(buf)).toString();
        } catch (CharacterCodingException e) {
            return "[INVALID UTF-8]";
        }
    }

    // Helper to translate strings to UTF-8 bytes
    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
Bombasti
Reply #3 · 2019-01-30 15:14

hbase.master is deprecated. Clients use ZooKeeper to look up the current hostname and port of their HBase servers.

// Deprecated setting; clients locate the servers through ZooKeeper
config.set("hbase.master", "146.169.35.28:60000");

Hadoop and HBase are very sensitive to DNS and /etc/hosts configuration. Make sure your hostname doesn't resolve to 127.0.0.1; otherwise many services will start listening on localhost only. Try not to use IP addresses anywhere in the settings.

My /etc/hosts:

192.168.2.3     cloudera-vm     # Added by NetworkManager
127.0.0.1       localhost.localdomain   localhost
127.0.1.1       cloudera-vm-local localhost
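
The hostname check described above can be sketched in plain Java. This is only a minimal illustration using the JDK's `InetAddress`; the class name `HostnameCheck` and the method `resolvesToNonLoopback` are made up for this example and are not part of HBase.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not an HBase API.
final class HostnameCheck {

    /**
     * Returns true if the name resolves and none of the resolved
     * addresses is a loopback address (127.x.x.x or ::1).
     */
    static boolean resolvesToNonLoopback(String hostname) {
        try {
            InetAddress[] addresses = InetAddress.getAllByName(hostname);
            for (InetAddress address : addresses) {
                if (address.isLoopbackAddress()) {
                    return false;
                }
            }
            return addresses.length > 0;
        } catch (UnknownHostException e) {
            // A name that does not resolve at all is also a problem.
            return false;
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        // Check the name given on the command line, or this machine's own name.
        String name = args.length > 0 ? args[0] : InetAddress.getLocalHost().getHostName();
        System.out.println(name + ": "
                + (resolvesToNonLoopback(name) ? "ok" : "loopback or unresolvable"));
    }
}
```

Running it on the HBase server with no arguments checks the server's own hostname; running it on the client with the server's hostname as the argument checks what the client sees.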

/etc/hbase/hbase-site.xml should set hbase.cluster.distributed to false (since you are using this for testing only):

<property>
  <name>hbase.cluster.distributed</name>
  <value>false</value>
</property>

/etc/zookeeper/zoo.cfg

# the port at which the clients will connect
clientPort=2181
server.0=cloudera-vm:2888:3888
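
Since clients reach HBase through ZooKeeper, it can help to confirm from the client machine that the ZooKeeper client port is reachable at all. The following is a rough sketch using only the JDK; `PortProbe` and `canConnect` are hypothetical names for this example, and the host `cloudera-vm` and port 2181 are simply the values from the zoo.cfg above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not an HBase or ZooKeeper API.
final class PortProbe {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers unresolvable hosts, refused connections and timeouts.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "cloudera-vm";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2181;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 3000));
    }
}
```

A successful TCP connection only shows that something is listening on that port; it does not prove that the listener is ZooKeeper or that HBase is healthy.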

List of my Java processes:

root@cloudera-vm:~# jps
1643 TaskTracker
1305 JobTracker
1544 SecondaryNameNode
2037 Bootstrap
9622 DataNode
10144 Jps
9468 NameNode
1948 RunJar
9746 HMaster
萌系小妹纸
Reply #4 · 2019-01-30 15:22

In my case, after a lot of fiddling with /etc/hosts, I found the following line in the log file "hbase-bgi-master-servername.log":

"2017-11-21 19:56:32,999 INFO [RS:0;servername:45553] regionserver.HRegionServer: Serving as servername.local.lan,45553,1511290584538, RpcServer on servername.local.lan/172.0.1.2:45553, sessionid=0x15fdff039790002"

Always make sure that the full host name ("servername.local.lan" in my case) actually resolves to the server's IP on both the client and the server.

成全新的幸福
Reply #5 · 2019-01-30 15:28

Here's a snippet from one of our systems that creates the HTable we use to connect to HBase:

Configuration hConf = HBaseConfiguration.create(conf);
hConf.set(Constants.HBASE_CONFIGURATION_ZOOKEEPER_QUORUM, hbaseZookeeperQuorum);
hConf.setInt(Constants.HBASE_CONFIGURATION_ZOOKEEPER_CLIENTPORT, hbaseZookeeperClientPort);

HTable hTable = new HTable(hConf, tableName);

HTH

EDIT: Example Values:

public static final String HBASE_CONFIGURATION_ZOOKEEPER_QUORUM = "hbase.zookeeper.quorum";
public static final String HBASE_CONFIGURATION_ZOOKEEPER_CLIENTPORT = "hbase.zookeeper.property.clientPort";
...
hbaseZookeeperQuorum = "PDHadoop1.corp.CompanyName.com,PDHadoop2.corp.CompanyName.com";
hbaseZookeeperClientPort = 10000;
tableName = "HBaseTableName";
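
To make the relationship between the two settings concrete: the quorum value is a comma-separated list of hosts, and the client port is applied to each of them. The sketch below only illustrates that pairing with plain JDK code; `QuorumEndpoints` and `expand` are hypothetical names for this example and are not how the HBase client is actually invoked.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not an HBase API.
final class QuorumEndpoints {

    /** Pairs every host in a comma-separated quorum string with the client port. */
    static List<String> expand(String quorum, int clientPort) {
        List<String> endpoints = new ArrayList<>();
        for (String host : quorum.split(",")) {
            String trimmed = host.trim();
            if (!trimmed.isEmpty()) {
                endpoints.add(trimmed + ":" + clientPort);
            }
        }
        return endpoints;
    }

    public static void main(String[] args) {
        String quorum = "PDHadoop1.corp.CompanyName.com,PDHadoop2.corp.CompanyName.com";
        // Prints one host:port line per quorum member.
        for (String endpoint : expand(quorum, 10000)) {
            System.out.println(endpoint);
        }
    }
}
```

Each of the resulting host:port pairs must be resolvable and reachable from the machine running the client.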
别忘想泡老子
Reply #6 · 2019-01-30 15:30

In a nutshell this is what I use:

    Configuration hBaseConfig = HBaseConfiguration.create();
    hBaseConfig.setInt("timeout", 120000);
    hBaseConfig.set("hbase.master", "*" + hbaseHost + ":9000*");
    hBaseConfig.set("hbase.zookeeper.quorum", zookeeperHost);
    hBaseConfig.set("hbase.zookeeper.property.clientPort", "2181");

For hbaseHost and zookeeperHost I simply pass the IP address of a cluster machine that has ZooKeeper installed. Of course you can parameterize the port numbers too. I am not 100% sure this is the best way to ensure a successful connection, but so far it has worked without any issues.
