How to connect to a Kerberos-secured Apache Phoeni

Posted 2019-06-04 19:45

Question:

I have recently spent several weeks trying to get WildFly to successfully connect to a Kerberized Apache Phoenix data source. There is a surprisingly limited amount of documentation on how to do this, but now that I have cracked it, I'm sharing.

Environment:

  • WildFly 9+. An equivalent JBoss version should also work, but I have not tested one. WildFly 8 does not contain the required org.jboss.security.negotiation.KerberosLoginModule class (you can work around this; see "Kerberos sql server datasource in Wildfly 8.2"). I used WildFly 10.1.0.Final in a standalone deployment.
  • Apache Phoenix 4.2.0.2.2.4.10. I have not tested any other version.
  • Kerberos v5. My KDC is running on Windows Active Directory, but this should not make a noticeable difference.
  • My Hadoop environment is a Hortonworks distribution, maintained by Ambari. Ambari ensures that all of the configuration files and Kerberos implementation settings are correct.

Answer 1:

Firstly, you'll want to add a system property to WildFly's standalone.xml to specify the location of the Kerberos configuration file:

...
</extensions>

<system-properties>
    <property name="java.security.krb5.conf" value="/path/to/krb5.conf"/>
</system-properties>
...

I'm not going to go into the format of the krb5.conf file here, as it depends on your own Kerberos implementation. What is important is that it contains the default realm and the network location of the KDC. On Linux you can normally find it at /etc/krb5.conf or /etc/security/krb5.conf. If you're running WildFly on Windows, make sure you use forward slashes in your path, e.g. "C:/Source/krb5.conf".
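As a rough sketch of the Windows case, the same block might look like the following. The second property, sun.security.krb5.debug, is not part of my original setup; it is a standard JDK switch that prints Kerberos debug output and is only included here as an optional troubleshooting aid.

```xml
<system-properties>
    <!-- Windows path written with forward slashes -->
    <property name="java.security.krb5.conf" value="C:/Source/krb5.conf"/>
    <!-- Optional, troubleshooting only (an addition, not required): JDK Kerberos debug output -->
    <property name="sun.security.krb5.debug" value="true"/>
</system-properties>
```

Remove the debug property once the connection works, as the output is very verbose.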

Secondly, add two new security domains to standalone.xml: one called "Client", which is used by ZooKeeper, and another called "host", which is used by WildFly. Do not ask me why (it caused me so much pain), but the name of the "Client" security domain must match the one defined in ZooKeeper's JAAS client configuration file on the server. If you've set up with Ambari, "Client" is the default name. Also note that you cannot simply provide a jaas.config file as a system property; you must define it here:

<!-- Used by the ZooKeeper client inside the Phoenix driver; the name must match ZooKeeper's JAAS client section -->
<security-domain name="Client" cache-type="default">
    <authentication>
        <login-module code="com.sun.security.auth.module.Krb5LoginModule" flag="required">
            <module-option name="useTicketCache" value="true"/>
            <module-option name="debug" value="true"/>
        </login-module>
    </authentication>
</security-domain>
<!-- Used by WildFly for the datasource itself -->
<security-domain name="host" cache-type="default">
    <authentication>
        <login-module code="org.jboss.security.negotiation.KerberosLoginModule" flag="required" module="org.jboss.security.negotiation">
            <module-option name="useTicketCache" value="true"/>
            <module-option name="debug" value="true"/>
            <module-option name="refreshKrb5Config" value="true"/>
            <module-option name="addGSSCredential" value="true"/>
        </login-module>
    </authentication>
</security-domain>

The module options will vary depending on your implementation. I'm getting my tickets from the default Java ticket cache, which is defined in the java.security file of your JRE, but you can supply a keytab here if you want. Note that setting storeKey to true broke my implementation. Check the Java documentation for all of the options. Note that each security domain uses a different login module: this is not by accident - Phoenix does not know how to use the org.jboss... version.
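If you would rather use a keytab than the ticket cache, a minimal sketch of the "Client" domain could look like this. I have not run this variant myself; the keytab path and principal are hypothetical placeholders, and the option names (useKeyTab, keyTab, principal, doNotPrompt) are the standard Krb5LoginModule options. The login module is nested in the <authentication> element that WildFly's security subsystem expects, and storeKey is deliberately left out because of the problem mentioned above.

```xml
<security-domain name="Client" cache-type="default">
    <authentication>
        <login-module code="com.sun.security.auth.module.Krb5LoginModule" flag="required">
            <!-- Authenticate from a keytab instead of the ticket cache -->
            <module-option name="useKeyTab" value="true"/>
            <!-- Hypothetical path and principal: replace with your own -->
            <module-option name="keyTab" value="/path/to/wildfly.keytab"/>
            <module-option name="principal" value="wildfly@EXAMPLE.COM"/>
            <module-option name="useTicketCache" value="false"/>
            <!-- Fail instead of prompting for a password if the keytab is unusable -->
            <module-option name="doNotPrompt" value="true"/>
            <module-option name="debug" value="true"/>
        </login-module>
    </authentication>
</security-domain>
```

The same options should apply to the "host" domain, since the org.jboss login module accepts the Krb5LoginModule options, but treat that as an assumption to verify in your own environment.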

Now you need to provide WildFly with the org.apache.phoenix.jdbc.PhoenixDriver class in phoenix-<version>-client.jar. Create the following directory tree under the WildFly directory:

/modules/system/layers/base/org/apache/phoenix/main/

In the main directory, paste the phoenix-<version>-client.jar, which you can find on the server (e.g. /usr/hdp/<version>/phoenix/client/bin), and create a module.xml file:

<?xml version="1.0" ?>

<module xmlns="urn:jboss:module:1.1" name="org.apache.phoenix">

    <resources>
        <resource-root path="phoenix-<version>-client.jar">
            <filter>
                <exclude-set>
                    <path name="javax" />
                    <path name="org/xml" />
                    <path name="org/w3c/dom" />
                    <path name="org/w3c/sax" />
                    <path name="javax/xml/parsers" />
                    <path name="com/sun/org/apache/xerces/internal/jaxp" />
                    <path name="org/apache/xerces/jaxp" />
                    <path name="com/sun/jersey/core/impl/provider/xml" />
                </exclude-set>
            </filter>
        </resource-root>
        <resource-root path=".">
        </resource-root>
    </resources>

    <dependencies>
        <module name="javax.api"/>
        <module name="sun.jdk"/>
        <module name="org.apache.log4j"/>
        <module name="javax.transaction.api"/>
        <module name="org.apache.commons.logging"/>
    </dependencies>
</module>

You also need to paste the hbase-site.xml and core-site.xml from the server into the main directory. These are typically located in /usr/hdp/<version>/hbase/conf and /usr/hdp/<version>/hadoop/conf. If you don't add these, you will get a lot of unhelpful ZooKeeper getMaster errors! If you want the driver to log to the same place as WildFly, then you should also create a log4j.xml file in the main directory. You can find an example elsewhere on the web. The <resource-root path="."></resource-root> element is what adds those xml files to the classpath when deployed by WildFly.
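For reference, a minimal log4j.xml in the log4j 1.x format might look like the sketch below. This is an illustration rather than the file I used: the appender name, pattern and log levels are arbitrary choices, and how WildFly's own org.apache.log4j module treats these settings may differ from a standalone log4j setup.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
    <!-- Write to the console, which WildFly captures in its server log -->
    <appender name="CONSOLE" class="org.apache.log4j.ConsoleAppender">
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="%d{ISO8601} %-5p [%c] %m%n"/>
        </layout>
    </appender>
    <!-- More detail from the Phoenix driver than from everything else -->
    <logger name="org.apache.phoenix">
        <level value="INFO"/>
    </logger>
    <root>
        <priority value="WARN"/>
        <appender-ref ref="CONSOLE"/>
    </root>
</log4j:configuration>
```

Raising the org.apache.zookeeper and org.apache.hadoop.hbase loggers to DEBUG in the same way can help when chasing connection errors.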

Finally, add a new datasource and driver in the <subsystem xmlns="urn:jboss:domain:datasources:2.0"> section. You can do this with the CLI or by editing standalone.xml directly; I did the latter:

<datasource jndi-name="java:jboss/datasources/PhoenixDS" pool-name="PhoenixDS" enabled="true" use-java-context="true">
    <connection-url>jdbc:phoenix:first.quorumserver.fqdn,second.quorumserver.fqdn:2181/hbase-secure</connection-url>
    <connection-property name="phoenix.connection.autoCommit">true</connection-property>
    <driver>phoenix</driver>
    <validation>
        <check-valid-connection-sql>SELECT 1 FROM SYSTEM.CATALOG LIMIT 1</check-valid-connection-sql>
    </validation>
    <security>
        <security-domain>host</security-domain>
    </security>
</datasource>
<drivers>
    <driver name="phoenix" module="org.apache.phoenix">
        <driver-class>org.apache.phoenix.jdbc.PhoenixDriver</driver-class>
    </driver>
</drivers>

It's important that you replace first.quorumserver.fqdn,second.quorumserver.fqdn with the correct ZooKeeper quorum string for your environment. You can find this in the hbase.zookeeper.quorum property of hbase-site.xml, in the HBase configuration directory. You don't need to add Kerberos information to the connection URL string!
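As an illustration, the relevant entries in hbase-site.xml would look something like the fragment below. The values are simply the placeholders from the connection URL above, not output from a real cluster; on a secured HDP cluster, the port and the /hbase-secure suffix in the URL appear to correspond to hbase.zookeeper.property.clientPort and zookeeper.znode.parent respectively.

```xml
<configuration>
    <!-- Comma-separated ZooKeeper hosts: use all of them in the connection URL -->
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>first.quorumserver.fqdn,second.quorumserver.fqdn</value>
    </property>
    <!-- ZooKeeper client port -->
    <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value>2181</value>
    </property>
    <!-- Root znode used by HBase -->
    <property>
        <name>zookeeper.znode.parent</name>
        <value>/hbase-secure</value>
    </property>
</configuration>
```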

tl;dr

  • Make sure that hbase-site.xml and core-site.xml are in your classpath.
  • Make sure that you have a <security-domain> with a name that ZooKeeper expects (probably "Client"), that uses the com.sun.security.auth.module.Krb5LoginModule.
  • The Phoenix connection URL must contain the entire ZooKeeper quorum. You can't leave a server out! Make sure it matches the value in hbase-site.xml.

References:

  • Using Kerberos for Datasource Authentication
  • Phoenix data source configuration by Mark S