I set up an HDP cluster which contains, among other services, Spark. I also enabled Kerberos for the cluster, so that all services and users have to authenticate via their principals.
This seems to work fine: all services are running, and a user has to obtain a valid Kerberos ticket before accessing, e.g., the YARN ResourceManager's web UI.
Otherwise they get an error message like this:
However, after running a kinit, the website is accessible to the user.
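For illustration, the flow looks like this (the user principal and ResourceManager host are assumptions; any SPNEGO-capable client works, curl is shown here):

```shell
# Obtain a Kerberos ticket for the user principal (assumed name)
kinit alice@EXAMPLE.COM

# Access the Kerberos-protected web UI via SPNEGO negotiation;
# "-u :" tells curl to take the credentials from the ticket cache
curl --negotiate -u : http://resourcemanager.example.com:8088/cluster
```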
What I now want to do (I thought it was already the case) is to secure the Spark History Server UI in the same way, so that a user has to authenticate via a Kerberos ticket. Currently, everyone can access the UI without authentication:
Is it possible to do this at all? If yes, how can I configure it?
The actual permissions on the spark.eventLog.dir = hdfs:///spark-history are 777. Here is a screenshot of the Ambari HDFS view:
I have found a solution to this in IBM's documentation:
You can re-use Hadoop's Jetty authentication filter for Kerberos/SPNEGO:
org.apache.hadoop.security.authentication.server.AuthenticationFilter
You can do this by setting the following in Spark's spark-defaults.conf:

spark.ui.filters=org.apache.hadoop.security.authentication.server.AuthenticationFilter
spark.org.apache.hadoop.security.authentication.server.AuthenticationFilter.params=type=kerberos,kerberos.principal=${spnego_principal_name},kerberos.keytab=${spnego_keytab_path}
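For illustration, with the placeholders filled in, the two lines might look like this (the principal name and keytab path are assumptions; use the SPNEGO principal and keytab of your own cluster):

```properties
spark.ui.filters=org.apache.hadoop.security.authentication.server.AuthenticationFilter
spark.org.apache.hadoop.security.authentication.server.AuthenticationFilter.params=type=kerberos,kerberos.principal=HTTP/historyserver.example.com@EXAMPLE.COM,kerberos.keytab=/etc/security/keytabs/spnego.service.keytab
```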
Be careful with those replacement variables; they didn't work for me when setting these values in Ambari. Also consider adding cookie.domain and signature.secret.file, similar to the other Hadoop SPNEGO configurations.

Obviously this only works when the Spark History Server runs with the Hadoop classes in its classpath -- so it's not an out-of-the-box solution for a SMACK stack, for example.