I'm facing some problems downloading big folders from HDFS using the command:
hadoop fs -get /path/to/hdfs/big/folder .
The folder is big (almost 3 TB) and the Kerberos ticket has a lifetime of 10 hours and a renewable lifetime of 7 days.
The download takes more than 10 hours, so I can't complete the operation (Kerberos security exception). Is there any way to set up auto-renewal of the ticket for the get operation?
I solved my problem as follows:
PART 1
PART 2
Schedule a Kerberos ticket renewal inside the crontab (e.g., every 6 hours):
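A sketch of such a crontab entry, assuming the ticket cache already holds a renewable ticket; the log path is a placeholder:

```shell
# crontab entry (crontab -e): renew the cached ticket every 6 hours,
# which stays well within the 10-hour ticket lifetime
0 */6 * * * /usr/bin/kinit -R >> /var/log/krb-renew.log 2>&1
```

`kinit -R` renews the existing ticket without asking for a password, and keeps working until the 7-day renewable lifetime runs out.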
"renewable lifetime of 7 days" means that you can renew the ticket explicitly, without providing a password, for 7 days; each renewal gives you 10h more to go.
I know of only one auto-renewal (and auto-re-creation) mechanism bundled with Linux, and it's part of SSSD. So if you want to delegate Linux auth to an OpenLDAP or Microsoft AD service -- after several weeks of debugging, if you are lucky enough to ever succeed -- you will optionally have a Kerberos ticket managed for you by the OS.
There is also an auto-renewal thread started by the Hadoop Kerberos library, but it applies only to tickets found in the cache before the connection. If you create the ticket yourself through the library (and a keytab), it will not be renewable -- one of the many things the Java Kerberos implementation does not handle well -- and it will have to be re-created periodically.
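If you go the keytab route, the periodic re-creation can also live in the crontab; this is a sketch, and the principal and keytab path are placeholders:

```shell
# crontab entry: re-create (not renew) the ticket from a keytab every
# 6 hours; a keytab-based kinit needs no password and no prior ticket
0 */6 * * * /usr/bin/kinit -kt /etc/security/keytabs/me.keytab me@EXAMPLE.COM
```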
Bottom line: you could try this kind of trick -- renew the ticket in the background until you release a "lock" after the transfer has completed.
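A minimal sketch of that trick, assuming the cached ticket is renewable and that the lock path and renewal cadence are placeholders you would tune:

```shell
#!/bin/sh
# Background-renewal "lock" trick: renew the ticket while a lock file
# exists, run the long transfer, then remove the lock to stop renewing.
LOCK=/tmp/hdfs-get.lock
touch "$LOCK"

# Renewer: renews the cached ticket roughly hourly (360 x 10 s), but
# checks the lock every 10 s so it exits promptly once the lock goes away.
(
  while [ -f "$LOCK" ]; do
    kinit -R
    i=0
    while [ -f "$LOCK" ] && [ "$i" -lt 360 ]; do
      sleep 10
      i=$((i+1))
    done
  done
) &

# The long-running transfer from the question.
hadoop fs -get /path/to/hdfs/big/folder .

# Release the "lock"; the renewer notices within 10 s and exits.
rm -f "$LOCK"
wait
```

The lock file doubles as a clean shutdown signal, so no `kill` is needed and the renewer never outlives the transfer by more than one check interval.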