I need to update /etc/hosts
for all instances in my EMR cluster (EMR AMI 4.3).
The whole script is nothing more than:
#!/bin/bash
echo -e 'ip1 uri1' >> /etc/hosts
echo -e 'ip2 uri2' >> /etc/hosts
...
This script needs to run as sudo
or it fails.
From here: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-bootstrap.html#bootstrapUses
Bootstrap actions execute as the Hadoop user by default. You can execute a bootstrap action with root privileges by using sudo.
Great news... but I can't figure out how to do this, and I can't find an example.
I've tried a bunch of things... including...
- running as Hadoop and adding 'sudo' to each of the 'echo' statements in the script
- using a shell script to copy and chmod the above ('echo' statements with no 'sudo') and running local copy using run-if bootstrap that calls
1=1 sudo bash /home/hadoop/myDir/myScript.sh
- hard coding the whole script as a one-liner into a run-if bootstrap action
I consistently get:
On the master instance (i-xxx), bootstrap action 2 returned a non-zero return code
If i check the logs for the "Setup hadoop debugging" step, there's nothing there.
summary emr setup (in order):
- provisions ec2 instances
- runs bootstrap actions
- installs native applications... like hadoop, spark, etc.
So it seems like there's some risk that since I'm mucking around as user Hadoop before hadoop is installed, I could be messing something up there, but I can't imagine what.
I think it must be that my script isn't running as 'sudo' and it's failing to update /etc/hosts
.
My question... how can I use bootstrap actions (or something else) on EMR to run a simple shell script as sudo? ...specifically to update /etc/hosts
?
I've not had problems using sudo from within a shell script run as an EMR bootstrap action, so it should work. You can test that it works with a simple script that simply does "sudo ls /root".
Your script is trying to append to /etc/hosts by redirecting stdout with:
The problem here is that while the echo is run with sudo, the redirection (>>) is not. It's run by the underlying hadoop user, who does not have permission to write to /etc/hosts. The fix is:
This runs the entire command, including the stdout redirection, in a shell with sudo.