After creating the Amazon S3 bucket, my_bucket, I created an Elastic MapReduce cluster via the CLI:
aws emr create-cluster --name "Hive testing" --ami-version 3.3 --applications Name=Hive --use-default-roles --instance-type m3.xlarge --instance-count 3 --steps Type=Hive,Name="Hive Program",Args=[-d,INPUT=s3://my_bucket/input,-d,OUTPUT=s3://my_bucket/input,-d,LIBS=s3://my_bucket/serde_libs]
Note that I did not specify a Hive *.q file. After creating the S3 bucket and the EMR cluster, I plan to log onto the EMR box and run hive interactively.
Note: I'm assuming there's an EMR box that I can log onto.
However, when I ran aws emr describe-cluster --cluster-id XYZ, I saw this error in the output:
"State": "TERMINATED_WITH_ERRORS",
"StateChangeReason": {
    "Message": "EMR service role arn:aws:iam::xyz:role/EMR_DefaultRole is invalid",
    "Code": "VALIDATION_ERROR"
}
What would cause this error? Do I need to open permissions on the S3 bucket for the EMR cluster to access it?
The issue is not with the bucket: the IAM role that the cluster expects is missing.
See http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-iam-roles-creatingroles.html#emr-iam-roles-createdefaultwithcli
Issue the AWS CLI command:
aws emr create-default-roles
Then create the cluster again. This is a one-time step needed to create the default roles.
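After recreating the cluster, it can help to confirm that it did not end up in TERMINATED_WITH_ERRORS again. The following is a minimal Python sketch that pulls the state and failure reason out of the JSON printed by aws emr describe-cluster. It assumes the fields shown in the question sit under Cluster.Status in the full output; the function name cluster_failure and the sample string are made up for illustration.

```python
import json


def cluster_failure(describe_output):
    """Return (state, code, message) from `aws emr describe-cluster` JSON text.

    Assumes the status fields are nested under Cluster -> Status.
    """
    status = json.loads(describe_output)["Cluster"]["Status"]
    reason = status.get("StateChangeReason", {})
    return status["State"], reason.get("Code"), reason.get("Message")


# Sample trimmed to the fields shown in the question.
sample = """
{"Cluster": {"Status": {
    "State": "TERMINATED_WITH_ERRORS",
    "StateChangeReason": {
        "Message": "EMR service role arn:aws:iam::xyz:role/EMR_DefaultRole is invalid",
        "Code": "VALIDATION_ERROR"
    }
}}}
"""

state, code, message = cluster_failure(sample)
print(state)    # TERMINATED_WITH_ERRORS
print(code)     # VALIDATION_ERROR
print(message)
```

In practice the input would be the captured output of the describe-cluster command rather than a hard-coded string.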
Note: make sure you use a recent version of the AWS CLI. I had problems with 1.4 (the Debian Jessie package).
Note 2, taken from the mrjob code and Amazon announcements: an instance profile and a service role are required for accounts created after April 6, 2015, and will eventually be required for all accounts.
I've seen this issue crop up when you create custom service roles and assign the wrong principal service.
This example will generate that error:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "sts:AssumeRole",
            "Principal": {
                "Service": "ec2.amazonaws.com"
            },
            "Effect": "Allow",
            "Sid": "Invalid"
        }
    ]
}
This example will not:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "sts:AssumeRole",
            "Principal": {
                "Service": "elasticmapreduce.amazonaws.com"
            },
            "Effect": "Allow",
            "Sid": "Valid"
        }
    ]
}
For more info see here: http://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-mgmt.pdf#emr-plan-access-iam
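A trust policy like the ones above can be sanity-checked locally before it is attached to a custom service role. This is a rough Python sketch, not an official validator: it only looks for an Allow statement that grants sts:AssumeRole to the EMR service principal, and it ignores conditions and wildcards. The names trusts_service, as_list and make_policy are hypothetical helpers invented for this example.

```python
import json

EMR_PRINCIPAL = "elasticmapreduce.amazonaws.com"


def as_list(value):
    """IAM accepts a single string where a list is allowed; normalize to a list."""
    return [value] if isinstance(value, str) else list(value)


def trusts_service(policy_json, service=EMR_PRINCIPAL):
    """Return True if some Allow statement lets `service` call sts:AssumeRole."""
    statements = json.loads(policy_json).get("Statement", [])
    if isinstance(statements, dict):  # a lone statement may appear unwrapped
        statements = [statements]
    for stmt in statements:
        principal = stmt.get("Principal", {})
        if (
            stmt.get("Effect") == "Allow"
            and "sts:AssumeRole" in as_list(stmt.get("Action", []))
            and isinstance(principal, dict)
            and service in as_list(principal.get("Service", []))
        ):
            return True
    return False


def make_policy(service):
    """Build a trust policy shaped like the two examples above (hypothetical helper)."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Action": "sts:AssumeRole",
            "Principal": {"Service": service},
            "Effect": "Allow",
        }],
    })


print(trusts_service(make_policy("ec2.amazonaws.com")))               # False
print(trusts_service(make_policy("elasticmapreduce.amazonaws.com")))  # True
```

A policy that trusts only ec2.amazonaws.com fails the check, which matches the case that produces the "service role is invalid" error.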