`EMR service role is invalid` when Creating EMR Cl

Posted 2019-04-20 18:59

Question:

After creating the Amazon S3 Bucket, my_bucket, I created an Elastic Map Reduce cluster via the cli:

aws emr create-cluster --name "Hive testing" --ami-version 3.3 --applications Name=Hive --use-default-roles --instance-type m3.xlarge --instance-count 3 --steps Type=Hive,Name="Hive Program",Args=[-d,INPUT=s3://my_bucket/input,-d,OUTPUT=s3://my_bucket/input,-d,LIBS=s3://my_bucket/serde_libs]

Note that I did not specify a Hive *.q file. After creating the S3 bucket and the EMR cluster, I plan to log on to the EMR box and run Hive interactively.

Note: I'm assuming there is an EMR box that I can log on to.

However, when I ran aws emr describe-cluster --cluster-id XYZ, I saw this error in the output:

    "State": "TERMINATED_WITH_ERRORS",
    "StateChangeReason": {
        "Message": "EMR service role arn:aws:iam::xyz:role/EMR_DefaultRole is invalid",
        "Code": "VALIDATION_ERROR"
    }

What would cause this error? Do I need to open permissions on the S3 bucket for the EMR cluster to access it?

Answer 1:

The issue is not with the bucket; the IAM role that the cluster expects is missing.

See http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-iam-roles-creatingroles.html#emr-iam-roles-createdefaultwithcli

Issue the AWS CLI command:

aws emr create-default-roles 

Then create the cluster again. This is a one-time step needed to create the default roles.

  • note: make sure you are using a recent version of the AWS CLI; I had problems with 1.4 (the Debian jessie package)

  • note 2: taken from the mrjob code and Amazon announcements:

    instance profile and service role are required for accounts created after April 6, 2015, and will eventually be required for all accounts



Answer 2:

I've seen this issue crop up when you create a custom service role and its trust policy names the wrong service principal.

This example will generate that error:

{
   "Version": "2012-10-17",
   "Statement": [
     {
       "Action": "sts:AssumeRole",
       "Principal": {
         "Service": "ec2.amazonaws.com"
       },
       "Effect": "Allow",
       "Sid": "Invalid"
     }
   ]
}

This example will not:

{
   "Version": "2012-10-17",
   "Statement": [
     {
       "Action": "sts:AssumeRole",
       "Principal": {
         "Service": "elasticmapreduce.amazonaws.com"
       },
       "Effect": "Allow",
       "Sid": "Valid"
     }
   ]
}

For more info see here: http://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-mgmt.pdf#emr-plan-access-iam
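To check a custom role before launching a cluster, a small script can inspect the trust policy for the EMR service principal. This is only a minimal sketch: the function name `trusts_emr` and the two sample strings are made up for illustration, and it only looks at the `Service` principal of `Allow` statements that grant `sts:AssumeRole`, which is what the two examples above differ in.

```python
import json

# Service principal that the valid example above trusts.
EMR_PRINCIPAL = "elasticmapreduce.amazonaws.com"


def as_list(value):
    """IAM policy fields may hold a single string or a list; normalise to a list."""
    if isinstance(value, str):
        return [value]
    return list(value)


def trusts_emr(policy_json):
    """Return True if an Allow statement lets the EMR service assume the role.

    Hypothetical helper, not part of any AWS tool.
    """
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a lone statement may be a bare object
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        if "sts:AssumeRole" not in as_list(stmt.get("Action", [])):
            continue
        principal = stmt.get("Principal", {})
        if not isinstance(principal, dict):  # e.g. the wildcard "*"
            continue
        if EMR_PRINCIPAL in as_list(principal.get("Service", [])):
            return True
    return False


# The two trust policies from this answer, condensed to one line each.
invalid = ('{"Version": "2012-10-17", "Statement": [{"Action": "sts:AssumeRole", '
           '"Principal": {"Service": "ec2.amazonaws.com"}, '
           '"Effect": "Allow", "Sid": "Invalid"}]}')
valid = ('{"Version": "2012-10-17", "Statement": [{"Action": "sts:AssumeRole", '
         '"Principal": {"Service": "elasticmapreduce.amazonaws.com"}, '
         '"Effect": "Allow", "Sid": "Valid"}]}')

print(trusts_emr(invalid))  # False
print(trusts_emr(valid))    # True
```

The trust policy of an existing role can probably be fetched with `aws iam get-role --role-name EMR_DefaultRole --query Role.AssumeRolePolicyDocument` and passed to the function as a string. I have not verified this against every CLI version.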