Stackdriver logs not available for Cloud ML jobs

Posted 2020-04-26 04:11

Question:

Since the migration to V2, logs from Cloud ML jobs are no longer accessible in the Stackdriver logging console. The last log displayed is:

Waiting for Tensorflow to start.

The job runs and completes successfully; I just can't access its output in the logs.

All Stackdriver APIs are enabled for the project.

Answer 1:

There are no known issues with Cloud ML's Stackdriver logging. The fact that you see "Waiting for Tensorflow to start." shows that log messages from the Cloud ML service itself are reaching Stackdriver.

If logs from your Python/TensorFlow program are missing, that usually indicates Cloud ML hasn't been authorized to send logs to Stackdriver logging for your project. To check the permissions, do the following:

  1. Identify the Cloud ML service account by following these instructions.
  2. In the Cloud Console, select the IAM tab.
  3. Verify that the Cloud ML service account is listed and has the Logs Writer permission.
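The check in step 3 can also be done against the project's IAM policy in JSON form, for example the output of `gcloud projects get-iam-policy <project-id> --format=json`. The following is only a minimal sketch: the helper name `has_logs_writer` and the sample policy are hypothetical, and it assumes that the "Logs Writer" role shown in the console corresponds to the role ID `roles/logging.logWriter`.

```python
import json

# Role ID assumed to correspond to "Logs Writer" in the Cloud Console.
LOGS_WRITER_ROLE = "roles/logging.logWriter"


def has_logs_writer(policy, service_account):
    """Return True if the policy binds Logs Writer directly to the account.

    `policy` is the IAM policy as a dict, with a top-level "bindings" list
    whose entries carry a "role" and a list of "members".
    """
    member = "serviceAccount:" + service_account
    return any(
        binding.get("role") == LOGS_WRITER_ROLE
        and member in binding.get("members", [])
        for binding in policy.get("bindings", [])
    )


# Hypothetical policy for illustration only; real policies list more bindings.
sample_policy = json.loads("""
{
  "bindings": [
    {
      "role": "roles/logging.logWriter",
      "members": [
        "serviceAccount:cloud-ml-service@my-project.iam.gserviceaccount.com"
      ]
    },
    {
      "role": "roles/viewer",
      "members": ["user:someone@example.com"]
    }
  ]
}
""")

print(has_logs_writer(
    sample_policy,
    "cloud-ml-service@my-project.iam.gserviceaccount.com",
))
```

This only detects a direct Logs Writer binding. A broader role granted to the same account may also carry the permission, and the sketch does not account for that.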


Answer 2:

I also spent two frustrating weeks searching online for an answer to this problem before I came across this post. I did not see the "migration to V2" that the OP mentions, but I simply could not get any application logs in Stackdriver, only the system logs for job started/completed. Following Jeremy's reply solved the problem.

To make Jeremy's reply simpler to follow: essentially, you add the ML service account

cloud-ml-service@<project-id>.iam.gserviceaccount.com

to your project's IAM members with at least the "Logs Writer" role.

You can get the project ID with:

gcloud config list project --format "value(core.project)"
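For those who prefer the command line over the console, the role can also be granted with `gcloud projects add-iam-policy-binding`. Here is a rough sketch that only builds and prints the command. The helper names are hypothetical, `my-project` is a placeholder, the service account address simply follows the pattern given above (it may differ for your project), and `roles/logging.logWriter` is assumed to be the role ID behind "Logs Writer".

```python
import shlex


def ml_service_account(project_id):
    """Build the service account address using the pattern from this answer."""
    return "cloud-ml-service@{}.iam.gserviceaccount.com".format(project_id)


def grant_logs_writer_command(project_id):
    """Return the gcloud argument list that adds the Logs Writer binding."""
    return [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        "--member=serviceAccount:" + ml_service_account(project_id),
        "--role=roles/logging.logWriter",
    ]


# "my-project" is a placeholder; substitute the ID printed by the command above.
command = grant_logs_writer_command("my-project")
print(shlex.join(command))
```

Printing the command first lets you review it before running anything. To execute it from Python, the same list can be passed to `subprocess.run(command, check=True)`.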

I also assigned it the Project -> Editor role to allow bucket access.