how to get error if starting compute engine instan

Posted 2019-04-28 23:08

Question:

I am starting an instance from PHP with this code:

function startInstance($g_project, $g_instance, $g_zone){
    // Authenticate with Application Default Credentials.
    $client = new Google_Client();
    $client->setApplicationName('Google-ComputeSample/0.1');
    $client->useApplicationDefaultCredentials();
    $client->addScope('https://www.googleapis.com/auth/cloud-platform');

    // Note the argument order for start(): project, zone, instance.
    $service = new Google_Service_Compute($client);
    $response = $service->instances->start($g_project, $g_zone, $g_instance);
    echo json_encode($response);
}

Today I was lucky enough to notice that, for an unknown reason, the instance I wanted to start failed to do so. I tried starting it from the GUI and got this error there: Zone "some-zone" does not have enough resources available to fulfill the request. Try a different zone, or try again later.

I echoed out the PHP response and compared it to the one I get when an instance starts successfully. My findings are shocking. The responses were exactly the same (not counting timestamps and ids). How on earth can I differentiate between failed and successful instance starts if the response is the same?

https://cloud.google.com/compute/docs/reference/rest/v1/instances/start suggests that there will be an error object present in case of error. I can confirm that there is none.

Response for both the failed and the successful start:

{
    "clientOperationId": null,
    "creationTimestamp": null,
    "description": null,
    "endTime": null,
    "httpErrorMessage": null,
    "httpErrorStatusCode": null,
    "id": "id",
    "insertTime": "2019-01-28T14:22:36.664-08:00",
    "kind": "compute#operation",
    "name": "operation-name",
    "operationType": "start",
    "progress": 0,
    "region": null,
    "selfLink": "link/operation-name",
    "startTime": null,
    "status": "PENDING",
    "statusMessage": null,
    "targetId": "targetIdHere",
    "targetLink": "linkhere",
    "user": "user",
    "zone": "zone-in-question"
}

What do you suggest that I do? Switching to a different zone is probably the best solution. But there is one problem: I don't even know that the instance didn't start successfully, so I can't react to it. Is this the expected behavior? What did you do to mitigate this problem?

Answer 1:

I haven't actually run into the error you describe on GCE yet, but to get the "error state" of a GCE instance you could query the Compute API with the instances.get method and evaluate the "status" and "statusMessage" fields of the response.

HTTP request
GET https://www.googleapis.com/compute/v1/projects/{project}/zones/{zone}/instances/{resourceId}

The return values for status may be one of the following: PROVISIONING, STAGING, RUNNING, STOPPING, STOPPED, SUSPENDING, SUSPENDED, and TERMINATED.

See also the reference manual for this API call: https://cloud.google.com/compute/docs/reference/rest/v1/instances/get
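To make the "evaluate the response" part concrete, here is a minimal sketch of how the two fields could be read. It assumes the instances.get response body has already been decoded into a PHP array (for example with json_decode($body, true)); the function name summarizeInstanceState and the shape of its return value are my own invention, not part of any Google library.

```php
<?php
// Hypothetical helper: summarize a decoded instances.get response.
// Only the "status" and "statusMessage" fields are looked at.
function summarizeInstanceState(array $instance): array
{
    $status = $instance['status'] ?? 'UNKNOWN';

    return [
        'status'   => $status,
        // statusMessage is optional, so fall back to null when it is absent.
        'message'  => $instance['statusMessage'] ?? null,
        'running'  => $status === 'RUNNING',
        // PROVISIONING and STAGING are the states an instance passes
        // through on its way to RUNNING.
        'starting' => in_array($status, ['PROVISIONING', 'STAGING'], true),
    ];
}
```

With the PHP client from the question, the same two values should be available as $service->instances->get($g_project, $g_zone, $g_instance)->getStatus() and ->getStatusMessage(), but I have not tested that against a failing start.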

So if you poll the status of your newly created GCE instance for some time, and only report "success" once the status of the instance has switched from "PROVISIONING" or "STAGING" to "RUNNING", you should be safe. I have never seen any errors from instance creation once the instance status was set to "RUNNING".
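The polling idea could look roughly like the sketch below. It is only an illustration under a few assumptions: waitUntilRunning, $fetchStatus and the default timeout and interval are names and values I made up, and $fetchStatus stands for whatever code you use to read the current status string.

```php
<?php
/**
 * Poll until the instance reports RUNNING or the timeout expires.
 *
 * $fetchStatus is a hypothetical callable that returns the current
 * status string of the instance (e.g. "STAGING" or "RUNNING").
 * Returns true once RUNNING is seen, false if the timeout is reached.
 */
function waitUntilRunning(callable $fetchStatus, int $timeoutSeconds = 120, int $intervalSeconds = 5): bool
{
    $deadline = time() + $timeoutSeconds;

    do {
        if ($fetchStatus() === 'RUNNING') {
            return true;
        }
        // Any other status (including TERMINATED right after the start
        // request) is treated as "not there yet", so keep waiting.
        if ($intervalSeconds > 0) {
            sleep($intervalSeconds);
        }
    } while (time() < $deadline);

    return false;
}
```

With the client from the question, $fetchStatus could be a closure along the lines of function () use ($service, $g_project, $g_zone, $g_instance) { return $service->instances->get($g_project, $g_zone, $g_instance)->getStatus(); }. If waitUntilRunning returns false, you know the start did not succeed in time and can react, for example by trying another zone. The same loop shape should also work for polling the operation returned by start (via zoneOperations.get) until its status is DONE and then checking its error field, but that variant is not shown here.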