How to troubleshoot deployment of Inception serving

Posted 2019-02-16 02:32

I'm following the Serving Inception Model with TensorFlow Serving and Kubernetes workflow, and everything works well up to the final step of serving the Inception model via k8s, when I try to run inference from a local host.

The pods are running, and the output of $kubectl describe service inception-service is consistent with what the Serving Inception Model with TensorFlow Serving and Kubernetes workflow suggests.
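
For completeness, here is a quick check of the service's external address; the comments just describe what I expect to see, and the exact output columns depend on the kubectl version:

$kubectl get service inception-service
# EXTERNAL-IP should show the load balancer address (104.155.175.138 in my case);
# a value of <pending> means the external load balancer has not been provisioned yet.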

However, when running inference things don't work. Here is the trace:

$bazel-bin/tensorflow_serving/example/inception_client --server=104.155.175.138:9000 --image=cat.jpg

Traceback (most recent call last):
  File "/home/dimlyus/serving/bazel-bin/tensorflow_serving/example/inception_client.runfiles/tf_serving/tensorflow_serving/example/inception_client.py", line 56, in <module>
    tf.app.run()
  File "/home/dimlyus/serving/bazel-bin/tensorflow_serving/example/inception_client.runfiles/org_tensorflow/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/home/dimlyus/serving/bazel-bin/tensorflow_serving/example/inception_client.runfiles/tf_serving/tensorflow_serving/example/inception_client.py", line 51, in main
    result = stub.Predict(request, 60.0)  # 10 secs timeout
  File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 324, in call
    self._request_serializer, self._response_deserializer)
  File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 210, in _blocking_unary_unary
    raise _abortion_error(rpc_error_call)
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.UNAVAILABLE, details="Connect Failed")

I am running everything on Google Cloud. The setup is done from a GCE instance, and Kubernetes runs inside Google Container Engine. The k8s setup follows the instructions from the workflow linked above and uses the inception_k8s.yaml file.

The service is set as follows:

apiVersion: v1
kind: Service
metadata:
  labels:
    run: inception-service
  name: inception-service
spec:
  ports:
  - port: 9000
    targetPort: 9000
  selector:
    run: inception-service
  type: LoadBalancer
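
One more check that may be relevant, sketched below: whether the service selector actually matches the labels on the running pods, since a mismatch would leave the service with no endpoints.

$kubectl get endpoints inception-service
# If ENDPOINTS shows <none>, the selector run: inception-service is not matching any pods.
$kubectl get pods --show-labels
# Compare the pod labels against the selector in the Service spec above.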

Any advice on how to troubleshoot this would be greatly appreciated!

2 Answers
何必那么认真 · 2019-02-16 03:25

I figured it out with the help of several TensorFlow experts. Things started to work after I made the following changes:

First, I changed the inception_k8s.yaml file in the following way:

Source:

args:
    - /serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server
      --port=9000 --model_name=inception --model_base_path=/serving/inception-export

Modification:

args:
    - serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server
      --port=9000 --model_name=inception --model_base_path=serving/inception-export
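
Before exposing anything, it is also worth confirming from the container logs that the model server actually started and found the exported model under the new model_base_path. A rough check (the pod name placeholder below is whatever your inception_k8s.yaml deployment created):

$kubectl get pods
$kubectl logs <inception-pod-name>
# The log should indicate the model server is listening on port 9000 and has
# loaded the "inception" model; errors here usually mean the path inside the
# container is wrong.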

Second, I exposed the deployment:

kubectl expose deployments inception-deployment --type="LoadBalancer"

I then used the external IP generated by exposing the deployment, not the inception-service IP.
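
To find that IP, something like the following works; by default the exposed service takes the name of the deployment, and EXTERNAL-IP may show <pending> for a minute or two while GCE provisions the load balancer:

$kubectl get services
# Use the EXTERNAL-IP listed for inception-deployment, e.g.:
$bazel-bin/tensorflow_serving/example/inception_client --server=<EXTERNAL-IP>:9000 --image=cat.jpg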

From that point on, I was able to run inference from an external host with the client installed, using the command from the Serving Inception Model with TensorFlow Serving and Kubernetes workflow.

Fickle 薄情 · 2019-02-16 03:28

The error message seems to indicate that your client cannot connect to the server. Without additional information it is hard to troubleshoot. If you post your deployment and service configuration, and give some information about the environment (is it running on a cloud? Which one? What are your security rules? Load balancers?), we may be able to help better.

But here are some things you can check right away:

  1. If you are running in a cloud environment (Amazon, Google, Azure, etc.), they all have security rules that require you to explicitly open the ports on the nodes running your Kubernetes cluster. Every port that your TensorFlow deployment/service uses should be open on the controller and worker nodes (see the firewall sketch after this list).

  2. Did you deploy only a Deployment for the app, or also a Service? If you run a Service, how is it exposed? Did you forget to enable a NodePort?
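
For point 1, if the cluster is on Google Cloud, a firewall rule along these lines would open the serving port; the rule name and source range here are only illustrative and should be tightened for a real setup:

$gcloud compute firewall-rules create allow-tf-serving --allow tcp:9000 --source-ranges 0.0.0.0/0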

Update: Your service type is LoadBalancer, so a separate load balancer should be created in GCE. You need to get the IP of that load balancer and access the service through it. Please see the section 'Finding Your IP' in this link: https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/
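
A quick way to get that IP is something like the following; the grep just pulls out the relevant field from the service description:

$kubectl describe services inception-service | grep "LoadBalancer Ingress"
# or
$kubectl get service inception-service
# and read the EXTERNAL-IP column.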
