Flink 1.7.0 Dashboard does not show Task Statistics

Posted 2019-07-30 11:02

Question:

I am using the Flink 1.7 dashboard and selected a streaming job. It should show me some metrics, but it just keeps loading.

I deployed the same job in a Flink 1.5 cluster, and there I can see the metrics. Flink is running in Docker swarm, but if I run Flink 1.7 with docker-compose (not in the swarm), it works.

I can make it work by deleting the hostname in the docker-compose.yaml file. This is the original file:

version: "3"
services:
  jobmanager17:
    image: flink:1.7.0-hadoop27-scala_2.11
    hostname: "{{.Node.Hostname}}"
    ports:
      - "8081:8081"
      - "9254:9249"
    command: jobmanager
....

With the hostname deleted:

version: "3"
services:
  jobmanager17:
    image: flink:1.7.0-hadoop27-scala_2.11
    ports:
      - "8081:8081"
      - "9254:9249"
    command: jobmanager
....

Now the metrics work, but without the hostname...

Is it possible to have both?

PS: I read something about 'detached mode'... but I don't use it.

Answer 1:

I guess you are running your cluster on Kubernetes or Docker swarm. With Flink 1.7 on Kubernetes you need to make sure the task managers register with the job manager using their IP addresses and not their hostnames. If you look at the job manager's log, you'll find a lot of warnings that the task manager can't be reached.

You can do that by defining the taskmanager.host parameter. An example deployment might look like this:

apiVersion: extensions/v1beta1
kind: Deployment
....
spec:
  template:
    spec:
      containers:
      - name: "<%= name %>"
        args: ["taskmanager", "-Dtaskmanager.host=$(K8S_POD_IP)"]
        env:
          - name: K8S_POD_IP
            valueFrom:
              fieldRef:
                fieldPath: status.podIP

If you are not running on Kubernetes, it might be worth trying to pass this parameter manually (by providing, as taskmanager.host, an IP address that is reachable from the job manager).

Hope that helps.


Update: Flink 1.8 solves the problem. The property taskmanager.network.bind-policy is set to "ip" by default, which does more or less the same as the workaround described above (https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#taskmanager)
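
If the default has been overridden somewhere, the policy can presumably also be pinned explicitly. A minimal sketch of the corresponding flink-conf.yaml entry, assuming Flink 1.8 or later; the key and value are the ones named in the update, everything else about the cluster setup is left as it is:

# flink-conf.yaml (assumes Flink 1.8+)
# Bind task managers by IP address instead of hostname
# when taskmanager.host is not set explicitly.
taskmanager.network.bind-policy: ip

With this policy in place, the per-container -Dtaskmanager.host argument from the workaround should no longer be needed.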