Flink 1.2 does not start in HA Cluster mode

Posted 2019-09-14 16:37

Question:

I installed Flink 1.2 locally in HA cluster mode (2 JobManagers, 1 TaskManager), but it keeps refusing to start in that mode: it prints "Starting cluster." instead of "Starting HA cluster with 2 masters and 1 peers in ZooKeeper quorum."

Apparently bin/config.sh reads the configuration like this:

# High availability
if [ -z "${HIGH_AVAILABILITY}" ]; then
    HIGH_AVAILABILITY=$(readFromConfig ${KEY_HIGH_AVAILABILITY} "" "${YAML_CONF}")
    if [ -z "${HIGH_AVAILABILITY}" ]; then
        # Try deprecated value
        DEPRECATED_HA=$(readFromConfig "recovery.mode" "" "${YAML_CONF}")
        if [ -z "${DEPRECATED_HA}" ]; then
            HIGH_AVAILABILITY="none"
        elif [ ${DEPRECATED_HA} == "standalone" ]; then
            # Standalone is now 'none'
            HIGH_AVAILABILITY="none"
        else
            HIGH_AVAILABILITY=${DEPRECATED_HA}
        fi
    else
        HIGH_AVAILABILITY="none"
    fi
fi

This means that, regardless of what is configured for the "high-availability" key in the configuration file (in my case the value was "zookeeper"), any non-empty value falls into the inner else branch and is overwritten with "none", and in bin/start-cluster.sh

if [[ $HIGH_AVAILABILITY == "zookeeper" ]]; then
    # HA Mode
    readMasters

    echo "Starting HA cluster with ${#MASTERS[@]} masters."

    for ((i=0;i<${#MASTERS[@]};++i)); do
        master=${MASTERS[i]}
        webuiport=${WEBUIPORTS[i]}
        ssh -n $FLINK_SSH_OPTS $master -- "nohup /bin/bash -l \"${FLINK_BIN_DIR}/jobmanager.sh\" start cluster ${master} ${webuiport} &"
    done

else
    echo "Starting cluster."

    # Start single JobManager on this machine
    "$FLINK_BIN_DIR"/jobmanager.sh start cluster
fi

it will never enter the first if branch (the HA one).

Has anyone else run into this?
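
The behaviour described above can be reproduced in isolation. The following is only a minimal sketch: readFromConfig is replaced by a hypothetical stub that pretends the configuration file contains "high-availability: zookeeper", and the branching from the config.sh excerpt is wrapped in a made-up function named resolve_ha so it can be run outside of Flink.

```shell
#!/usr/bin/env bash

# Hypothetical stub (not Flink's real helper): pretend that the config file
# contains "high-availability: zookeeper" and nothing else.
readFromConfig() {
    local key=$1 default=$2
    if [ "$key" = "high-availability" ]; then
        echo "zookeeper"
    else
        echo "$default"
    fi
}

# Same branching as the config.sh excerpt, wrapped in a function.
resolve_ha() {
    local HIGH_AVAILABILITY="" DEPRECATED_HA=""
    HIGH_AVAILABILITY=$(readFromConfig "high-availability" "" "unused")
    if [ -z "${HIGH_AVAILABILITY}" ]; then
        # Try deprecated value
        DEPRECATED_HA=$(readFromConfig "recovery.mode" "" "unused")
        if [ -z "${DEPRECATED_HA}" ]; then
            HIGH_AVAILABILITY="none"
        elif [ "${DEPRECATED_HA}" == "standalone" ]; then
            HIGH_AVAILABILITY="none"
        else
            HIGH_AVAILABILITY=${DEPRECATED_HA}
        fi
    else
        # A non-empty configured value lands here and is discarded.
        HIGH_AVAILABILITY="none"
    fi
    echo "${HIGH_AVAILABILITY}"
}

resolve_ha   # prints "none" although the stubbed config says "zookeeper"
```

Since the result is "none", the comparison against "zookeeper" in start-cluster.sh fails and the else branch prints "Starting cluster.".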

Answer 1:

Yes, I believe it is a bug: issues.apache.org/jira/browse/FLINK-6000.

It already has a pending PR.
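
For illustration only, here is one way the lookup could be corrected: keep a non-empty "high-availability" value and fall back to the deprecated "recovery.mode" key only when it is empty. This is a sketch under assumptions, not necessarily what the pending PR does. The function name resolve_ha_fixed, the readFromConfig stub, and the CONF_HA / CONF_RECOVERY_MODE variables are hypothetical stand-ins for the real configuration file.

```shell
#!/usr/bin/env bash

# Hypothetical stub: CONF_HA and CONF_RECOVERY_MODE stand in for the values
# that would normally be read from the configuration file.
readFromConfig() {
    local key=$1 default=$2 value=""
    case "$key" in
        high-availability) value=${CONF_HA:-} ;;
        recovery.mode)     value=${CONF_RECOVERY_MODE:-} ;;
    esac
    echo "${value:-$default}"
}

# Assumed correction: the inner else branch is dropped, so a configured
# "high-availability" value is kept as-is.
resolve_ha_fixed() {
    local ha deprecated
    ha=$(readFromConfig "high-availability" "" "unused")
    if [ -z "$ha" ]; then
        # Fall back to the deprecated key; "standalone" is now "none".
        deprecated=$(readFromConfig "recovery.mode" "" "unused")
        if [ -z "$deprecated" ] || [ "$deprecated" == "standalone" ]; then
            ha="none"
        else
            ha=$deprecated
        fi
    fi
    echo "$ha"
}

CONF_HA="zookeeper"
CONF_RECOVERY_MODE=""
resolve_ha_fixed   # prints "zookeeper"
```

Until a fixed release is available, a possible workaround follows from the outer guard in the quoted config.sh: the whole block is skipped when HIGH_AVAILABILITY is already non-empty, so exporting HIGH_AVAILABILITY=zookeeper before running bin/start-cluster.sh may be enough, assuming nothing else in the scripts resets the variable.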