Vagrant “Authentication failure” during up, but “v

Posted 2019-03-29 08:19

Question:

I'm stumped. I'm trying to run a Vagrant/VirtualBox/CoreOS cluster on Windows 8.1 to develop a cluster that will eventually run in the cloud. I've tried this on four machines, all running Windows 8.1 with the latest updates and the latest VirtualBox, Vagrant, and Git, and all using the same Vagrant config. I check the Vagrant config out of a repo on all four systems, so I'm confident the configs are identical. I get two successes and two failures.

Two machines succeed like this:

Bringing machine 'core-01' up with 'virtualbox' provider...
==> core-01: Checking if box 'coreos-stable' is up to date...
(snip)
    core-01: SSH address: 127.0.0.1:2222
    core-01: SSH username: core
    core-01: SSH auth method: private key
    core-01: Warning: Connection timeout. Retrying...
==> core-01: Machine booted and ready!
==> core-01: Setting hostname...
==> core-01: Configuring and enabling network interfaces...

vagrant ssh and vagrant halt both work fine on these two systems.

Two other Windows machines fail like this:

Bringing machine 'core-01' up with 'virtualbox' provider...
==> core-01: Importing base box 'coreos-stable'...
==> core-01: Matching MAC address for NAT networking...
==> core-01: Checking if box 'coreos-stable' is up to date...
==> core-01: Setting the name of the VM: coreos-vm-cluster_core-01_1422899531630_88904
==> core-01: Clearing any previously set network interfaces...
==> core-01: Preparing network interfaces based on configuration...
    core-01: Adapter 1: nat
    core-01: Adapter 2: hostonly
==> core-01: Forwarding ports...
    core-01: 22 => 2222 (adapter 1)
==> core-01: Running 'pre-boot' VM customizations...
==> core-01: Booting VM...
==> core-01: Waiting for machine to boot. This may take a few minutes...
    core-01: SSH address: 127.0.0.1:2222
    core-01: SSH username: core
    core-01: SSH auth method: private key
    core-01: Warning: Connection timeout. Retrying...
    core-01: Warning: Authentication failure. Retrying...
    core-01: Warning: Authentication failure. Retrying...
    core-01: Warning: Authentication failure. Retrying...
    core-01: Warning: Authentication failure. Retrying...
    core-01: Warning: Authentication failure. Retrying...
    core-01: Warning: Authentication failure. Retrying...

Note how both the working and non-working systems experience one timeout connecting, but then the successful ones actually do connect and finish bringing up the VM, whereas the unsuccessful ones just get stuck with an authentication retry loop.

After the authentication failures, whether I let it time out or press Ctrl+C, I can run "vagrant ssh core-01" and it takes me straight in:

CoreOS (stable)
core@localhost ~ $

'vagrant halt' also fails to make an ssh connection on these systems:

==> core-01: Attempting graceful shutdown of VM...
    core-01: Guest communication could not be established! This is usually because
    core-01: SSH is not running, the authentication information was changed,
    core-01: or some other networking issue. Vagrant will force halt, if
    core-01: capable.
==> core-01: Forcing shutdown of VM...

I can successfully use PuTTY or other SSH clients to access the VM using insecure_private_key for authentication, so I'm assuming the VM itself has the correct config and the problem lies with Vagrant's ability to make its own SSH connection. If "vagrant up" can't SSH in, it cannot finish the startup config for the VM, so I'd like to solve this primarily for that reason.

This is the ssh config that lets me get in with other ssh clients and I believe should be used by Vagrant:

Host: 127.0.0.1
Port: 2222
Username: core
Private key: C:/Users/Mike/.vagrant.d/insecure_private_key

I have also enabled the GUI for the VMs, and the console does not show any errors; it gets all the way to a login prompt just fine (which is consistent with the fact that I can SSH in and otherwise use the VM).

I believe (but don't know how to verify) that Vagrant is calling the OpenSSH client in C:\Program Files (x86)\Git\bin.

All are running Vagrant version 1.7.2 and git 1.9.5. Ruby 2.0.0p353.

My %PATH% is about 500 chars long. I'm confident Vagrant is finding an ssh client of some sort due to getting at least one or two timeouts followed by an authentication failure.
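One way to check which ssh client a PATH lookup would find is sketched below. It is a minimal sketch, not something from the original post: the helper name `find_ssh_client` is hypothetical, and it only reports what is first on the PATH. It does not show what Vagrant does internally during "vagrant up".

```python
import shutil


def find_ssh_client(path=None):
    """Return the full path of the first `ssh` executable on the PATH, or None.

    `path` is an optional search path string; by default the PATH
    environment variable is used.
    """
    # On Windows, shutil.which also tries the PATHEXT extensions,
    # so "ssh" matches ssh.exe as well.
    return shutil.which("ssh", path=path)


if __name__ == "__main__":
    print(find_ssh_client() or "no ssh client found on PATH")
```

If this prints something other than the Git-bundled client, the PATH order differs from what was assumed above.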

Thanks in advance for any ideas!

Update: Buried deep in the output of "vagrant up --debug" is this little gem:

D, [2015-02-02T23:11:10.755468 #3920] DEBUG --
   net.ssh.authentication.session[14661cc]: trying publickey
E, [2015-02-02T23:11:10.756472 #3920] ERROR --
   net.ssh.authentication.key_manager[1473e1c]:
   could not load public key file
   `C:/Users/Mike/.vagrant.d/insecure_private_key': 
   Net::SSH::Exception (public key at
   C:/Users/Mike/.vagrant.d/insecure_private_key.pub is not valid)

That final "insecure_private_key.pub is not valid" seems like a solid clue.

I've tried modifying that file to ensure it has just LF for line endings as well as CRLF and it makes no difference. Visually it looks fine. It's also 100% byte-for-byte identical to the file that's working on one of the other systems. Why would it be invalid? I have verified the current user has full control permissions on the file and also tried vagrant up as Administrator. No change in behavior. :(
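Note that the error in the debug output names insecure_private_key.pub, not the private key itself. A quick way to see which key-related files actually sit next to the private key on each machine is sketched below. The helper name `key_siblings` is hypothetical, and the path is the one from the debug output, so adjust it for your user name.

```python
from pathlib import Path


def key_siblings(private_key):
    """List files in the key's directory whose names start with the key's name."""
    key = Path(private_key)
    if not key.parent.is_dir():
        return []
    return sorted(
        p.name
        for p in key.parent.iterdir()
        if p.is_file() and p.name.startswith(key.name)
    )


if __name__ == "__main__":
    # Path copied from the debug log; an assumption for any other machine.
    for name in key_siblings("C:/Users/Mike/.vagrant.d/insecure_private_key"):
        print(name)
```

Comparing this listing between a working and a failing machine would show whether an extra file such as insecure_private_key.pub exists on only some of them.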

Answer 1:

Remove C:/Users/Mike/.vagrant.d/insecure_private_key. Vagrant will recreate it the next time it starts, and the new copy should be correct.



Answer 2:

Was the .pub file created by PuTTYgen (perhaps when creating a private key in PuTTY's format)? I did that and it prevented Vagrant from connecting to the box, but I could connect using PuTTY and PuTTYgen's generated .ppk file.

Changing the extension on the PuTTY public key worked for me, presumably because Vagrant didn't try using it any more.



Answer 3:

When I created the PPK file from the insecure_private_key file, I also, out of habit, created a .pub version. This appeared to cause the problem. Like Jon, when I removed the insecure_private_key.pub file, vagrant up was able to run all the way through.

If you have created an insecure_private_key.pub file using puttygen and run into this problem, I suggest removing it. It is not needed for vagrant and it only got in the way.
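To tell whether a .pub file is in the one-line OpenSSH format or in the multi-line format that PuTTYgen's "Save public key" button writes (RFC 4716, starting with "---- BEGIN SSH2 PUBLIC KEY ----"), something like the following sketch may help. The function name `classify_public_key` is hypothetical. It is an assumption, though consistent with the "is not valid" error above, that the RFC 4716 form is what Vagrant's SSH library rejects.

```python
import base64
import struct


def classify_public_key(text):
    """Classify public key text as 'openssh', 'rfc4716', or 'unknown'."""
    stripped = text.strip()
    if stripped.startswith("---- BEGIN SSH2 PUBLIC KEY ----"):
        return "rfc4716"
    # OpenSSH public keys are a single line: "<type> <base64 blob> [comment]".
    if len(stripped.splitlines()) != 1:
        return "unknown"
    fields = stripped.split()
    if len(fields) < 2:
        return "unknown"
    key_type, blob_b64 = fields[0], fields[1]
    try:
        blob = base64.b64decode(blob_b64, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return "unknown"
    if len(blob) < 4:
        return "unknown"
    # The blob begins with a 4-byte big-endian length followed by the key type.
    (name_len,) = struct.unpack(">I", blob[:4])
    if blob[4:4 + name_len] == key_type.encode("utf-8"):
        return "openssh"
    return "unknown"
```

Reading the suspect file and passing its contents to this function should return "rfc4716" for a PuTTYgen-saved public key, in which case removing or renaming the file as described above is the fix the answers suggest.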