I'm running Ubuntu 12.04 on an Amazon EC2 instance as my single Riak node. I went through all the setup steps in the guide on the Basho website here. With x.x.x.x
standing for the private IP address of the instance, this included:
Installation
Using sudo su -
to gain root privileges (EC2 logs me in as 'Ubuntu').
Installing the SSL Lib with:
sudo apt-get install libssl0.9.8
Downloading the 64-bit package for 12.04:
wget http://downloads.basho.com.s3-website-us-east-1.amazonaws.com/riak/CURRENT/ubuntu/precise/riak_1.2.1-1_amd64.deb
Then installing it via:
sudo dpkg -i riak_1.2.1-1_amd64.deb
As instructed in the basho guide, I updated these two files (using vi):
vm.args
- Changing
-name riak@x.x.x.x
to the private IP of my instance.
app.config
Changing {http, [ {"x.x.x.x", 8098 } ]}
to the private IP of my instance.
Changing {pb_ip, "x.x.x.x"}
to the private IP of my instance.
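For anyone who would rather script these edits than open vi each time, here is a minimal sketch. update_riak_ip is a hypothetical helper name, not something Riak ships. It assumes GNU sed (for -i) and that the IP appears in both files exactly as shown above. The file paths are arguments (typically /etc/riak/vm.args and /etc/riak/app.config with the Ubuntu package, but check your install).

```shell
#!/bin/sh
# Sketch only: update_riak_ip is a hypothetical helper, not part of Riak.
# Usage: update_riak_ip OLD_IP NEW_IP VM_ARGS_PATH APP_CONFIG_PATH
update_riak_ip() {
    old_ip=$1
    new_ip=$2
    vm_args=$3
    app_config=$4

    # Escape the dots so "." matches a literal dot, not any character.
    old_re=$(printf '%s' "$old_ip" | sed 's/\./\\./g')

    # vm.args: the -name line carries the node name riak@<ip>.
    # The end-of-line anchor keeps 10.0.0.1 from matching 10.0.0.10.
    sed -i "s/^-name riak@${old_re}[[:space:]]*\$/-name riak@${new_ip}/" "$vm_args"

    # app.config: the http listener and pb_ip both hold the IP in quotes.
    sed -i "s/\"${old_re}\"/\"${new_ip}\"/g" "$app_config"
}
```

It would be called as root with the old and new private IPs, for example update_riak_ip x.x.x.x y.y.y.y /etc/riak/vm.args /etc/riak/app.config.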
The Riak node worked fine when I first set up the server and performed the steps above. I could connect to the node, and running riak start
then riak-admin test
returned successfully with:
>Attempting to restart script through sudo -H -u riak
>Successfully completed 1 read/write cycle to 'riak@x.x.x.x'
The next day I fired up the instance, repeated the above process (ignoring installation) with the instance's new IP address y.y.y.y
(the private IP of the instance changes every time it stops/starts) and typed riak start
into the terminal, only to be greeted with:
>Attempting to restart script through sudo -H -u riak
>Riak failed to start within 15 seconds,
>see the output of 'riak console' for more information.
>If you want to wait longer, set the environment variable
>WAIT_FOR_ERLANG to the number of seconds to wait
In the riak console the error given is:
>gen_server riak_core_capability terminated with reason: no function clause matching orddict:fetch('riak@y.y.y.y', [{'riak@x.x.x.x',[{{riak_core,staged_joins},[true,false]},{{riak_core,vnode_routing},[proxy,...]},...]}])
Where y.y.y.y
is the new instance IP address and x.x.x.x
was the old one.
I've been scratching my head over this for a while now and can't find anything on the topic. The only solution I can think of is to reinstall Riak on the off chance my PATH directories have gone wrong. If that fails, my last resort would be to terminate the instance and reconfigure Riak on a new one. So before I jump the gun, what I would like to ask is:
After updating the fields in app.config
and vm.args
with the new instance IP address, why is the riak start
command no longer successful?
Is there any way for an Ubuntu EC2 instance to be assigned a static private IP? Not only would this help solve the problem, it would also save me the time of having to update app.config
and vm.args
every time I start/stop the instance.
So after some more digging around and intense reading, I've found a solution:
You need to remove the Riak ring and start Riak again to reset riak_core.
You can do this by using this command in the terminal:
rm -rf /var/lib/riak/ring/*
- NOTE: This should be done after you've updated
app.config
and vm.args
with the new server IP; nasty side effects can occur otherwise.
Then
riak start
I no longer got the 'failed to start' error, and after issuing a riak-admin test
command I pleasantly received (where y.y.y.y
is my instance's private IP):
>Attempting to restart script through sudo -H -u riak
>Successfully completed 1 read/write cycle to 'riak@y.y.y.y'
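The ring-clearing step above can be sketched as a small helper. reset_riak_ring is a hypothetical name; the default path is the ring directory from the rm command above, and the directory check is my own addition so that a mistyped path fails loudly instead of being passed to rm -rf.

```shell
#!/bin/sh
# Sketch only: reset_riak_ring is a hypothetical helper, not part of Riak.
# Usage: reset_riak_ring [RING_DIR]   (defaults to /var/lib/riak/ring)
reset_riak_ring() {
    ring_dir=${1:-/var/lib/riak/ring}

    # Refuse to continue if the directory does not exist.
    if [ ! -d "$ring_dir" ]; then
        echo "ring directory not found: $ring_dir" >&2
        return 1
    fi

    # Delete the ring files but keep the directory itself.
    # ${ring_dir:?} aborts rather than expanding to "/*" if the variable is empty.
    rm -rf "${ring_dir:?}"/*
}
```

With the node stopped and app.config and vm.args already updated, I would call reset_riak_ring as root with no arguments and then run riak start.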
I should note that this solution applies to virtual servers as well as physical ones, although I would imagine reassigned IP addresses are much rarer on physical servers.
Now, while that solves the issue, it still means that whenever I stop and start the instance I have to edit the app.config
and vm.args
files to change the private IP address (remember the private IP changes every time an Ubuntu instance is started/stopped) and then clear the Riak ring using the command above, so it's not exactly an elegant solution.
If anyone knows a way to assign a static private IP to an EC2 instance (or another solution that tackles both hurdles), it would solve this problem outright.
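Until there is a static address, the manual steps could at least be skipped when nothing has changed. A rough sketch, assuming the EC2 instance metadata service at 169.254.169.254 answers without a session token (newer instances may require one) and that vm.args has a -name riak@<ip> line as above. configured_node_ip, ec2_private_ip and ip_changed are hypothetical helper names.

```shell
#!/bin/sh
# Sketch only: all three helpers are hypothetical, not part of Riak or AWS tooling.

# Print the IP that follows "riak@" on the -name line of vm.args.
configured_node_ip() {
    sed -n 's/^-name riak@\([^[:space:]]*\).*$/\1/p' "$1"
}

# Ask the instance metadata service for the current private IPv4 address.
# Only works from inside an EC2 instance.
ec2_private_ip() {
    curl -s --max-time 2 http://169.254.169.254/latest/meta-data/local-ipv4
}

# Succeed (exit status 0) when the given IP differs from the one in vm.args.
# Usage: ip_changed CURRENT_IP VM_ARGS_PATH
ip_changed() {
    [ "$(configured_node_ip "$2")" != "$1" ]
}
```

For example, ip_changed "$(ec2_private_ip)" /etc/riak/vm.args succeeds only when the address has moved, which is the cue to update both config files and clear the ring before running riak start.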
EDIT: 14/12/12
A limited solution to assigning a static IP to an EC2 instance:
Amazon Web Services allows Elastic IPs to be associated with EC2 instances (of any kind). If an instance has an Elastic IP associated with it, that IP remains associated with the instance even across a reboot. You can find the documentation on Elastic IPs here.
If you're under Amazon's free usage tier, an Elastic IP shouldn't cost anything as long as it's associated with a running instance. If an Elastic IP is disassociated, Amazon charges for each hour it remains unassociated. For example, terminating an instance disassociates its Elastic IP; unless that Elastic IP is re-associated or released, the hourly charge applies. Stopping your instance entirely and starting it at a later time will also disassociate an Elastic IP.
Only one Elastic IP per instance is free; any additional ones incur charges.
For those interested, you can find more information on Elastic IP pricing here, under Elastic IP Addresses.
As of Riak 1.3, riak-admin reip is deprecated and riak-admin cluster replace is the recommended way of changing a node's name.
These are the commands I had to issue:
riak stop # stop the node
riak-admin down riak@127.0.0.1 # take it down
sudo rm -rf /var/lib/riak/ring/* # delete the riak ring
sudo sed -i "s/127.0.0.1/`hostname -i`/g" /etc/riak/vm.args # Change the name in config
riak-admin cluster force-replace riak@127.0.0.1 riak@"`hostname -i`" # replace the name
riak start # start the node
That should set the node's name to riak@[your EC2 internal IP address].
As well as changing the PB and HTTP IPs in app.config and the IP in vm.args, I also had to run riak-admin reip:
http://docs.basho.com/riak/1.2.0/references/Command-Line-Tools---riak-admin/#reip
Without it, the old IP still shows up in the error log when running riak console.