I have a group of erlang nodes that are replicating their data through Mnesia's "extra_db_nodes"... I need to upgrade hardware and software so I have to detach some nodes as I make my way from node to node.
How does one remove a node and still preserve the data that was inserted?
[update] removing nodes is as important as adding them. Over time as your cluster grows it must also contract. If not then Mnesia is going to be busy trying to send data to nonexistent nodes filling up queues and keeping the network busy.
[final update] after pouring through the erlang/mnesia source code I was able to determine that it is not possible to completely disassociate nodes. While del_table_copy removes the linkage between tables it is incomplete. I would close this question but none of the close descriptions are adequate.
I wish I had found this a long time ago: http://weblambdazero.blogspot.com/2008/08/erlang-tips-and-tricks-mnesia.html
basically, with a properly functioning cluster....
login to the cluster to be removed
stop mnesia
mnesia:stop().
login to a different node on the cluster
delete the schema
mnesia:del_table_copy(schema, node@host.domain).
I'm extremely late to the party, but came across this info in the doc when looking for a solution to the same problem:
"The function call
mnesia:del_table_copy(schema,
mynode@host) deletes the node
'mynode@host' from the Mnesia system.
The call fails if mnesia is running on
'mynode@host'. The other mnesia nodes
will never try to connect to that node
again. Note, if there is a disc
resident schema on the node
'mynode@host', the entire mnesia
directory should be deleted. This can
be done with mnesia:delete_schema/1.
If mnesia is started again on the the
node 'mynode@host' and the directory
has not been cleared, mnesia's
behaviour is undefined."
(http://www.erlang.org/doc/apps/mnesia/Mnesia_chap5.html#id74278)
I think the following might do what you desire:
AllTables = mnesia:system_info(tables),
DataTables = lists:filter(fun(Table) -> Table =/= schema end,
AllTables),
RemoveTableCopy = fun(Table,Node) ->
Nodes = mnesia:table_info(Table,ram_copies) ++
mnesia:table_info(Table,disc_copies) ++
mnesia:table_info(Table,disc_only_copies),
case lists:member(Node,Nodes) of
true -> mnesia:del_table_copy(Table,Node);
false -> ok
end
end,
[RemoveTableCopy(Tbl,'gone@gone_host') || Tbl <- DataTables].
rpc:call('gone@gone_host',mnesia,stop,[]),
rpc:call('gone@gone_host',mnesia,delete_schema,[SchemaDir]),
RemoveTablecopy(schema,'gone@gone_host').
Though, I haven't tested it since my scenario is slightly different.
I've certainly used this method to perform this (supporting the mnesia:del_table_copy/2 use). See removeNode/1 below:
-module(tool_bootstrap).
-export([bootstrapNewNode/1, closedownNode/0,
finalBootstrap/0, removeNode/1]).
-include_lib("records.hrl").
-include_lib("stdlib/include/qlc.hrl").
bootstrapNewNode(Node) ->
%% Make the given node part of the family and start the cloud on it
mnesia:change_config(extra_db_nodes, [Node]),
%% Now make the other node set things up
rpc:call(Node, tool_bootstrap, finalBootstrap, []).
removeNode(Node) ->
rpc:call(Node, tool_bootstrap, closedownNode, []),
mnesia:del_table_copy(schema, Node).
finalBootstrap() ->
%% Code removed to actually copy over my tables etc...
application:start(cloud).
closedownNode() ->
application:stop(cloud), mnesia:stop().
If you have replicated the table (added table copies) on nodes other than the one you're removing, then you're already fine - just remove the node.
If you wanted to be slightly tidier you'd delete the table copies from the node you're about to remove first via mnesia:del_table_copy/2
.
Generally, mnesia gracefully handles node loss and detects node rejoin (rebooted nodes obtain new table copies from nodes that kept running, nodes that didn't reboot are detected as a network partition event). Mnesia does not consume CPU or network traffic for nodes that have gone down. I think, though I haven't confirmed it in the source, mnesia won't reconnect to nodes that have gone down automatically - the node that goes down is expected to reboot (mnesia) and reconnect.
mnesia:add_table_copy/3
, mnesia:move_table_copy/3
and mnesia:del_table_copy/2
are the functions you should look at for live schema management.
The extra_db_nodes
parameter should only be used when initialising a new DB node - once a new node has a copy of the schema it doesn't need the extra_db_nodes
parameter.