How to dump/backup a Solr index to a file?

Posted 2019-04-04 18:44

Question:

I'm running a Virtual Private Server where, every day at midnight, all files are backed up automatically by the VPS provider.

So I need to export the Solr index to a file, so that if something goes wrong someday, I'll be able to import it back to Solr with ease.

How can I do this?

Answer 1:

The Solr database IS a file (or rather, a set of files). There is a folder that looks something like this:

root@vs210044:/home/solr/apache-solr-1.4.0/example/solr/data/index# ls
segments.gen  _xzy.tii     _y26.tii     _y4f.tii     _y6o.tii    _y8n.tii  _y9i.tis  _y9k.fdt  _y9l.fdx  _y9m.fnm
segments_uud  _xzy.tis     _y26.tis     _y4f.tis     _y6o.tis    _y8n.tis  _y9j.fdt  _y9k.fdx  _y9l.fnm  _y9m.frq
_xzy_2n.del   _y26_20.del  _y4f_1z.del  _y6o_21.del  _y8n_2.del  _y9i.fdt  _y9j.fdx  _y9k.fnm  _y9l.frq  _y9m.nrm
_xzy.fdt      _y26.fdt     _y4f.fdt     _y6o.fdt     _y8n.fdt    _y9i.fdx  _y9j.fnm  _y9k.frq  _y9l.nrm  _y9m.prx
_xzy.fdx      _y26.fdx     _y4f.fdx     _y6o.fdx     _y8n.fdx    _y9i.fnm  _y9j.frq  _y9k.nrm  _y9l.prx  _y9m.tii
_xzy.fnm      _y26.fnm     _y4f.fnm     _y6o.fnm     _y8n.fnm    _y9i.frq  _y9j.nrm  _y9k.prx  _y9l.tii  _y9m.tis
_xzy.frq      _y26.frq     _y4f.frq     _y6o.frq     _y8n.frq    _y9i.nrm  _y9j.prx  _y9k.tii  _y9l.tis
_xzy.nrm      _y26.nrm     _y4f.nrm     _y6o.nrm     _y8n.nrm    _y9i.prx  _y9j.tii  _y9k.tis  _y9m.fdt
_xzy.prx      _y26.prx     _y4f.prx     _y6o.prx     _y8n.prx    _y9i.tii  _y9j.tis  _y9l.fdt  _y9m.fdx

So: it would suffice to save this folder. You can just as well back up your entire Solr installation using incremental rsync or whatever. Once Solr is started again, only the caches would need to be filled up again.
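As a rough sketch of "saving this folder", the snippet below packs a data directory into a timestamped tarball. The function name `archive_index` and the example paths are made up for illustration, and it assumes Solr is stopped (or at least not committing) while the copy runs, since files copied mid-write may not form a consistent index.

```python
import tarfile
import time
from pathlib import Path


def archive_index(data_dir, backup_dir):
    """Pack data_dir into a timestamped .tar.gz inside backup_dir; return the archive path.

    Hypothetical helper: assumes nothing is writing to data_dir during the copy.
    """
    data_dir = Path(data_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"solr-index-{stamp}.tar.gz"

    with tarfile.open(str(target), "w:gz") as tar:
        # arcname stores only the folder name, not the absolute path
        tar.add(str(data_dir), arcname=data_dir.name)
    return target


# Example call (paths are assumptions, adjust to your installation):
# archive_index("/home/solr/apache-solr-1.4.0/example/solr/data", "/home/solr/backups")
```

To restore, you would stop Solr, unpack the archive back into place, and start Solr again.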

BUT: I hope Solr is not your primary database? It's meant to be a search engine, not a replacement for a database, and not even a backup! It's just like MySQL replication, which is nice for load balancing but useless as a backup. Why? Because the same query gets replicated, so you could end up with an empty index everywhere. It's just the same with Solr/Lucene... or for many, many other reasons that far more brilliant people have discussed already.

Keeping that in mind, I wish you a good day!



Answer 2:

Please see my other answer about taking hot backups using Solr's ReplicationHandler. You can just wget a URL and Solr will safely snapshot your data directory. I would not take a snapshot using cp.
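A minimal sketch of the "just wget a URL" idea, written in Python rather than wget. It assumes the ReplicationHandler is registered at `/replication` in your solrconfig.xml and that Solr listens on `localhost:8983`; both are assumptions about your setup, and multi-core installs put the core name in the path. The helper names `backup_url` and `trigger_backup` are made up, and the optional `location` parameter should be checked against the documentation for your Solr version.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def backup_url(base_url, location=None):
    """Build the ReplicationHandler backup URL for a Solr base URL."""
    params = {"command": "backup"}
    if location:
        # Directory where the snapshot should be written (check your version's docs)
        params["location"] = location
    return f"{base_url.rstrip('/')}/replication?{urlencode(params)}"


def trigger_backup(base_url, location=None, timeout=30):
    """Request a snapshot and return Solr's raw response body as text."""
    with urlopen(backup_url(base_url, location), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example call (host and port are assumptions):
# print(trigger_backup("http://localhost:8983/solr"))
```

The request only asks Solr to start the snapshot, so a nightly job should run early enough for the snapshot to finish before the provider's file backup starts.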



Answer 3:

If you are concerned about keeping incremental states, there are a number of shell scripts that can be configured to run, either scheduled via cron or after commits and optimizes.

Find out more at http://wiki.apache.org/solr/SolrOperationsTools

One thing I would note is that while Solr is typically used not as the primary "System of Record" but as an auxiliary to some other data store, there isn't anything that requires that!

There are many use cases where losing your Solr indexes means losing your data. Think of a site that crawls the internet for specific data. The only copy of each crawl result might be in Solr, and I think, with appropriate backups, that is okay!



Tags: solr backup