I was thinking of writing a shell script to implement the obliterate functionality in a simple way: externally, using the suggested dump/filter/load approach, but automated.
Here's what I had in mind:
On the client
- List everything in the repository:
svn list -R > file-list
- Filter file-list in several ways (with grep, for example) to build a file "files-to-delete", using a set of commands like:
grep XXX file-list >> files-to-delete
- Transfer files-to-delete to the server using scp (a combined sketch of these steps follows below).
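A minimal sketch of the client side, assuming an up-to-date working copy; the grep patterns and the host name are placeholders, not part of the original plan:

# Run inside an up-to-date working copy.
svn list -R > file-list
# Collect the paths to obliterate; these patterns are hypothetical.
grep 'big-sample-data' file-list >> files-to-delete
grep '\.iso$' file-list >> files-to-delete
# Inspect files-to-delete by hand, then ship it to the repository host.
scp files-to-delete user@svn-host:/tmp/files-to-delete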
On the server
- Dump the repository (the dump file can double as a backup):
svnadmin dump /path/to/repos > repos-dumpfile
- Filter the dump file; for each word in "files-to-delete", do:
cat repos-dumpfile | svndumpfilter exclude $file > new-dumpfile
- Create a new repository and load the new dump file into it:
svnadmin create new-name; svnadmin load new-name < new-dumpfile
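One could also sanity-check the result before retiring the old repository. A sketch under the assumption that the new repository lives at file:///path/to/new-name (a placeholder URL):

svnadmin verify new-name
svn list -R file:///path/to/new-name > new-file-list
# grep exits 0 only on a match, so the warning fires only if some
# supposedly excluded path survived the filtering.
grep -F -f files-to-delete new-file-list && echo "WARNING: paths still present"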
Would this work? How can it fail? Any other ideas?
Yes, that script would work.
But you don't usually obliterate that many files; obliterate is typically only needed when confidential information gets committed accidentally.
Are you sure you want to use obliterate for so many files?
I think cat new-dumpfile | svndumpfilter exclude $file > new-dumpfile
is a dangerous example. new-dumpfile will not be completely processed, and its contents will probably be lost, no?
From the comments below: the new-dumpfile will certainly be lost, because the shell clobbers (truncates to zero length) the output file before the command even starts.
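A minimal sketch of a loop that avoids the clobbering, reusing the file names from the question; each pass writes to a temporary file that is then renamed over its input:

# Start from the full dump and exclude one path per pass.
cp repos-dumpfile filtered-dumpfile
while read -r path; do
    # Never read and write the same file in a single pipeline.
    svndumpfilter exclude "$path" < filtered-dumpfile > filtered-dumpfile.tmp
    mv filtered-dumpfile.tmp filtered-dumpfile
done < files-to-delete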
I had a similar but slightly more complex requirement. Several hundred revisions in the past, some very large (>1GB) sample data files were committed to the repository. They were then moved around and eventually deleted from HEAD. However, they were still in revision history, making the repository cumbersomely large. I could not use svn list -R, since the files no longer appeared in the working copy.
However, svn list can be given a revision argument. I wasn't sure exactly when the big files had been checked in, but I knew it was sometime after revision 2000. I also had a list of file names. So I used a simple loop and uniq to generate my files-to-delete:
cd $working_copy
for rev in {2000..2437}; do
svn ls -R -r$rev | grep -f ~/tmp/big-file-names >> ~/tmp/file-paths;
done
cat ~/tmp/file-paths | sort | uniq > ~/tmp/files-to-delete
cd ~/tmp
# You should inspect "files-to-delete" to see if it looks reasonable!
cat dumpfile | svndumpfilter exclude `cat files-to-delete` > dumpfile.new
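One caveat: if files-to-delete is long, the backtick expansion can exceed the shell's argument-length limit. If I remember correctly, svndumpfilter from Subversion 1.7 onwards accepts a --targets file, which would sidestep that; treat the flag as an assumption and check your version:

# Assumes a svndumpfilter new enough to support --targets (1.7+).
svndumpfilter exclude --targets files-to-delete < dumpfile > dumpfile.new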
What about files with the same path in different revisions? For example, if you commit /trunk/foo, then rename it to /trunk/bar, then commit something else at /trunk/foo that you want to obliterate. You don't want to lose the history of what's now /trunk/bar. Maybe svndumpfilter supports peg revisions?
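A sketch of the scenario, with hypothetical revision numbers, to make the concern concrete:

# r10: the file that later becomes /trunk/bar is added as /trunk/foo.
svn add trunk/foo && svn commit -m "add foo"
# r11: the rename records /trunk/bar as a copy of /trunk/foo@10.
svn mv trunk/foo trunk/bar && svn commit -m "rename foo to bar"
# r12: an unrelated, unwanted file reuses the old path.
echo secret > trunk/foo && svn add trunk/foo && svn commit -m "new foo"
# "svndumpfilter exclude /trunk/foo" matches the path in every revision,
# so r10 would be dropped too and /trunk/bar's copy source would go missing.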