I know there are thousands of similar topics floating around, and I have read at least 5 threads here on SO. But why am I still not convinced about DVCS?
I have only the following questions (note that I am selfishly worried only about Java projects):
- What is the advantage or value of committing locally? What? Really? All modern IDEs allow you to keep track of your changes, and if required you can restore a particular change. They also have a feature to label your changes/versions at the IDE level!
- What if my hard drive crashes? Where did my local repository go? (So how is it cool compared to checking in to a central repo?)
- Working offline or in an airplane. What is the big deal? In order for me to build a release with my changes, I must eventually connect to the central repository. Till then it does not matter how I track my changes locally.
- Ok Linus Torvalds gives his life to Git and hates everything else. Is that enough to blindly sing praises? Linus lives in a different world compared to offshore developers in my mid-sized project?
Pitch me!
I'm not going to sell anything here.
The only real advantage is that you don't need connectivity to the main central repository. Someone can say that Git's benefit is the fact that a developer can commit locally, preparing a great combo of patches, and then push them to the blessed central repo, but IMO this is pretty uninteresting. A developer can use a private shelve or a branch in a Subversion repository to work on his task and then merge it with a mainline (e.g. /trunk) or another branch.
For me the main downside here is the fact that I have to download and store the whole Git repository on my machine. With a large project with looong history it becomes a pain and takes too much space.
Another downside is that Git technically can't track rename or copy operations. It just tries to guess whether a file was renamed or copied based on the file's content. This results in funny cases such as: svn to git migration keeping history of copied file (the asker wants to know why the history of a file was lost after an SVN to Git migration).
With Git, if your local storage device (HDD, SSD, whatever) crashes and it held changes that had not yet been pushed to, or pulled into, a blessed Git repo, then you are out of luck. You've just lost your time and your code. In addition to this, a crash of a hard drive with your local Git repo may halt the development process for some time: Linus Torvalds's SSD breaks, halts Linux kernel development.
With centralized source control such as SVN, you could only lose your last commit because all your work was already committed to the central repository to a branch, private shelve or even trunk. Obviously, you should ensure that there is a disaster recovery and backup implemented for your central repo.
For such a project as Linux Kernel that used BitKeeper in the past, Git is the best source control system! But I'd say that Git does not suit everyone.
Choose wisely!
I have been where you are now, sceptical of the uses of distributed version control. I had read all the articles and knew the theoretical arguments, but I was not convinced.
Until, one day, I typed git init and suddenly found myself inside a git repository.
I suggest you do the same -- simply try it. Begin with a small hobby project, just to get the hang of it. Then decide if it's worth using for something larger.
DVCS is very interesting for me as it:
adds an all new dimension to the source control process: publication.
You do not just have a merge workflow, you also have a publication workflow (which repository you push to or pull from), and that can have many implications in terms of:
brings a new way of producing/consuming revisions with:
That means you do not depend on others delivering their work to a central repo, but can have a more direct relationship with different actors and their repos.
I'm a Mercurial developer and have worked as a Mercurial consultant. So I find your questions very interesting and hope I answer them:
You are correct that IDEs can track local changes beyond simple undo/redo these days. However, there is still a gap in functionality between these file snapshots and a full version control system.
The local commits give you the option of preparing your "story" locally before you submit it for review. I often work on some changes involving 2-5 commits. After I make commit 4, I might go back and amend commit 2 slightly (maybe I saw an error in commit 2 after I made commit 4). That way I'll be working not just on the latest code, but on the last couple of commits. That's trivially possible when everything is local, but it becomes more tricky if you need to sync with a central server.
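To make the "amend commit 2 after commit 4" idea concrete, here is a minimal sketch in Git terms (the answer above is written from a Mercurial point of view, so treat this as one possible equivalent rather than the author's own workflow). The file names, commit messages, and throwaway identity are made up for illustration; it relies on `git commit --fixup` and `git rebase --autosquash`.

```shell
#!/bin/sh
set -e
# Throwaway identity so the sketch runs on a machine without Git configured.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q

# A base commit plus a four-commit "story", one file per commit.
echo base > base.txt
git add base.txt
git commit -q -m "base"
base=$(git rev-parse HEAD)
for n in 1 2 3 4; do
    echo "step $n" > "step$n.txt"
    git add "step$n.txt"
    git commit -q -m "commit $n"
done

# After commit 4, spot an error in commit 2 (HEAD~2) and record the fix
# as a fixup commit that targets it.
echo "step 2, corrected" > step2.txt
git add step2.txt
git commit -q --fixup HEAD~2

# Fold the fixup into commit 2. GIT_SEQUENCE_EDITOR=true accepts the
# reordered todo list as-is, so no editor opens.
GIT_SEQUENCE_EDITOR=true git rebase -i --autosquash "$base"

# The story is back to base + 4 commits, with commit 2 already corrected.
git log --oneline
```

Nothing here touches a server, which is the point: the history stays private and cheap to rewrite until it is pushed somewhere.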
Not cool at all! :-)
However, even with a central repo, you still have to worry about the uncommitted data in the working copy. I would therefore claim that you ought to have a backup solution in place anyway.
It is my experience that people often have larger chunks of uncommitted data lying around in their working copies with a centralized system. Clients told me how they were trying to convince developers to commit at least once a week.
The changes are often left uncommitted because:
1. They are not really finished. There might be debug print statements in the code, there might be incomplete functions, etc.
2. Committing would go into trunk, and that is dangerous with a centralized system since it impacts everybody else.
3. Committing would require you to first merge with the central repository. That merge might be intimidating if you know that there have been other conflicting changes made to the code. The merge might simply be annoying because you might not be all done with the changes and you prefer to work from a known-good state.
4. Committing can be slow when you have to talk to an overloaded central server. If you're in an offshore location, commits are even slower.
You are absolutely correct if you think that the above isn't really a question of centralized versus distributed version control. With a CVCS, people can work in separate branches and thus trivially avoid 2 and 3 above. With a separate throw-away branch, I can also commit as much as I want since I can create another branch where I commit more polished changes (solving 1). Commits can still be slow, though, so 4 can still apply.
People who use DVCS will often push their "local" commits to a remote server anyway as poor man's backup solution. They don't push to the main server where the rest of the team is working, but to another (possibly private) server. That way they can work in isolation and still keep off-site backups.
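As a rough sketch of that "poor man's backup", assuming Git: the remote name `backup` and the paths are hypothetical, and a local bare repository stands in for the private server (in practice it would be an SSH or HTTPS URL).

```shell
#!/bin/sh
set -e
# Throwaway identity so the sketch runs on a machine without Git configured.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)

# Stand-in for a private server that the rest of the team never pulls from.
backup="$work/backup.git"
git init -q --bare "$backup"

# The developer's own repository with an unfinished local commit.
git init -q "$work/project"
cd "$work/project"
echo "work in progress" > notes.txt
git add notes.txt
git commit -q -m "WIP: not ready for the team yet"

# Register the private remote and push every local branch to it.
git remote add backup "$backup"
git push -q backup --all

# The backup now holds the same history; the team repository is untouched.
git --git-dir="$backup" log --oneline --all
```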
Yeah, I never liked that argument either. I have good Internet connectivity 99% of the time and don't fly enough for this to be an issue :-)
However, the real argument is not that you are offline, but that you can pretend to be offline. More precisely, that you can work in isolation without having to send your changes to a central repository immediately.
DVCS tools are designed around the idea that people might be working offline. This has a number of important consequences:
Merging branches becomes a natural thing. When people can work in parallel, forks will naturally occur in the commit graph. These tools must therefore be really good at merging branches. A tool such as SVN is not very good at merging!
Git, Mercurial, and other DVCS tools merge better because they have had more testing in this area, not directly because they are distributed.
More flexibility. With a DVCS, you have the freedom to push/pull changes between arbitrary repositories. I'll often push/pull between my home and work computers, without using any real central server. When things are ready for publication, I push them to a place like Bitbucket.
Multi-site sync is no longer an "enterprise feature", it's a built-in feature. So if you have an off-shore location, they can set up a local hub repository and use this among themselves. You can then sync the local hubs hourly, daily, or when it suits you. This requires nothing more than a cronjob that runs hg pull or git fetch at regular intervals.
Better scalability since more logic is on the client-side. This means less maintenance on the central server, and more powerful client-side tools.
With a DVCS, I expect to be able to do a keyword search through revisions of the code (not just the commit messages). With a centralized tool, you normally need to set up an extra indexing tool.
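For instance, with Git the keyword search can be done entirely against the local clone. The sketch below uses a made-up file (`parser.c`) and keyword (`legacy_parser`); note that the keyword is deliberately absent from the commit messages and from the current checkout, so only a search through old revisions can find it.

```shell
#!/bin/sh
set -e
# Throwaway identity so the sketch runs on a machine without Git configured.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q

# Revision 1 introduces a helper; revision 2 replaces it.
printf 'int legacy_parser(void) { return 0; }\n' > parser.c
git add parser.c
git commit -q -m "add parser"
printf 'int new_parser(void) { return 1; }\n' > parser.c
git commit -q -am "rewrite parser"

# Pickaxe search: list the commits whose diff adds or removes the string.
git log -S legacy_parser --oneline

# Grep the file contents of every revision, not just the current checkout.
git grep -n legacy_parser $(git rev-list --all)
```

The first command reports both commits (one adds the string, one removes it); the second reports the single old revision of `parser.c` that still contains it.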
It might be interesting to note that Subversion will probably be getting things like offline commits in the future. Of course we can't really compare those features to what's available today, but it might be a very good way to "use DVCS in a centralized manner" as described in other answers here.
Another recent post states that Subversion is not trying to become a DVCS
These things will probably mean that the repository is still centralized, meaning you can't do disconnected branching, diffing of old versions, but you can queue up commits.
If you don't see the value of local history or local builds, then I'm not sure that any amount of question-answering is going to change your mind.
The history features of IDEs are limited and clumsy. They are nothing like the full functionality of a version control system.
One good example of how this stuff gets used is on various Apache projects. I can sync up a git repo to the Apache svn repo. Then I can work for a week in a private branch all my very own. I can downmerge changes from the repo. I can report on my changes, retail or wholesale. And when I'm done, I can package them up as one commit.
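A minimal sketch of that private-branch workflow follows. To keep it runnable without an SVN server, a plain local Git branch stands in for the upstream; with a real Subversion upstream the sync steps would go through the git svn bridge (`git svn clone`, `git svn rebase`, `git svn dcommit`) instead. The branch name, file names, and commit messages are made up.

```shell
#!/bin/sh
set -e
# Throwaway identity so the sketch runs on a machine without Git configured.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "v1" > app.txt
git add app.txt
git commit -q -m "upstream: initial import"
mainline=$(git symbolic-ref --short HEAD)

# A private branch for a week of small, messy commits.
git checkout -q -b private-work
for day in mon tue wed; do
    echo "change from $day" >> feature.txt
    git add feature.txt
    git commit -q -m "wip $day"
done

# Meanwhile the mainline moves on; merge it down into the private branch.
git checkout -q "$mainline"
echo "v2" > app.txt
git commit -q -am "upstream: bump to v2"
git checkout -q private-work
git merge -q --no-edit "$mainline"

# Package the whole branch up as a single commit on the mainline.
git checkout -q "$mainline"
git merge -q --squash private-work
git commit -q -m "Add feature (squashed from private-work)"
git log --oneline
```

The mainline ends up with one tidy commit for the feature, while the day-by-day history stays on the private branch for as long as it is useful.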