Over the years, I've always stored binary dependencies in the \lib folder and checked that into source control with the rest of the project. I find I do this less now that we have NuGet and NuGet Package Restore.
I've heard that some companies enforce a rule that no binaries can be checked into source control. The reasons cited include:
- Most VCSs do not deal well with binaries - diffing and merging are not well supported
- Disk usage increases
- Commits and updates are slower
- The extra functionality, control and ease of use that a repository manager provides out of the box will be lost
- It encourages further bad practice; ideally projects should be looking to fully automate their builds, whereas checking binaries into version control is typically a manual step
Are there objective arguments for or against this practice for the vast majority of projects that use source-control?
Things depend on the workflow and the VCS used.
In a component-based workflow with SVN, you check in the includes and libs of each component. The libs and includes then form the interface for other components, which import only those libs and includes using svn:externals and do not import the component's source code at all. This enforces clean interfaces and a strict separation between components: a component is a black box that can only be used as specified in its interface, and its internal structure is invisible to others. Using binaries reduces compile time and may reduce the number of tools required on a machine, since specialized tools needed to build a component need not be present when merely using it.
However, with a distributed VCS things do not work this way. A DVCS depends on cloning the whole repository. If you check in binaries, the repository will rapidly grow to the point where cloning simply takes too long. An SVN repository of 100GB is not a problem, since a checkout only deals with one revision, which is smaller by several orders of magnitude; a Git/Mercurial/Bazaar repository of that size would be quite unusable, since cloning would take ages.
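To make the size argument concrete, here is a rough back-of-the-envelope sketch in Python. The function names, the 50 MB bundle, and the 200 commits are all made up for illustration, and the model pessimistically assumes that each committed binary is stored in full (a compression ratio of 1.0), which real tools may improve on.

```python
def svn_checkout_size(revision_sizes_mb):
    """Centralized checkout: only the latest revision is transferred."""
    return revision_sizes_mb[-1]


def dvcs_clone_size(revision_sizes_mb, compression_ratio=1.0):
    """Full clone: every stored revision is transferred.

    compression_ratio is an assumed fudge factor (1.0 = no savings),
    not a measured property of any particular tool.
    """
    return sum(revision_sizes_mb) * compression_ratio


# Hypothetical project: a 50 MB binary bundle re-committed 200 times.
revisions = [50] * 200

print(svn_checkout_size(revisions))  # 50
print(dvcs_clone_size(revisions))    # 10000.0
```

Under these assumptions the checkout stays at 50 MB while the clone is around 10 GB, which is the "several orders of magnitude" gap described above.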
So whether checking in binaries is a good idea or not depends on your workflow and also depends on the tools used.
My own rule of thumb is that generated assets should not be version controlled (regardless of whether they're binary or textual). There are several things, like images and audio/video files, that might be checked in, and for good reason.
As for the specific points.
You can't merge these kinds of files but they're usually just replaced rather than piecewise merged. Diffing them might be possible for some files using custom differs but in general, this is done using some kind of metadata like version numbers.
If you had a large text file, disk usage would not be an argument against version control. Same here. The idea is that changes to this file need to be tracked. In the worst case, it's possible to put these assets in a separate repository (one that doesn't change very often) and then include it in the current one using something like git submodules.
This is simply not true. Operations on that specific file might be slower but that's okay. It would be the same for text files.
I think having things in version control increases the convenience provided by the repository manager.
This touches on my point that the files in question shouldn't be generated. If the files are not generated, then checkout and build is one step. There's no "download binary assets" stage.
I would strongly recommend that you NOT use the practice you describe (the practice of forbidding binaries in source control). Actually, I would call this an organizational anti-pattern.
The single most important rule is:
You should be able to check out a project on a new machine, and it has to compile out of the box.
If this can be done via NuGet, then fine. If not, check in the binaries. If there are any legal/license issues, then you should at least have a text file (named how_to_compile.txt or similar) in your repo that contains all the required information.
Another very strong reason to do it like this is to avoid versioning problems - or do you know
Some other arguments against the above: