I've got a Git repository that has some files in DOS format (\r\n line endings). I would like to just run the files through dos2unix (which would change all files to UNIX format, with \n line endings), but how badly would this affect history, and is it recommended at all?
I assume that the standard is to always use UNIX line endings for source-controlled files, and optionally switch to OS-specific line endings locally?
The approach you’ll have to use depends on how public your repository is.
If you don’t mind changing all SHAs because you’re more or less the only one using the repository, and you want this issue sorted out once and for all, you can run git filter-branch and apply dos2unix to all files in each commit. (If you’re sharing the repository, everyone else will more or less need to re-clone it, so this is potentially dangerous.)
So the better and easier option is to change the line endings only in the current heads. This means that your past commits still have \r\n endings, but unless you’re doing a lot of cherry-picking from the past this should not be a problem. The diff tools might complain a bit more often, of course, but normally you’ll only diff against nearby commits, so this issue resolves itself as commits accumulate.
And UNIX line endings are standard, you’re correct about that. The best approach is to set up your editor to write only these endings, even on Windows. Otherwise, there is also the core.autocrlf setting, which you can use.
Addition to the history rewriting part:
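Last time I did the same, I used the following script to change all files to UNIX endings.
#!/bin/bash
# Convert every regular file in the checked-out tree to UNIX line endings.
all2unix() {
    find . -type f -exec dos2unix {} \;
}
# Export the function so the bash shell that runs the tree filter can call it.
export -f all2unix
git filter-branch -f --tree-filter 'all2unix' --tag-name-filter cat --prune-empty -- --all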
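Whichever route you take, it can help to check which files in a given commit still contain carriage returns. The following is a minimal sketch, not part of the original answer: list_cr_files is a hypothetical helper name, and it relies on git grep with -I (ignore binary files) and -l (print file names only).

```shell
#!/bin/sh
# Sketch: list text files in a commit that still contain a carriage return.
# list_cr_files is a hypothetical helper; the revision defaults to HEAD.
list_cr_files() {
    rev=${1:-HEAD}
    # Command substitution only strips trailing newlines, so the CR survives.
    cr=$(printf '\r')
    # Like grep, this returns a non-zero status when nothing matches.
    git grep -I -l -e "$cr" "$rev"
}
```

Calling list_cr_files with a branch or tag name prints one entry per offending file, prefixed with the revision. No output means no text file in that commit contains a carriage return.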
This CRLF thing drove us crazy when we converted from SVN to Git (in a central, bare-repository style SCM environment). What ultimately got us was that we copied the global .gitconfig file to everyone's home directory (yes, on both Windows and Linux), with the initial one coming from a Windows system and having core.autocrlf=true and core.safecrlf=false. That played havoc with the Linux users (bash scripts didn't work, and there were all those awful ^M's). So we initially wrote checkout and clone scripts that ran dos2unix after these commands. Then I ran across the core.autocrlf and core.safecrlf config items and set them based on the OS:
Windows: core.autocrlf=true and core.safecrlf=false
Linux: core.autocrlf=input and core.safecrlf=false
These were set with:
---on Windows---
git config --global core.autocrlf true
git config --global core.safecrlf false
---on Linux---
git config --global core.autocrlf input
git config --global core.safecrlf false
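If you hand out one setup script to everybody instead of a copied .gitconfig, the per-OS choice above could be sketched like this. The function names are hypothetical, and the uname -s patterns (MINGW, MSYS, CYGWIN) are my assumption about how Windows shells such as Git Bash identify themselves; adjust them to whatever your developers actually run.

```shell
#!/bin/sh
# Sketch: pick the core.autocrlf value from a `uname -s` string.
# autocrlf_for_os and apply_crlf_config are hypothetical helper names.
autocrlf_for_os() {
    case "$1" in
        MINGW*|MSYS*|CYGWIN*) echo true ;;   # Git Bash, MSYS2, Cygwin on Windows
        *)                    echo input ;;  # Linux, Mac (Darwin), other UNIX
    esac
}

# Write the settings listed above into the user's global Git config.
apply_crlf_config() {
    git config --global core.autocrlf "$(autocrlf_for_os "$(uname -s)")"
    git config --global core.safecrlf false
}
```

Each developer would call apply_crlf_config once. Keeping the mapping in its own function makes it easy to check without touching anyone's real configuration.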
Then for our Linux developers we set up a little shell script, /usr/local/bin/gitfixcrlf:
#!/bin/sh
# remove local tree
git ls-files -z | xargs -0 rm
# checkout with proper crlf
git checkout .
They only had to run this once on their local sandbox clones. Any future cloning was done correctly, and any future pushes and pulls were handled correctly. So this solved our multiple-OS issues with line endings. Also note that Mac falls under the same config as Linux.
For the continuing solution, have a look at the core.autocrlf (and core.safecrlf) config parameters.
Doing this once to your whole repository will just create one commit that's nearly impossible to merge with (since every line in those files will be modified), but once you get past it, it should be no big deal. (Yes, you could use git filter-branch to make the modification all the way through history, but that's a bit scary.)
If your list of version controlled files includes binaries, or you can't change history easily... here is a handy dandy one-liner:
https://unix.stackexchange.com/a/365679/112190