There has been a lot of discussion about the core.autocrlf and core.safecrlf features in the current release and the next release. My question relates to an environment where developers clone from a bare repository.
During the clone, the autocrlf setting is enabled. But since developers have full control over their clone, they can remove the autocrlf setting and carry on.
We can specify files other than binary in the .gitattributes file, but is there any other way for Git to determine automatically whether a file is text or binary?
Is there a way, such as an update hook (a commit hook is not an option, since developers can still remove it), to make sure that files with CRLF pushed from a Windows environment to a UNIX machine hosting the bare repo are converted to UNIX EOL format (LF)?
Will an update hook that scans each file for CRLF affect the performance of a push operation?
Thanks
1/ Git itself has a heuristic to determine whether a file is binary or text (similar to istext).
2/ The gergap weblog recently (May 2010) had the same idea.
See his update hook here (reproduced at the end of this answer), but the trick is:
Rather than trying to convert, the hook simply rejects the push if it detects a (supposedly) non-binary file with an improper EOL style.
Git converts LF->CRLF when checking out on Windows. If the file already contains CRLF, Git is clever enough to detect that and does not expand it to CRCRLF, which would be wrong. It keeps the CRLF, which means the file was implicitly changed locally during the checkout, because when committing it again, the wrong CRLF will be corrected to LF. That's why Git must mark these files as modified.
It's good to understand the problem, but we need a solution that prevents wrong line endings from being pushed to the central repo.
The solution is to install an update hook on the central server.
3/ There will be a small cost, but unless you push every 30 seconds, this shouldn't be an issue.
Plus, there is no actual conversion taking place: if the file is not correct, the push gets rejected.
That places the conversion issue right back where it belongs: on the developer side.
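To give a rough idea of the heuristic mentioned in point 1/: as far as I know, Git's diff machinery treats content as binary when a NUL byte shows up in roughly the first 8000 bytes, while the check used for end-of-line conversion is somewhat more elaborate. The sketch below only approximates the NUL-byte test. The function name looks_binary is hypothetical (it is not a Git command), and it assumes a head that supports -c.

```shell
#!/bin/sh
# Hypothetical helper, not part of Git: approximates the "NUL byte in the
# first 8000 bytes" test. Assumes head supports the -c option.
looks_binary() {
    # keep only the NUL bytes from the first 8000 bytes and count them
    nul_count=$(head -c 8000 "$1" | tr -cd '\000' | wc -c)
    # succeed (exit status 0) when at least one NUL byte was found
    [ "$nul_count" -gt 0 ]
}
```

Something like `looks_binary path/to/file && echo binary` could then replace an extension list, at the price of reading the start of every file.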
#!/bin/sh
#
# Author: Gerhard Gappmeier, ascolab GmbH
# This script is based on the update.sample in git/contrib/hooks.
# You are free to use this script for whatever you want.
#
# To enable this hook, rename this file to "update".
#
# --- Command line
refname="$1"
oldrev="$2"
newrev="$3"
#echo "COMMANDLINE: $*"
# --- Safety check
if [ -z "$GIT_DIR" ]; then
echo "Don't run this script from the command line." >&2
echo " (if you want, you could supply GIT_DIR then run" >&2
echo " $0 <ref> <oldrev> <newrev>)" >&2
exit 1
fi
if [ -z "$refname" -o -z "$oldrev" -o -z "$newrev" ]; then
echo "Usage: $0 <ref> <oldrev> <newrev>" >&2
exit 1
fi
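BINARY_EXT="pdb dll exe png gif jpg"
# returns 1 if the given filename has a binary extension, 0 otherwise
# (plain POSIX function syntax, since the shebang is /bin/sh)
IsBinary()
{
result=0
for ext in $BINARY_EXT; do
# ${1##*.} strips everything up to the last dot, leaving only the extension
if [ "$ext" = "${1##*.}" ]; then
result=1
break
fi
done
return $result
}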
# make temp paths
tmp=$(mktemp /tmp/git.update.XXXXXX)
log=$(mktemp /tmp/git.update.log.XXXXXX)
tree=$(mktemp /tmp/git.diff-tree.XXXXXX)
ret=0
git diff-tree -r "$oldrev" "$newrev" > $tree
#echo
#echo diff-tree:
#cat $tree
# read $tree using the file descriptors
exec 3<&0
exec 0<$tree
while read old_mode new_mode old_sha1 new_sha1 status name
do
# debug output
#echo "old_mode=$old_mode new_mode=$new_mode old_sha1=$old_sha1 new_sha1=$new_sha1 status=$status name=$name"
# skip lines showing parent commit
test -z "$new_sha1" && continue
# skip deletions
[ "$new_sha1" = "0000000000000000000000000000000000000000" ] && continue
# don't do a CRLF check for binary files (test the pushed path, not the temp file)
IsBinary "$name"
if [ $? -eq 1 ]; then
continue # skip binary files
fi
# check for CRLF: a carriage return at the end of a line means CRLF
# (printf produces the CR portably, so grep -P is not needed)
git cat-file blob "$new_sha1" > "$tmp"
if grep -q "$(printf '\r')\$" "$tmp"; then
echo "###################################################################################################"
echo "# '$name' contains CRLF! Dear Windows developer, please activate the GIT core.autocrlf feature,"
echo "# or change the line endings to LF before trying to push."
echo "# Use 'git config core.autocrlf true' to activate CRLF conversion."
echo "# OR use 'git reset HEAD~1' to undo your last commit and fix the line endings."
echo "###################################################################################################"
ret=1
fi
done
exec 0<&3
# remove the temporary files created above
rm -f "$tmp" "$log" "$tree"
# --- Finished
exit $ret
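
Since the hook only rejects, the fix has to happen in the developer's working copy. Here is a minimal sketch of what that could look like for a single file, assuming a POSIX shell with grep, sed and mktemp available. The helper names has_crlf and to_lf are made up for illustration; in practice, enabling core.autocrlf as the hook message suggests is the more usual route.

```shell
#!/bin/sh
# Hypothetical developer-side helpers; the names are illustrative only.
cr=$(printf '\r')

# succeeds when at least one line of the file ends in CR (i.e. CRLF)
has_crlf() {
    grep -q "${cr}\$" "$1"
}

# strips one trailing CR from every line, rewriting the file in place
to_lf() {
    tmpfile=$(mktemp) || return 1
    # write back with cat so the original file keeps its permissions
    sed "s/${cr}\$//" "$1" > "$tmpfile" && cat "$tmpfile" > "$1"
    status=$?
    rm -f "$tmpfile"
    return $status
}
```

A rejected file could then be handled with `has_crlf file && to_lf file`, followed by amending the commit and pushing again.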