Someone on our server ran sed -i 's/$var >> $var2/$var > $var2/' *
to change appends to overwrites in some bash scripts in a common directory. No big deal: it was tested first with grep, which returned the expected result that only his files would be touched.
He ran the command, and now 1200 of the 1400 files in the folder have a new modification date, yet as far as we can tell, only his small handful of files were actually changed.
- Why would sed 'touch' a file that it's not changing?
- Why would it only 'touch' a portion of the files and not all of them?
- Did it actually change something (maybe some trailing white space or something totally unexpected because of the $'s in the sed regex)?
When GNU sed successfully edits a file "in-place," its timestamp is updated. To understand why, let's review how an "in-place" edit is done:

- A temporary file is created to hold the output.
- sed processes the input file, sending output to the temporary file.
- If a backup file extension was specified, the input file is renamed to the backup file.
- Whether a backup is created or not, the temporary output is moved (rename) to the input file.

GNU sed does not track whether any changes were made to the file. Whatever is in the temporary output file is moved to the input file via rename.

There is a nice benefit to this procedure: POSIX requires that rename be atomic. Consequently, the input file is never in a mangled state: it is either the original file or the modified file and never part way in-between.

As a result of this procedure, any file that sed successfully processes will have its timestamp changed.

Example
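The effect of this procedure can also be checked from the shell without a tracer. The following is a minimal sketch, not taken from the original answer. It assumes GNU sed and GNU stat (for stat -c); the scratch directory, the file name, and the NO_SUCH_TEXT pattern are made up for the demonstration. Because the temporary file exists alongside the original until the rename, the file should end up with a different inode even though its contents are unchanged.

```shell
#!/bin/sh
# Sketch: a no-op `sed -i` still replaces the file with a new one.
# Assumes GNU sed and GNU stat; all names below are hypothetical.
set -eu

demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
demo_file="$demo_dir/inputfile"
printf 'line one\nline two\n' > "$demo_file"

# Record a checksum of the contents and the inode number before editing.
sum_before=$(cksum < "$demo_file")
inode_before=$(stat -c %i "$demo_file")

# The pattern matches nothing, so the contents cannot change.
sed -i 's/NO_SUCH_TEXT/replacement/' "$demo_file"

sum_after=$(cksum < "$demo_file")
inode_after=$(stat -c %i "$demo_file")

echo "checksum before: $sum_before"
echo "checksum after:  $sum_after"
echo "inode before:    $inode_before"
echo "inode after:     $inode_after"
```

Comparing inodes rather than modification times avoids a misleading result when both timestamps fall within the same clock tick of the filesystem.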
Let's consider this inputfile:

Now, under the supervision of strace, let's run sed -i on it in a way guaranteed to cause no changes:

The edited result looks like:
As you can see, sed opens the input file, inputfile, on file handle 4. It then creates a temporary file, ./sediWWqLI, on file handle 6, to hold the output. It reads from the input file and writes it unchanged to the output file. When this is done, a call to rename is made to overwrite inputfile, changing its timestamp.

GNU sed source code

The relevant source code is in the execute.c file of the sed directory of the source. From version 4.2.1:

ck_rename is a cover function for the stdio function rename. The source for ck_rename is in sed/utils.c.

As you can see, no flag is kept to determine whether the file actually changed or not. rename is called regardless.

Files whose timestamps were not updated
As for the 200 of the 1400 files whose timestamps did not change, that would mean that sed somehow failed on those files. One possibility would be a permissions issue.

sed -i and Symbolic Links

As noted by mklement0, applying sed -i to a symbolic link leads to a surprising result. sed -i does not update the file pointed to by the symbolic link. Instead, sed -i overwrites the symbolic link with a new regular file.

This is a result of the call that sed makes to the STDIO rename. As documented by man 2 rename, if the new path refers to a symbolic link, the link is overwritten.

mklement0 reports that this is also true of the (BSD) sed on Mac OS X 10.10.

I use the following workaround: look at each file separately, use grep to check whether the file contains the string, and only then run sed on it. Not very nice, but it works.
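A minimal sketch of that grep-then-sed workaround follows. It is not the poster's original code. It assumes GNU sed and GNU stat, and the scratch directory and the two script names are hypothetical. The dollar signs in the sed pattern are escaped so that they are unambiguously literal.

```shell
#!/bin/sh
# Sketch: run sed -i only on files that contain the text, so that
# files without a match are never rewritten and keep their timestamps.
# Assumes GNU sed and GNU stat; directory and file names are hypothetical.
set -eu

work_dir=$(mktemp -d)
trap 'rm -rf "$work_dir"' EXIT
printf '%s\n' 'echo data $var >> $var2' > "$work_dir/has_match.sh"
printf '%s\n' 'echo nothing to change'  > "$work_dir/no_match.sh"

# Remember the inode of the file that must stay untouched.
inode_nomatch_before=$(stat -c %i "$work_dir/no_match.sh")

needle='$var >> $var2'   # fixed string to look for, not a regex

for f in "$work_dir"/*; do
    [ -f "$f" ] || continue              # skip anything that is not a regular file
    # -q: no output, -F: treat needle as a fixed string, --: end of options
    if grep -qF -- "$needle" "$f"; then
        sed -i 's/\$var >> \$var2/$var > $var2/' "$f"
    fi
done

cat "$work_dir/has_match.sh"
```

With GNU tools, the same idea can be written as a pipeline by listing the matching files with grep -lZF and passing them to xargs -0 -r sed -i, which avoids starting one sed process per file.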