Remove first N lines of a file in place in unix co

Posted 2019-01-04 13:20

I'm trying to remove the first 37 lines from a very, very large file. I started trying sed and awk, but they seem to require copying the data to a new file. I'm looking for a "remove lines in place" method, that unlike sed -i is not making copies of any kind, but rather is just removing lines from the existing file.

Here's what I've done...

awk 'NR > 37' file.xml > 'f2.xml'
sed -i '1,37d' file.xml

Both of these seem to do a full copy. Is there any other simple CLI that can do this quickly without a full document traversal?

4 answers
地球回转人心会变
#2 · 2019-01-04 13:36

The copy will have to be created at some point - why not at the time of reading the "modified" file; streaming the altered copy instead of storing it?

What I'm thinking - create a named pipe "file2" that is the output of that same awk 'NR > 37' file.xml or whatever; then whoever reads file2 will not see the first 37 lines.

The drawback is that it will run awk each time the file is processed, so it's feasible only if it's read rarely.
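
A minimal sketch of the named-pipe idea, assuming a POSIX shell with mkfifo available. The function name serve_without_head is hypothetical; file.xml and file2 are the names used above. Each awk run feeds exactly one reader, so the function has to be called again before every read.

```shell
#!/bin/sh
# Hypothetical helper: stream "$1" minus its first $2 lines through the FIFO "$3".
serve_without_head() {
    src=$1 n=$2 fifo=$3
    [ -p "$fifo" ] || mkfifo "$fifo"
    # The background job blocks on opening the FIFO until a reader opens it,
    # then awk streams every line after line $n. Nothing is stored on disk.
    awk -v n="$n" 'NR > n' "$src" > "$fifo" &
}

# Usage (one awk run per reader):
#   serve_without_head file.xml 37 file2
#   cat file2
```

The original file is never modified; the reader just sees a view of it without the leading lines.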

趁早两清
#3 · 2019-01-04 13:40

There's no simple way to do in-place editing using UNIX utilities, but here's one in-place file modification solution that you might be able to adapt (courtesy of Robert Bonomi at https://groups.google.com/forum/#!topic/comp.unix.shell/5PRRZIP0v64):

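# number of bytes occupied by the first 37 lines
count=$(head -n 37 "$file" | wc -c)
# conv=notrunc is required: without it dd truncates "$file" to zero length when opening it for output
dd if="$file" bs="$count" skip=1 conv=notrunc of="$file"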

The final file should be $count bytes smaller than the original (since the goal was to remove $count bytes from the beginning), so to finish we must remove the final $count bytes. On a GNU system such as Linux this can be accomplished by:

truncate -s "-$count" "$file"

See the Google Groups thread I referenced for other suggestions and info.
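
Putting the two steps together, a hedged sketch might look like the following. It assumes GNU coreutils (for truncate) and that nothing else reads or writes the file while it runs; the function name strip_head_inplace is hypothetical. An interrupted run leaves the file in an inconsistent state, so try it on a disposable file first.

```shell
#!/bin/sh
# Hypothetical helper: drop the first $2 lines of "$1" without writing a second copy.
# Assumes GNU coreutils (truncate) and exclusive access to the file.
strip_head_inplace() {
    file=$1 n=$2
    # Bytes occupied by the first $n lines.
    count=$(head -n "$n" "$file" | wc -c)
    [ "$count" -gt 0 ] || return 0    # nothing to remove
    # Shift the remaining data $count bytes toward the start of the file.
    # conv=notrunc stops dd from emptying the file when it opens it for output.
    dd if="$file" of="$file" bs="$count" skip=1 conv=notrunc 2>/dev/null || return 1
    # The last $count bytes are now a stale duplicate of the tail; cut them off.
    truncate -s "-$count" "$file"
}

# Usage:
#   strip_head_inplace file.xml 37
```

This still reads and rewrites the whole file once, so it saves disk space rather than time. Because the dd block size equals the byte count of the removed lines, a short header means many small reads and writes.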

成全新的幸福
#4 · 2019-01-04 13:42

ed is the standard editor:

ed -s file <<< $'1,37d\nwq'
在下西门庆
#5 · 2019-01-04 13:52

Unix file semantics do not allow truncating the front part of a file.

All solutions will be based on either:

  1. Reading the file into memory and then writing it back (ed, ex, other editors). This should be fine if your file is <1GB or if you have plenty of RAM.
  2. Writing a second copy and optionally replacing the original (sed -i, awk/tail > foo). This is fine as long as you have enough free disk space for a copy and don't mind the wait.

If the file is too large for any of these to work for you, you may be able to work around it depending on what's reading your file.

Perhaps your reader skips comments or blank lines? If so, you can then craft a message the reader ignores, make sure it has the same number of bytes as the 37 first lines in your file, and overwrite the start of the file with dd if=yourdata of=file conv=notrunc.
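
As a rough illustration of that workaround, the sketch below overwrites the first lines with spaces followed by a single newline, so the file keeps its size and only the leading bytes are touched. It assumes the reader ignores a whitespace-only first line, which may not hold for your format (a comment of the same byte length is the alternative). The function name blank_head_inplace is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: overwrite the first $2 lines of "$1" with padding of the
# same byte length, leaving the rest of the file untouched.
# Assumes the reader skips a leading line that contains only spaces.
blank_head_inplace() {
    file=$1 n=$2
    # Bytes occupied by the first $n lines.
    count=$(head -n "$n" "$file" | wc -c)
    [ "$count" -gt 0 ] || return 0    # nothing to blank out
    # Emit exactly $count bytes: (count - 1) spaces and one newline,
    # then write them over the start of the file without truncating it.
    { printf '%*s' "$((count - 1))" ''; printf '\n'; } |
        dd of="$file" conv=notrunc 2>/dev/null
}

# Usage:
#   blank_head_inplace file.xml 37
```

Only the bytes of the removed lines are written, so the cost does not depend on the size of the file.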
