how to delete the duplicate lines in file except t

In the following configuration file

/etc/fine-tune.conf

we have duplicate lines such as

clean_history_in_os=true

We want to delete all lines that contain clean_history_in_os=true except the first matching line in the file.

What I have tried so far:

  sed  -i '/clean_history_in_os=true/d' /etc/fine-tune.conf

But the problem is that sed deletes all of the "clean_history_in_os=true" lines.

I would be happy to get ideas on how to solve this.

Tags: bash shell perl sed

2 Answers

等我变得足够好

#2 · 2019-03-01 14:31

You can use this awk to delete all matching lines except the first one:

awk '!(/clean_history_in_os=true/ && n++)' file

To save the file in place you can use this GNU awk command:

awk -i inplace '!(/clean_history_in_os=true/ && n++)' file

Otherwise, use a temporary file:

awk '!(/clean_history_in_os=true/ && n++)' file > $$.tmp && mv $$.tmp file
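To see why the awk condition keeps only the first match, here is a quick check on a throwaway file. The sample lines and the temp file are made up for illustration; they are not the real /etc/fine-tune.conf.

```shell
# Hypothetical sample config written to a temp file
tmp=$(mktemp)
printf '%s\n' 'a=1' 'clean_history_in_os=true' 'b=2' \
    'clean_history_in_os=true' 'c=3' 'clean_history_in_os=true' > "$tmp"

# On the first match n++ evaluates to 0 (false), so the line is printed.
# On later matches n++ is non-zero, the && is true, and ! suppresses the line.
result=$(awk '!(/clean_history_in_os=true/ && n++)' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

Lines that do not match the pattern never reach n++, so they always pass through unchanged.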

Here is one GNU sed solution to do the same (the 0,/regex/ address is a GNU extension):

sed -i -e '0,/clean_history_in_os=true/b' -e '/clean_history_in_os=true/d' file

走好不送

#3 · 2019-03-01 14:39

With Perl

perl -i -ne'next if /clean_history_in_os=true/ && ++$ok > 1; print' file

This increments the counter on each matching line; once the counter exceeds 1 the line is skipped, otherwise it is printed.

The question came up of how to pass the pattern to Perl if we have it as a shell variable. Below I assume that the shell variable $VAR contains the string clean_history...

In all of this a shell variable is directly used as a pattern in a regex. If it's the literal string from the question then the code below goes as given. However, if there may be special characters they should be escaped; so you may want to precede the pattern with \Q when used in regex. As a general note, one should take care to not use input from the shell to run code (say under /e).

Pass it as an argument, which is then available in @ARGV
```
perl -i -ne'
    BEGIN { $qr=shift; };
    next if /$qr/ && ++$ok > 1; print
' "$VAR" file
```
where the BEGIN block runs in the BEGIN phase, before runtime (so only once, not on each iteration). In it, shift removes the first element from @ARGV, which in the above invocation is the value of $VAR, first interpolated by the shell. The filename file then remains in @ARGV, so it is available for processing under -n (the file is opened and its lines are iterated over).

Use the -s switch, which enables command-line switches for the program
```
perl -i -s -ne'next if /$qr/ && ++$ok > 1; print' -- -qr="$VAR" file
```
The -- (after the one-line program under '') marks the start of arguments for the program; then -qr introduces a variable $qr into the program, with a value assigned to it as above (with just -qr the variable $qr gets value 1, so is a flag).

Any such options must come before possible filenames, and they are removed from @ARGV so the program can then normally process the submitted files.

Export the bash variable, making it an environment variable which can then be accessed in the Perl program via the %ENV hash
```
export VAR="clean_history..."
perl -i -ne'next if /$ENV{VAR}/ && ++$ok > 1; print' file
```
But I would recommend either of the first two options over this one.

A refinement of the question given in a comment specifies that if the phrase clean_... starts with a # then that line should be skipped altogether. It's simplest to separately test for that

next if /#$qr/; next if /$qr/ && ++$ok > 1; print

or, relying on short-circuiting

next if /#$qr/ || (/$qr/ && ++$ok > 1); print

The first version is less error prone and probably clearer.
