Use SED to delete certain lines using an index wit

I have a big file, call it file.txt, which may have 20000 lines or more. Some of those lines have to be removed, and a new file containing the remaining lines, say newfile.txt, has to be created. The numbers of the lines to delete are listed in another file, index.txt. So what I have is something like:

file.txt:

line1
line2
...
line19999
line20000

index.txt

I've been trying to use sed, trying to get it to use the numbers in the index to delete those lines, with something like:

for i in ${index.txt[@]}
do
    sed -i.back '${i}d' file.txt>newfile.txt
done

However, I get the error ${index.txt[@]}: bad substitution, and I have no idea how to fix it.

I've also tried to use gawk, but something was wrong with my code. I think it had to do with the fact that the file is indented with tabs. If anyone could help, I'd greatly appreciate it.

Tags: linux bash awk sed grep

3 answers

戒情不戒烟

Reply #2 · 2020-03-27 04:22

Here is a solution that does not modify your index.txt and will output the results into newfile.txt:

# Replace each newline in index.txt with "d;".
# After this, linenumbers will contain "11d;56d;79d;..."
linenumbers=$(tr '\n' ';' < index.txt | sed 's/;/d;/g')

#write file.txt with specified line numbers removed to newfile.txt
sed -e "$linenumbers" file.txt > newfile.txt
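
To make the intermediate script visible, here is a small sketch of the same approach on made-up data. The temporary directory and the five sample lines are assumptions for illustration only. Note that the approach relies on index.txt ending with a newline; without one, the last number would not get its trailing "d".

```shell
# Throwaway directory so no real files are touched (illustrative setup).
workdir=$(mktemp -d)

# Five sample lines, and an index asking to drop lines 2 and 4.
printf 'line%s\n' 1 2 3 4 5 > "$workdir/file.txt"
printf '%s\n' 2 4 > "$workdir/index.txt"

# Same transformation as above: "2\n4\n" becomes "2d;4d;".
linenumbers=$(tr '\n' ';' < "$workdir/index.txt" | sed 's/;/d;/g')
echo "$linenumbers"

# One sed pass deletes every listed line.
sed -e "$linenumbers" "$workdir/file.txt" > "$workdir/newfile.txt"
cat "$workdir/newfile.txt"
```

With this sample input the script printed is 2d;4d; and newfile.txt keeps line1, line3 and line5.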


叛逆

Reply #3 · 2020-03-27 04:27

The following awk command may help here:

awk 'FNR==NR{a[$0];next} !(FNR in a)' index.txt file1.txt

This assumes that index.txt holds the line numbers to delete from file1.txt. Append > temp_file && mv temp_file file1.txt if you want to save the output back into the input file (file1.txt) itself.
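
As a rough illustration of how the FNR==NR idiom works: while awk reads the first file (index.txt), NR and FNR are equal, so each line is stored as a key of the array a; for the second file, only lines whose number (FNR) is not a key get printed. The sample data and the temporary directory below are made up for the sketch.

```shell
# Throwaway directory with made-up sample data (illustrative setup).
workdir=$(mktemp -d)
printf 'line%s\n' 1 2 3 4 5 > "$workdir/file1.txt"
printf '%s\n' 1 5 > "$workdir/index.txt"

# First file: remember each listed number as an array key.
# Second file: print only the lines whose number (FNR) is not a key.
kept=$(awk 'FNR==NR{a[$0];next} !(FNR in a)' "$workdir/index.txt" "$workdir/file1.txt")
echo "$kept"
```

Here lines 1 and 5 are dropped and line2, line3 and line4 remain. If the index lines might carry stray spaces or tabs, using a[$1] instead of a[$0] is one possible tweak, since $1 ignores surrounding whitespace under the default field separator.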


smile是对你的礼貌

Reply #4 · 2020-03-27 04:45

Do not call sed in a loop; that will be very slow.

You could transform the index file into a sed script, then call sed once on the data file:

sed -i.bak "$(sed 's/$/d/' index.txt)" file.txt

Or, as @Hazzard17 points out, ignore lines that don't contain just digits:

script=$(sed -n '/^[[:blank:]]*[[:digit:]]\+[[:blank:]]*$/ s/$/d/p' index.txt)
sed -i.bak "$script" file.txt
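
A small sketch of why the filter matters, on made-up data: a blank line in the index turns into a bare "d" command, which deletes every line. The temporary directory and sample contents are assumptions for illustration, and the pattern below spells the GNU-specific \+ as [[:digit:]][[:digit:]]* so the sketch does not depend on that extension.

```shell
# Throwaway directory with made-up sample data (illustrative setup).
workdir=$(mktemp -d)
printf 'line%s\n' 1 2 3 4 5 > "$workdir/file.txt"
# Index with a stray blank line between the two numbers.
printf '2\n\n4\n' > "$workdir/index.txt"

# Unfiltered: the blank line becomes a bare "d", which deletes every line.
naive=$(sed 's/$/d/' "$workdir/index.txt")

# Filtered: only digit-only lines are turned into delete commands.
script=$(sed -n '/^[[:blank:]]*[[:digit:]][[:digit:]]*[[:blank:]]*$/ s/$/d/p' "$workdir/index.txt")
echo "$script"

# Write to a new file here instead of editing in place.
sed "$script" "$workdir/file.txt" > "$workdir/newfile.txt"
cat "$workdir/newfile.txt"
```

The filtered script contains only 2d and 4d, so line1, line3 and line5 survive, whereas running sed with "$naive" on the same file would print nothing.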

A demo:

$ seq 20000 | sed 's/^/line/' > file.txt
$ wc file.txt
 20000  20000 188894 file.txt
$ seq 20000 | while read n; do [[ $RANDOM -le 5000 ]] && echo $n; done > index.txt
$ wc index.txt
 3083  3083 16789 index.txt
$ sed -i.bak "$(sed 's/$/d/' index.txt)" file.txt
$ wc -l file.txt{,.bak}
 16917 file.txt
 20000 file.txt.bak
 36917 total

To read a file into an array, you can do:

mapfile -t indices < index.txt
for i in "${indices[@]}"; do ...; done

or just iterate over the file:

while IFS= read -r i; do ...; done < index.txt
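
For completeness, here is one way the read loop could be filled in while still calling sed only once: the loop merely collects the indices into a single script. This is a sketch on made-up data, not the only way to do it; the temporary directory and the variable name script are illustrative. Building the script first also sidesteps the fact that deleting lines one at a time in place would shift the numbering of the lines that follow.

```shell
# Throwaway directory with made-up sample data (illustrative setup).
workdir=$(mktemp -d)
printf 'line%s\n' 1 2 3 4 5 > "$workdir/file.txt"
printf '%s\n' 2 3 > "$workdir/index.txt"

# Collect every index into one sed script such as "2d;3d;".
script=""
while IFS= read -r i; do
    script="${script}${i}d;"
done < "$workdir/index.txt"
echo "$script"

# A single sed pass over the data file, numbering based on the original lines.
sed "$script" "$workdir/file.txt" > "$workdir/newfile.txt"
cat "$workdir/newfile.txt"
```

With this input the collected script is 2d;3d; and newfile.txt contains line1, line4 and line5.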

