Use SED to delete certain lines using an index wit

Posted 2020-03-27 04:01


I have a big file, call it file.txt, which may have 20000 lines or more. Some of those lines have to be removed from the original file, and a new file containing the remaining lines has to be created, say newfile.txt. The numbers of the lines to be deleted are in another file, say index.txt. So what I have is something like:





I've been trying to use sed, trying to get it to use the numbers in the index to delete those lines, with something like:

for i in ${index.txt[@]}
    sed -i.back '${i}d' file.txt>newfile.txt

However, I get an error saying ${index.txt[@]}: bad substitution, and I have no idea how to fix this.

I've also tried to use gawk, but there was something wrong with the code; I think it had to do with the fact that the file is indented with tabs. If anyone could help, I'd greatly appreciate it.


Do not call sed in a loop: every call rereads the whole file, so that will be very slow.

You could transform the index file into a sed script, then call sed once on the data file:

sed -i.bak "$(sed 's/$/d/' index.txt)" file.txt

Or, as @Hazzard17 points out, ignore lines that don't contain just digits:

script=$(sed -n '/^[[:blank:]]*[[:digit:]]\+[[:blank:]]*$/ s/$/d/p' index.txt)
sed -i.bak "$script" file.txt

A demo:

$ seq 20000 | sed 's/^/line/' > file.txt
$ wc file.txt
 20000  20000 188894 file.txt
$ seq 20000 | while read n; do [[ $RANDOM -le 5000 ]] && echo $n; done > index.txt
$ wc index.txt
 3083  3083 16789 index.txt
$ sed -i.bak "$(sed 's/$/d/' index.txt)" file.txt
$ wc -l file.txt{,.bak}
 16917 file.txt
 20000 file.txt.bak
 36917 total
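
As a variant of the approach above, the generated script can be written to a file and passed with sed's -f option. This is only a sketch: the sample data and the name delete.sed are made up for illustration. It may be worth considering when index.txt is large enough that the script no longer fits comfortably in a single command-line argument, and it writes newfile.txt instead of editing file.txt in place.

```shell
# Work in a scratch directory so no existing files are overwritten
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data, for illustration only
printf '%s\n' a b c d e > file.txt
printf '%s\n' 2 4 > index.txt

# Turn every line number N in index.txt into the sed command "Nd"
# (delete.sed is an assumed name for the generated script)
sed 's/$/d/' index.txt > delete.sed

# Run all deletions in a single pass; file.txt itself is not modified
sed -f delete.sed file.txt > newfile.txt
```

With this sample data, newfile.txt should contain the lines a, c and e.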

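As for the bad substitution error: ${index.txt[@]} tries to expand an array variable, but index.txt is a filename, not a valid variable name. To read a file into an array, you can do: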

mapfile -t indices < index.txt
for i in "${indices[@]}"; do ...; done

Or just iterate over the file:

while IFS= read -r i; do ...; done < index.txt
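
The read loop can still be combined with the advice to run sed only once: use the loop to build the script, then hand it to a single sed call. A minimal sketch, assuming GNU sed (which accepts commands separated by ";") and made-up sample data:

```shell
# Work in a scratch directory so no existing files are overwritten
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data; index.txt deliberately contains junk lines
printf '%s\n' a b c d e > file.txt
printf '%s\n' 2 '' x 4 > index.txt

script=
while IFS= read -r i; do
    case $i in
        # Skip empty entries and anything containing a non-digit
        # (this also skips numbers with surrounding whitespace)
        ''|*[!0-9]*) continue ;;
    esac
    # Append the sed command "Nd;" for line number N
    script="${script}${i}d;"
done < index.txt

# One sed invocation applies every deletion
sed "$script" file.txt > newfile.txt
```

The loop only does string handling, so the cost of starting sed is paid once rather than once per index entry.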


The following awk command may help here.

awk 'FNR==NR{a[$0];next} !(FNR in a)' index.txt file1.txt

This assumes that index.txt holds the line numbers that need to be deleted from file1.txt. Append > temp_file && mv temp_file file1.txt if you want to save the output back into the input file (file1.txt) itself.
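
Since the question mentions tabs, here is a hedged variant of the same awk idea that keys on the first field instead of the whole line, so leading or trailing blanks in index.txt do not prevent a match. The array name del and the sample data are illustrative assumptions, not part of the original answer.

```shell
# Work in a scratch directory so no existing files are overwritten
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data; the index entries carry a tab and a trailing space
printf '%s\n' a b c d e > file1.txt
printf '\t2\n4 \n' > index.txt

# While reading the first file (NR == FNR), record each numeric first field.
# While reading the second file, print only lines whose number was not recorded.
awk 'NR == FNR { if ($1 ~ /^[0-9]+$/) del[$1 + 0]; next }
     !(FNR in del)' index.txt file1.txt > newfile.txt
```

With this sample data, newfile.txt should contain a, c and e. Note that the NR == FNR idiom misbehaves if index.txt is completely empty, because the condition then stays true for file1.txt as well and nothing is printed.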


Here is a solution that does not modify your index.txt and will output the results into newfile.txt:

# Replace each newline in index.txt with "d;"
# After this, linenumbers will contain "11d;56d;79d;..."
linenumbers=$(tr '\n' ';' < index.txt | sed 's/;/d;/g')

# Write file.txt, with the listed line numbers removed, to newfile.txt
sed -e "$linenumbers" file.txt > newfile.txt