Combining columns of multiple files in one file

Posted 2019-07-23 17:24

I have several hundred text files that contain a lot of information. Each file has 3 columns (the first two are the same in all the files). I need to merge the third column of every file into a new file, and add a column header with the name of the file each column comes from.

The txt files have three columns, like this:

-118.33333333333279 40.041666666667908 11.409999847412109
-118.29166666666612 40.041666666667908 11.090000152587891
-118.24999999999946 40.041666666667908 10.920000076293945
-118.20833333333279 40.041666666667908 10.949999809265137

The txt file I am trying to create should look like this:

Name_of_file_1 Name_of_file_2 Name_of_file_3
3rd_Column_File_1 3rd_Column_File_2 3rd_Column_File_3
3rd_Column_File_1 3rd_Column_File_2 3rd_Column_File_3
3rd_Column_File_1 3rd_Column_File_2 3rd_Column_File_3
3rd_Column_File_1 3rd_Column_File_2 3rd_Column_File_3

Is this possible? I can't find a way to do so. Please help!!!

Pepo

2 answers
Animai°情兽
#2 · 2019-07-23 17:35

I would use unix tools for this:

mkfifo pipe1
mkfifo pipe2
mkfifo pipe3

cut -d " " -f 3 text1.csv > pipe1 &
cut -d " " -f 3 text2.csv > pipe2 &
cut -d " " -f 3 text3.csv > pipe3 &

paste pipe1 pipe2 pipe3 > final.csv

rm pipe1 pipe2 pipe3

The tools used are mkfifo, cut, and paste.

You can use the above code example to develop your own shell script.

祖国的老花朵
#3 · 2019-07-23 17:50

This is one way to do it, with comments in-line in the code:

import csv

# List of your input files
file_names = ['file1', 'file2']

# One list per file: the file name first (the header),
# followed by the values from the third column.
o_data = []

for afile in file_names:
    a_list = [afile]
    # newline='' is what the csv module expects for files it reads
    with open(afile, newline='') as file_h:
        csv_reader = csv.reader(file_h, delimiter=' ')
        for row in csv_reader:
            if row:  # skip blank lines
                a_list.append(row[2])
    o_data.append(a_list)

# zip(*o_data) turns the per-file lists into output rows.
# Note that zip stops at the shortest list, so all files
# should have the same number of lines.
with open('output', 'w', newline='') as op_file:
    csv_writer = csv.writer(op_file, delimiter=' ')
    for row in zip(*o_data):
        csv_writer.writerow(row)
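
Since the question mentions several hundred files, listing them by hand gets tedious. Below is a rough sketch of the same idea wrapped in a function that accepts any list of paths. The function name merge_third_columns and its arguments are made up for illustration, and it assumes the columns are separated by whitespace and that every file has the same number of rows (it raises an error otherwise, rather than silently dropping lines).

```python
from pathlib import Path


def merge_third_columns(input_paths, output_path):
    """Write the third column of each input file side by side.

    The first output row holds the file names (without extension).
    Hypothetical helper, assuming whitespace-separated columns.
    """
    columns = []
    for path in input_paths:
        path = Path(path)
        column = [path.stem]  # header: file name without its extension
        with open(path) as handle:
            for line in handle:
                fields = line.split()  # split on any run of whitespace
                if fields:  # skip blank lines
                    column.append(fields[2])
        columns.append(column)

    # zip() would quietly truncate to the shortest file, so check first.
    lengths = {len(column) for column in columns}
    if len(lengths) > 1:
        raise ValueError("input files have different numbers of rows")

    with open(output_path, "w") as out:
        for row in zip(*columns):
            out.write(" ".join(row) + "\n")
```

For example, if the inputs sat in a (hypothetical) data directory, calling merge_third_columns(sorted(Path("data").glob("*.txt")), "merged.txt") would pick up every .txt file there in alphabetical order. Keep the output file outside the input pattern so that it is not read back in on a second run.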