Replace commas except those in quotation marks

Date,Time,Ref,Sen,ATN,Flow,PCB temp,Status,Battery,BC
2015/04/23,12:30:00,779581,908043,"-15,254",49,31,0,100,
2015/04/23,12:35:00,778715,907084,"-15,259",49,31,0,100,-127
2015/04/23,12:40:00,778299,906419,"-15,239",49,32,0,100,461
(...)

Hi, I have an ASCII file like the one above in which I am trying to replace the commas with semicolons. This is the code I am using:

filein = open('Prueba1.txt')
fileout = open('Fin.txt', 'wt')
for line in filein:
    if line.startswith('20'):
        fileout.write( line.replace(',', ';') )
filein.close()
fileout.close()

The problem is that I now want to keep the commas in the 5th column and also remove the quotation marks. Any ideas?

Tags: python replace readfile

3 answers

混吃等死

#2 · 2019-08-13 15:26

Have you considered saving this as a CSV file?

After saving it as a CSV file:

import pandas as pd
v = pd.read_csv("check_delete.csv")

output:

         Date      Time     Ref     Sen      ATN  Flow  PCB temp  Status  Battery      BC
0  2015/04/23  12:30:00  779581  908043  -15,254    49        31       0      100     NaN
1  2015/04/23  12:35:00  778715  907084  -15,259    49        31       0      100  -127.0
2  2015/04/23  12:40:00  778299  906419  -15,239    49        32       0      100   461.0

You can read this as a dataframe.

干净又极端

#3 · 2019-08-13 15:33

Here is an approach that uses the split() method, in case you do not want to go through CSV files.

Let's consider the line to be:

line = '2015/04/23,12:35:00,778715,907084,"-15,259",49,31,0,100,-127'

First, split the line into three parts: the part before the quotes, the quoted part, and the part after the quotes. This can be done with line.split('"').

This will give us a list consisting of these three parts:

alist = ['2015/04/23,12:35:00,778715,907084,', '-15,259', ',49,31,0,100,-127']

Now split the first and last elements of the list on the comma, combine all the elements into a single semicolon-separated string, and write this string to your new file.

Like this:

# Drop the comma that borders the quoted field on each side, then split.
left_part = alist[0][:-1].split(',')
right_part = alist[2][1:].split(',')
middle_part = alist[1]
final_list = left_part + [middle_part] + right_part
# Joining keeps genuinely empty fields and avoids a trailing separator.
new_line = ';'.join(final_list)

The final result should look like this:

'2015/04/23;12:35:00;778715;907084;-15,259;49;31;0;100;-127'
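The same split-on-quotes idea can be written more generally. After line.split('"'), the pieces at even positions lie outside the quotes and the pieces at odd positions lie inside them, so only the even ones need their commas replaced. The following is a minimal sketch; the function name replace_unquoted_commas is made up for illustration, and it assumes that fields never contain escaped quotes (such as "" inside a quoted field).

```python
def replace_unquoted_commas(line, new_sep=';'):
    """Replace commas outside double quotes and drop the quotes themselves.

    Hypothetical helper: assumes quotes are balanced and never escaped.
    """
    pieces = line.split('"')
    # Even-indexed pieces are outside the quotes, odd-indexed ones inside.
    for i in range(0, len(pieces), 2):
        pieces[i] = pieces[i].replace(',', new_sep)
    # Joining without the quote character removes the quotation marks.
    return ''.join(pieces)


line = '2015/04/23,12:35:00,778715,907084,"-15,259",49,31,0,100,-127'
new_line = replace_unquoted_commas(line)
```

Because nothing is filtered out, an empty last field (as in the first data row of the question) is preserved as a trailing semicolon rather than silently dropped.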


虎瘦雄心在

#4 · 2019-08-13 15:38

A solution using the csv module from the Python standard library:

import csv

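with open('example.csv', newline='') as inputfile, \
     open('parsedcsv.csv', 'w', newline='') as outputfile:

    # The reader strips the double quotes and keeps "-15,254" as one field.
    datareader = csv.reader(inputfile, delimiter=',', quotechar='"')

    # The writer only quotes a field if it contains ';', "'" or a line break.
    csvwriter = csv.writer(outputfile, delimiter=';',
            quotechar="'", quoting=csv.QUOTE_MINIMAL)

    csvwriter.writerows(datareader)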

Quote from the documentation about the QUOTE_MINIMAL option:

csv.QUOTE_MINIMAL: Instructs writer objects to only quote those fields which contain special characters such as delimiter, quotechar or any of the characters in lineterminator.

The output file is:

Date;Time;Ref;Sen;ATN;Flow;PCB temp;Status;Battery;BC
2015/04/23;12:30:00;779581;908043;-15,254;49;31;0;100;
2015/04/23;12:35:00;778715;907084;-15,259;49;31;0;100;-127
2015/04/23;12:40:00;778299;906419;-15,239;49;32;0;100;461
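To try this without touching any files, the same reader/writer pair can be run on in-memory text. This is only a sketch: the convert helper and the use of io.StringIO are assumptions added for illustration, and lineterminator='\n' is set so the result is easy to compare (the writer's default is '\r\n').

```python
import csv
import io


def convert(text):
    """Re-delimit comma-separated text with semicolons, entirely in memory."""
    reader = csv.reader(io.StringIO(text), delimiter=',', quotechar='"')
    out = io.StringIO()
    writer = csv.writer(out, delimiter=';', quotechar="'",
                        quoting=csv.QUOTE_MINIMAL, lineterminator='\n')
    writer.writerows(reader)
    return out.getvalue()


# First data row from the question: the ATN comma survives, the quotes go away.
converted = convert('2015/04/23,12:30:00,779581,908043,"-15,254",49,31,0,100,\n')

# A field that contains the new delimiter is the case QUOTE_MINIMAL protects.
quoted = convert('a,"b;c"\n')
```

With QUOTE_MINIMAL, the ATN value is written unquoted because a comma is no longer special once the delimiter is ';', while a field that itself contains a semicolon is wrapped in the single-quote quotechar.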
