Question:
In Python, I have a file in which the words are separated by |, for example: city|state|zipcode. My file reader is unable to separate the words. Also, I want my file reader to start on line 2 rather than line 1. How do I get my file reader to separate the words?
import os
import sys

def file_reader(path, num_fields, seperator = ',', header = False):
    try:
        fp = open(path, "r", encoding="utf-8")
    except FileNotFoundError:
        raise FileNotFoundError("Unable to open file.")
    else:
        with fp:
            for n, line in enumerate(fp, 1):
                fields = line.rstrip('/n').split(seperator)
                if len(fields) != num_fields:
                    raise ValueError("Unable to read file.")
                elif n == 1 and header:
                    continue
                else:
                    yield tuple([f.strip() for f in fields])
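Answer 1:
If you read the lines into a list first (for example with fp.readlines()), slicing with [1:] selects the sub-list that starts after the first element, which in the case of a file means you get every line except the first. Note that [1:-1] would also drop the last line, and that the file object itself cannot be sliced.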
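To make the slicing behaviour concrete, here is a minimal sketch on a made-up list of lines (the sample data is an assumption, not taken from the asker's file). It shows that [1:] drops only the first element, while [1:-1] drops the first and the last.

```python
# Hypothetical sample data standing in for the lines of a file.
lines = ["city|state|zipcode", "Hoboken|NJ|07030", "Austin|TX|78701"]

without_header = lines[1:]       # drops only the first line
without_both_ends = lines[1:-1]  # drops the first AND the last line

print(without_header)
print(without_both_ends)
```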
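Answer 2:
If you need to read from the second line, you can change your code from: for n, line in enumerate(fp, 1)
to: for n, line in enumerate(fp.readlines()[1:], 2)
A file object cannot be sliced directly, so the lines are first read into a list with readlines(); starting the count at 2 keeps n equal to the actual line number.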
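A file object is an iterator rather than a list, so slice syntax does not work on it. One alternative, not part of the answer itself, is itertools.islice, which skips the first line without loading the whole file into memory. In this sketch io.StringIO stands in for an open file, and the contents are invented for illustration.

```python
import io
from itertools import islice

# io.StringIO plays the role of an open file; the contents are made up.
fp = io.StringIO("city|state|zipcode\nHoboken|NJ|07030\nAustin|TX|78701\n")

rows = []
# islice(fp, 1, None) lazily skips the first line;
# starting enumerate at 2 keeps n equal to the real line number.
for n, line in enumerate(islice(fp, 1, None), 2):
    rows.append((n, line.rstrip("\n").split("|")))

print(rows)
```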
Answer 3:
If you want an ultra shoddy option to skip the first value: make a boolean initialised to True, and add an if statement at the start of your for loop that tests whether it is still True. Inside this if statement, set the value to False and then continue.
Something like:
b = True
for k, v in enumerator:
    if b:
        b = False
        continue
    # Some code
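For comparison, here is a runnable version of the flag approach next to a shorter variant that calls next() once before the loop. The next() variant is a swapped-in alternative, not what this answer proposes, and the sample rows are invented.

```python
# Made-up sample rows; the first one plays the role of the header.
rows = ["city|state|zipcode", "Hoboken|NJ|07030", "Austin|TX|78701"]

# Flag approach, as described in the answer.
seen_with_flag = []
first = True
for k, v in enumerate(rows):
    if first:
        first = False
        continue
    seen_with_flag.append(v)

# Same effect with next(): consume one item before looping.
it = enumerate(rows)
next(it, None)  # the default None avoids StopIteration on empty input
seen_with_next = [v for k, v in it]

print(seen_with_flag == seen_with_next)
```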
Answer 4:
To achieve what you want, the function's logic does not need to change; what matters is calling it with the correct arguments instead of relying on the defaults.
From the code, the default behavior is to use , as the separator and not to skip the first line of the file. To actually split on | and skip the first line (i.e. a header), set seperator='|' and header=True when calling it.
# Function works as-is for this purpose. The only change is rstrip('\n')
# in place of rstrip('/n'), which was a typo for the newline character.
def file_reader(path, num_fields, seperator = ',', header = False):
    try:
        fp = open(path, "r", encoding="utf-8")
    except FileNotFoundError:
        raise FileNotFoundError("Unable to open file.")
    else:
        with fp:
            for n, line in enumerate(fp, 1):
                fields = line.rstrip('\n').split(seperator)
                if len(fields) != num_fields:
                    raise ValueError("Unable to read file.")
                elif n == 1 and header:
                    continue
                else:
                    yield tuple([f.strip() for f in fields])

# Example file afile.txt contains these lines:
# alfa|beta|gamma|delta
# 1|2|3|4
# a|b|c|d

# here we call the function:
filename = 'afile.txt'
for x in file_reader(filename, 4, '|', True):  # note the separator and header
    print(x)
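Why the default call fails can be seen on a single line. In this small sketch, using the sample line from the question, splitting on ',' leaves the whole line as one field, so the len(fields) != num_fields check in file_reader is true and the function raises ValueError.

```python
line = "city|state|zipcode\n"

# With the default separator ',' the line is not split at all.
fields_default = line.rstrip("\n").split(",")
# With '|' the three fields come apart.
fields_pipe = line.rstrip("\n").split("|")

print(len(fields_default), fields_default)
print(len(fields_pipe), fields_pipe)
```

The arguments can also be passed by keyword, e.g. file_reader(filename, 4, seperator='|', header=True), which makes the call easier to read. Note that the keyword has to match the spelling used in the function signature (seperator).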
Answer 5:
We will divide the work into three steps: reading the file, storing each line of the file in a list, and separating each line into fields.
Reading the File
In Python you can open a file for reading with the built-in open function:
fp = open("file.txt", 'r')
Reading Each Line Separately
To read the file as lines, use the readlines method:
lines = fp.readlines()
This returns the content of the file as a list in which each element is one line. Note that readline (singular) returns only the next line, and its optional argument limits the number of characters read rather than selecting a line number. To get a specific line, index the list instead, e.g. lines[4] for the fifth line.
--> For more info, check reading files in Python
Separating the Content
To separate the strings by '|', use the split method:
for item in lines:
    res = item.split('|')
    # do what you want with res
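Put together, the three steps might look like the following sketch. It writes a small made-up sample file to a temporary directory first so that it runs on its own. The file contents, the with statement, and the rstrip call are additions for illustration, not part of the answer.

```python
import os
import tempfile

# Write a small made-up sample file so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(path, "w", encoding="utf-8") as out:
    out.write("city|state|zipcode\nHoboken|NJ|07030\nAustin|TX|78701\n")

# Steps 1 and 2: open the file and read all lines into a list.
# The with statement closes the file automatically.
with open(path, "r", encoding="utf-8") as fp:
    lines = fp.readlines()

# Step 3: skip the header (lines[0]) and split the rest on '|'.
# rstrip("\n") removes the trailing newline from the last field.
records = [item.rstrip("\n").split("|") for item in lines[1:]]
print(records)
```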
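Answer 6:
If you don't mind using an existing library, you can use pandas. You can skip the first row using skiprows=1 and change the separator using sep='|'. By default pandas treats the first row it reads as column names, so header=None is passed here to keep the first data row as data. If you would rather keep the header as column names, drop both skiprows=1 and header=None.
# load pandas
import pandas as pd
# path to your pipe-separated file (placeholder name)
file = 'file.psv'
# read file as pandas dataframe
dataframe = pd.read_csv(file, skiprows=1, sep='|', header=None)
print(dataframe)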
To install pandas
pip install pandas
Pandas documentation for read_csv
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
Another option is to use the csv module's reader to read your PSV file:
import csv
with open('file.psv', newline='') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter='|')
    next(csv_reader, None)  # read once to skip the header
    for row in csv_reader:
        print(row)
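The csv approach can also be wrapped in a generator that mirrors the original file_reader, including the field-count check. This is only a sketch: read_psv is a hypothetical helper name, it takes an already open stream rather than a path, and io.StringIO with invented contents stands in for the file.

```python
import csv
import io

def read_psv(stream, num_fields, header=True):
    """Yield each row of a pipe-separated stream as a tuple (hypothetical helper)."""
    reader = csv.reader(stream, delimiter="|")
    if header:
        next(reader, None)  # skip the header row if there is one
    for row in reader:
        if len(row) != num_fields:
            raise ValueError(
                f"line {reader.line_num}: expected {num_fields} fields, got {len(row)}"
            )
        yield tuple(field.strip() for field in row)

# io.StringIO plays the role of an open file; the contents are made up.
sample = io.StringIO("city|state|zipcode\nHoboken|NJ|07030\nAustin|TX|78701\n")
rows = list(read_psv(sample, 3))
print(rows)
```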