Question:
I have a file containing some lines of code followed by a string pattern. I need to write everything before the line containing the string pattern to file one and everything after that line to file two:
e.g. (file-content)
- codeline 1
- codeline 2
- string pattern
- codeline 3
The output should be file one with codeline 1, codeline 2 and file two with codeline 3.
I am familiar with writing files, but unfortunately I do not know how to determine the content before and after the string pattern.
Answer 1:
If the input file fits into memory, the easiest solution is to use str.partition():
with open("inputfile") as f:
    contents1, sentinel, contents2 = f.read().partition("Sentinel text\n")
with open("outputfile1", "w") as f:
    f.write(contents1)
with open("outputfile2", "w") as f:
    f.write(contents2)
This assumes that you know the exact text of the line separating the two parts.
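One detail worth checking with this approach is the case where the separator line is missing: str.partition() then returns the whole input as the first element, followed by two empty strings. A minimal sketch of that check, where split_text is a hypothetical helper name and the sample text is taken from the question:

```python
def split_text(text, separator_line):
    """Split text around the first occurrence of separator_line.

    split_text is a hypothetical helper for illustration only.
    Returns (before, after, found).
    """
    before, sep, after = text.partition(separator_line)
    # partition() returns (text, "", "") when the separator is absent,
    # so an empty `sep` means nothing was found.
    return before, after, bool(sep)


sample = "codeline 1\ncodeline 2\nstring pattern\ncodeline 3\n"
before, after, found = split_text(sample, "string pattern\n")
print(repr(before))  # 'codeline 1\ncodeline 2\n'
print(repr(after))   # 'codeline 3\n'
print(found)         # True
```

With found available, the caller can decide whether to write the second file at all when the separator never appears.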
Answer 2:
This approach is similar to Lev's but uses itertools because it's fun.
import itertools

dont_break = lambda l: l.strip() != 'string_pattern'
with open('input') as source:
    with open('out_1', 'w') as out1:
        out1.writelines(itertools.takewhile(dont_break, source))
    with open('out_2', 'w') as out2:
        out2.writelines(source)
You could replace the dont_break function with a regular expression or anything else if necessary.
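As a rough illustration of swapping in a regular expression, the sketch below uses a compiled pattern as the takewhile predicate. The pattern itself, the name SEPARATOR and the helper split_lines are assumptions made up for this example; the lines are collected into lists rather than written to files so the result is easy to inspect.

```python
import io
import itertools
import re

# Hypothetical pattern: a line that consists only of "string pattern",
# ignoring case and surrounding whitespace.
SEPARATOR = re.compile(r"\s*string\s+pattern\s*", re.IGNORECASE)


def split_lines(source):
    """Return (before, after) lists of lines around the first separator line."""
    source = iter(source)  # make sure both steps share one iterator

    def not_separator(line):
        return SEPARATOR.fullmatch(line) is None

    before = list(itertools.takewhile(not_separator, source))
    # takewhile has already consumed the separator line, so the iterator
    # now yields only the lines that follow it.
    after = list(source)
    return before, after


sample = io.StringIO("codeline 1\ncodeline 2\nString Pattern\ncodeline 3\n")
before, after = split_lines(sample)
print(before)  # ['codeline 1\n', 'codeline 2\n']
print(after)   # ['codeline 3\n']
```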
Answer 3:
with open('data.txt') as inf, open('out1.txt', 'w') as of1, open('out2.txt', 'w') as of2:
    outf = of1
    for line in inf:
        if 'string pattern' in line:
            outf = of2
            continue  # prevent output of the line with "string pattern"
        outf.write(line)
This works with large files because it processes the input line by line. It assumes that "string pattern" occurs only once in the input file. I like the str.partition() approach best if the whole file fits into memory (which may not be a problem).
Using with ensures that the files are closed automatically when you are done or when an exception is raised.
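If the pattern can occur more than once, one possible variation is to switch outputs only at the first match and copy later matches through unchanged. The sketch below assumes that is the desired behavior; split_stream is a hypothetical helper, and io.StringIO objects stand in for real files.

```python
import io


def split_stream(lines, pattern, out_before, out_after):
    """Copy lines to out_before until the first line containing pattern,
    then copy every later line to out_after. The first matching line is
    dropped; later matching lines are kept."""
    target = out_before
    for line in lines:
        if target is out_before and pattern in line:
            target = out_after  # switch once, at the first match only
            continue            # skip the separator line itself
        target.write(line)


src = io.StringIO("a\nstring pattern\nb\nstring pattern\nc\n")
first, second = io.StringIO(), io.StringIO()
split_stream(src, "string pattern", first, second)
print(repr(first.getvalue()))   # 'a\n'
print(repr(second.getvalue()))  # 'b\nstring pattern\nc\n'
```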
Answer 4:
A more efficient answer that handles large files and uses a limited amount of memory:
inp = open('inputfile')
out = open('outfile1', 'w')
for line in inp:
    if line == "Sentinel text\n":
        out.close()
        out = open('outfile2', 'w')
    else:
        out.write(line)
out.close()
inp.close()
Answer 5:
A naive example (that doesn't load the file into memory like Sven's):
with open('file', 'r') as r:
    with open('file1', 'w') as f:
        for line in r:
            if line == 'string pattern\n':
                break
            f.write(line)
    with open('file2', 'w') as f:
        for line in r:
            f.write(line)
This assumes that 'string pattern' occurs once in the input file.
If the pattern isn't a fixed string, you can use the re module.
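Answer 6:
No more than three lines:
with open('infile') as fp, open('of1', 'w') as of1, open('of2', 'w') as of2:
    of1.writelines(iter(fp.readline, sentinel))
    of2.writelines(fp)
Here sentinel is the complete separator line, including its trailing newline (for example 'string pattern\n').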
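The two-argument form iter(callable, sentinel) calls fp.readline() repeatedly until it returns a value equal to the sentinel. At end of file readline() returns an empty string, so if the separator line never appears the iteration does not terminate. A minimal sketch of a guarded variant, where lines_until is a hypothetical helper and io.StringIO stands in for a real file:

```python
import io


def lines_until(fp, sentinel):
    """Yield lines from fp up to, but not including, the first line equal
    to sentinel. Also stops cleanly at end of file."""
    # iter(fp.readline, "") keeps calling readline() until it returns "",
    # which is what readline() gives back at end of file.
    for line in iter(fp.readline, ""):
        if line == sentinel:
            return
        yield line


fp = io.StringIO("codeline 1\ncodeline 2\nstring pattern\ncodeline 3\n")
first = list(lines_until(fp, "string pattern\n"))
rest = fp.readlines()  # the file position is now just past the separator
print(first)  # ['codeline 1\n', 'codeline 2\n']
print(rest)   # ['codeline 3\n']
```

This costs a few more lines than the original, in exchange for not hanging on input that lacks the separator.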
Answer 7:
You need something like:
def test_pattern(x):
    if x.startswith('abc'):  # replace this with some exact test
        return True
    return False

found = False
out = open('outfile1', 'w')
for line in open('inputfile'):
    if not found and test_pattern(line):
        found = True
        out.close()
        out = open('outfile2', 'w')
    out.write(line)
out.close()
Replace the startswith line with a test that works on your pattern (using pattern matching from re if necessary, but anything that finds the divider line will do). As written, the divider line itself ends up at the top of outfile2.