Question:

I have a text file that contains both \n and \r\n end-of-line markers. I want to split only on \r\n, but can't figure out a way to do this with Python's readlines method. Is there a simple workaround for this?

Answer 1:

As @eskaev mentions, you'll usually want to avoid reading the complete file into memory if not necessary.

io.open() allows you to specify a newline keyword argument, so you can still iterate over lines and have them split only at the specified newlines:

import io

# newline='\r\n' ends a line only at \r\n; a lone \n stays inside the line.
with io.open('in.txt', newline='\r\n') as f:
    for line in f:
        print(repr(line))

Output (from Python 2; Python 3 prints the same strings without the u prefix):

u'this\nis\nsome\r\n'
u'text\nwith\nnewlines.'
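
For comparison, here is a minimal Python 3 sketch of how the three relevant newline settings behave on the same data. The temporary file and the helper name read_lines are illustrative assumptions, not part of the original answer.

```python
import os
import tempfile

def read_lines(path, newline):
    """Return the lines of path as produced by the given newline setting."""
    with open(path, 'r', newline=newline) as f:
        return list(f)

# Write the sample data in binary mode so no newline translation happens.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b'this\nis\nsome\r\ntext\nwith\nnewlines.')

translated = read_lines(path, None)    # universal newlines, all translated to \n
untranslated = read_lines(path, '')    # split at \n, \r and \r\n, endings kept
crlf_only = read_lines(path, '\r\n')   # split only at \r\n, endings kept
os.remove(path)

print(translated)
print(untranslated)
print(crlf_only)
```

Only the last setting keeps the lone \n characters inside the lines, which is what the question asks for.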

Answer 2:

Avoid reading it in text mode. Python reads text files with universal newline support by default. This means that every line ending (\n, \r or \r\n) is translated to \n:

>>> with open('out', 'wb') as f:
...     f.write(b'a\nb\r\nc\r\nd\ne\r\nf')
... 
14
>>> with open('out', 'r') as f: f.readlines()
... 
['a\n', 'b\n', 'c\n', 'd\n', 'e\n', 'f']

Note that using U doesn't change the result¹ (the U flag is deprecated in Python 3 and is no longer accepted as of Python 3.11):

>>> with open('out', 'rU') as f: f.readlines()
... 
['a\n', 'b\n', 'c\n', 'd\n', 'e\n', 'f']

However, you can always read the file in binary mode and split on \r\n:

>>> with open('out', 'rb') as f: f.read().split(b'\r\n')
... 
[b'a\nb', b'c', b'd\ne', b'f']

(Example in Python 3. You can decode the bytes into unicode either before or after the split.)
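
As a small sketch of the "before or after" point: for UTF-8 data both orders give the same result, because the bytes \r and \n never occur inside a multi-byte UTF-8 sequence. The sample bytes below are made up for illustration. For an encoding such as UTF-16 this does not hold, so there it seems safer to decode first.

```python
# Sample UTF-8 bytes (an assumption for illustration), with one \r\n separator.
data = b'caf\xc3\xa9\nau lait\r\nth\xc3\xa9'

# Decode first, then split on the text separator.
decode_then_split = data.decode('utf-8').split('\r\n')

# Split on the byte separator first, then decode each piece.
split_then_decode = [piece.decode('utf-8') for piece in data.split(b'\r\n')]

print(decode_then_split)
print(split_then_decode)

# In UTF-16 the separator is encoded as b'\r\x00\n\x00' (little-endian),
# so splitting the raw bytes on b'\r\n' finds nothing.
utf16_pieces = 'a\r\nb'.encode('utf-16-le').split(b'\r\n')
print(len(utf16_pieces))
```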

You can avoid reading the whole file into memory by reading it in blocks instead. However, it becomes a bit more complex to handle the lines correctly (you have to manually check where the last line started and concatenate it to the following block).
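
A minimal sketch of that block-wise approach, assuming a binary stream and a hypothetical helper named iter_crlf_lines (neither the name nor the block_size default comes from the original answer). The unfinished tail of each block is held back and prepended to the next one, which also covers a \r\n pair that straddles two blocks.

```python
import io

def iter_crlf_lines(stream, block_size=8192):
    """Yield byte strings from a binary stream, split only at CRLF."""
    pending = b''
    while True:
        block = stream.read(block_size)
        if not block:
            break
        pending += block
        pieces = pending.split(b'\r\n')
        # The last piece may be an incomplete line (or end in a lone \r whose
        # \n arrives with the next block), so hold it back for the next round.
        pending = pieces.pop()
        for piece in pieces:
            yield piece
    # Whatever is left after the final block is the last line, if non-empty.
    if pending:
        yield pending

# Same data as in the answer; a tiny block size forces \r\n to straddle blocks.
sample = io.BytesIO(b'a\nb\r\nc\r\nd\ne\r\nf')
result = list(iter_crlf_lines(sample, block_size=4))
print(result)
```

Unlike a plain split, this sketch does not yield a trailing empty piece when the data ends with \r\n. For a real file, pass a handle opened with open(path, 'rb').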

¹ I believe this is because universal newline support is enabled by default in all normal installations. You would have to explicitly disable it when configuring the installation, and then the r and rU modes would behave differently (the former would split lines only on the OS-native line ending, while the latter would produce the result shown above).

Answer 3:

Instead of using readlines, just use read and then split.

For example:

with open('/path/to/file', 'r', newline='') as f:  # newline='' (Python 3) keeps \r\n untranslated
    fileContents = f.read()  # read entire file
    filePieces = fileContents.split('\r\n')  # split only at \r\n

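Answer 4:

This approach reads the whole file and then iterates over the chunks delimited by your separator. Note that read() loads everything into memory, so this is not a lazy generator.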

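# newline='' (Python 3) keeps \r\n untranslated so it can be used as mySep
with open(myFile, newline='') as ifs:
    for chunk in ifs.read().split(mySep):
        print(repr(chunk))  # do something with the chunk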
