Python - How to make sure that a line being read f

Posted 2019-09-10 09:55

To make sure I start and stop reading a text file exactly where I want to, I put tag pairs such as 'start1'<->'end1' and 'start2'<->'end2' in the text file and pass the pair I need to my Python script. In the script I read the file like this:

start_end = ['start1', 'end1']
line_num = []
header = []

# Record the 1-based line numbers of every line containing either tag.
with open(file_path) as fp1:
    for num, line in enumerate(fp1, 1):
        for i in start_end:
            if i in line:
                line_num.append(num)
print('\nLine number: %s' % line_num)

# Collect the lines strictly between the two tags (k is 0-based).
with open(file_path) as fp2:
    for k, line2 in enumerate(fp2):
        if line_num[0] <= k < line_num[1] - 1:
            header.append(line2)

This works well until I reach start10 <-> end10 and beyond. For example, checking whether "start2" is in the line also matches lines containing "start21", and the same happens with the end tag, so passing "start1, end1" as input also picks up "start10, end10". If I replace the line:

if i in line:

with

if i == line:

it throws an error.

How can I make sure that the script reads the line that contains ONLY "start1" and not "start10"?

Tags: python file
5 Answers
Bombasti
Reply #2 · 2019-09-10 10:37

Since your markers are always at the end of the line, change:

start_end = ['start1','end1']

to:

start_end = ['start1\n','end1\n']
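
A variation on the same idea, as a minimal sketch: strip each line and compare it for equality with the tag, which also covers a last line that has no trailing newline. This assumes every tag sits alone on its own line; `find_exact_tag_lines` and the `sample` data are hypothetical names made up for illustration.

```python
def find_exact_tag_lines(lines, tags):
    """Return 1-based numbers of lines that consist solely of one of the tags."""
    wanted = set(tags)
    # strip() drops the trailing newline (and any surrounding whitespace)
    return [num for num, line in enumerate(lines, 1) if line.strip() in wanted]


# Stand-in for the lines of the real file; the last line has no newline.
sample = ["intro\n", "start1\n", "body\n", "end1\n", "start10\n", "more\n", "end10"]
print(find_exact_tag_lines(sample, ["start1", "end1"]))  # prints [2, 4]
```

In a real script the `sample` list would be replaced by the open file object, since iterating over a file yields its lines.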
Anthone
Reply #3 · 2019-09-10 10:49

You probably want to look into regular expressions. Python's re module lets you define a pattern to compare each line against, and a pattern can anchor to the start and end of a line.
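
One way this might look, as a rough sketch rather than a definitive solution: wrap the tag in lookarounds so it only matches when no letter, digit or underscore touches it on either side. The helper names `tag_pattern` and `find_tag_lines_re` and the `sample` data are assumptions made up for this example.

```python
import re


def tag_pattern(tag):
    """Compile a pattern that matches `tag` only as a whole token."""
    # (?<!\w) and (?!\w) reject a word character right before or after the tag,
    # so 'start1' is not found inside 'start10'.
    return re.compile(r"(?<!\w)" + re.escape(tag) + r"(?!\w)")


def find_tag_lines_re(lines, tags):
    """Return 1-based numbers of lines containing any of the tags as a whole token."""
    patterns = [tag_pattern(t) for t in tags]
    return [
        num
        for num, line in enumerate(lines, 1)
        if any(p.search(line) for p in patterns)
    ]


sample = ["start1\n", "text\n", "end1\n", "start10\n", "text\n", "end10\n"]
print(find_tag_lines_re(sample, ["start1", "end1"]))  # prints [1, 3]
```

Unlike an equality check, this still finds a tag that shares its line with other text, as long as the tag is separated from it by whitespace or punctuation.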

迷人小祖宗
Reply #4 · 2019-09-10 10:50

You can do this with find():

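for num, line in enumerate(fp1, 1):
    for i in start_end:
        pos = line.find(i)
        if pos != -1:
            # make sure the tag isn't followed by another digit ('start10', 'start11', ...);
            # slicing avoids an IndexError when the tag ends the line
            nxt = line[pos + len(i):pos + len(i) + 1]
            if not nxt.isdigit():
                line_num.append(num)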
老娘就宠你
Reply #5 · 2019-09-10 10:50
import re
prog = re.compile(r'start1$')
if prog.match(line):
    print(line)

prog.match(line) returns None if there is no match and a match object if the line matches the compiled regex. The '$' at the end of the regex anchors the end of the line (it also matches just before a trailing newline), so 'start1' matches but 'start10' doesn't. Note that match() only matches at the start of the string, so the tag has to begin the line.

Or, wrapped in a function:

import re

def test(line):
    prog = re.compile(r'start1$')
    return prog.match(line) is not None

>>> test('start1')
True
>>> test('start10')
False
forever°为你锁心
Reply #6 · 2019-09-10 10:54

If you can control the input file, consider adding an underscore (or any other non-digit character) to the end of each tag:

'start1_'<->'end1_'

'start10_'<->'end10_'

The regular expression solution presented in other answers is more elegant, but requires using regular expressions.
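
To illustrate why the terminator helps, here is a small sketch that keeps the plain substring test from the question. The helper `lines_containing` and the `sample` data are hypothetical; the point is only that 'start1_' is no longer a substring of 'start10_'.

```python
def lines_containing(lines, tag):
    """Return 1-based numbers of lines in which `tag` occurs as a substring."""
    return [num for num, line in enumerate(lines, 1) if tag in line]


# Stand-in for a file whose tags all end with an underscore.
sample = ["start1_\n", "alpha\n", "end1_\n", "start10_\n", "beta\n", "end10_\n"]
print(lines_containing(sample, "start1_"))   # prints [1]
print(lines_containing(sample, "start10_"))  # prints [4]
```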
