Parsing Text file and segregating the data in a Di

I have a fairly complex problem parsing a text file.

What I need:

1. Read through a text file.
2. If a line matches a specific condition, create a key named after it (condition1).
3. Copy the lines that follow into a list, and associate this list with that key (condition1).
4. When the condition is encountered again, create a new key and repeat step 3 with the lines that follow, until the end of the file.

Problem: I am having trouble appending new items to the list for a given key.

Sample Text Input file:

A1 letters characters jgjgjg
A2 letters numbers fgdhdhd
D1 letters numbers haksjshs
condition1, dhdjfjf
K2 letters characters jgjgjg
J1 alphas numbers fgdhdhd
L1 letters numbers haksjshs
condition2, dhdjfjf
J1 alphas numbers fgdhdhd
D1 letters numbers haksjshs
J1 alphas numbers fgdhdhd
D1 letters numbers haksjshs

Expected Dictionary:

dictone = {'condition1':['K2 letters characters jgjgjg','J1 alphas numbers fgdhdhd','L1 letters numbers haksjshs'], 'condition2':['J1 alphas numbers fgdhdhd','D1 letters numbers haksjshs','J1 alphas numbers fgdhdhd','D1 letters numbers haksjshs'..........}

Here is what I have done so far:

flagInitial = False  # flag to start copying after encountering a condition

with open(inputFilePath, "r") as tfile:
    for item in tfile:
        gcmatch = gcpattern.match(item)

        if gcmatch:
            extr = re.split(' ', item)
            laynum = extr[2]

            newKey = item[2:7] + laynum[:-1]
            flagInitial = True
            gcdict[newKey] = item
            continue

        if flagInitial:
            gcdict[newKey].append(item)  # stuck here
            # print(gcdict[newKey])
            # print(newKey)

Am I missing some syntax, or is something else wrong?

Tags: python list parsing dictionary append

2 answers

Explosion°爆炸

Reply #2 · 2019-08-31 01:53

Try this (here mystring holds the full contents of the text file):

In [46]: from collections import defaultdict

In [47]: d = defaultdict(list)

In [48]: cond = None
    ...: for i in mystring.splitlines():
    ...:     if 'condition' in i.split()[0]:
    ...:         cond = i.split()[0][:-1]
    ...:     elif cond:
    ...:         d[cond].append(i)


In [49]: d
Out[49]: 
defaultdict(list,
            {'condition1': ['K2 letters characters jgjgjg',
              'J1 alphas numbers fgdhdhd',
              'L1 letters numbers haksjshs'],
             'condition2': ['J1 alphas numbers fgdhdhd',
              'D1 letters numbers haksjshs',
              'J1 alphas numbers fgdhdhd',
              'D1 letters numbers haksjshs']})


地球回转人心会变

Reply #3 · 2019-08-31 01:55

Using a compiled regex's search method and a collections.defaultdict object:

import re
import collections

with open('input.txt', 'rt') as f:
    pat = re.compile(r'^condition\d+')
    d = collections.defaultdict(list)
    curr_key = None

    for line in f:               
        m = pat.search(line)
        if m:
            curr_key = m.group()
            continue
        if curr_key:
            d[curr_key].append(line.strip())         

print(dict(d))

The output:

{'condition1': ['K2 letters characters jgjgjg', 'J1 alphas numbers fgdhdhd', 'L1 letters numbers haksjshs'], 'condition2': ['J1 alphas numbers fgdhdhd', 'D1 letters numbers haksjshs', 'J1 alphas numbers fgdhdhd', 'D1 letters numbers haksjshs']}
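
A note on why the loop in the question fails, offered as a minimal sketch rather than the asker's actual code: `gcdict[newKey] = item` stores the matched line itself, which is a `str`, so the later `gcdict[newKey].append(item)` raises `AttributeError` because strings have no `append` method. With a plain `dict`, storing an empty list under the new key should be enough. In the sketch below, the pattern `condition\d+`, the helper name `group_lines`, and the use of the matched text as the key are assumptions made for illustration; the question's own `gcpattern` and key construction (`item[2:7] + laynum[:-1]`) depend on a file format that is not shown.

```python
import re

# Assumed pattern for a "condition" line; the question's real gcpattern is not shown.
gcpattern = re.compile(r"condition\d+")


def group_lines(lines):
    """Group the lines that follow each condition line under that condition's key."""
    gcdict = {}
    new_key = None  # stays None until the first condition line is seen
    for item in lines:
        item = item.rstrip("\n")
        gcmatch = gcpattern.match(item)
        if gcmatch:
            new_key = gcmatch.group()
            gcdict[new_key] = []  # an empty list, not the line itself
            continue
        if new_key is not None:
            gcdict[new_key].append(item)  # works because the value is a list
    return gcdict


# The sample input from the question, used here in place of a file.
sample = """A1 letters characters jgjgjg
A2 letters numbers fgdhdhd
D1 letters numbers haksjshs
condition1, dhdjfjf
K2 letters characters jgjgjg
J1 alphas numbers fgdhdhd
L1 letters numbers haksjshs
condition2, dhdjfjf
J1 alphas numbers fgdhdhd
D1 letters numbers haksjshs
J1 alphas numbers fgdhdhd
D1 letters numbers haksjshs
"""

result = group_lines(sample.splitlines())
print(result)
```

Because a file object yields lines when iterated, the same function can be called as `group_lines(tfile)` inside the question's `with open(...)` block. Lines that appear before the first condition are skipped, which matches the expected dictionary in the question.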
