I am new to Python, and I want to extract the data from this format:
FBpp0143497 5 151 5 157 PF00339.22 Arrestin_N Domain 1 135 149 83.4 1.1e-23 1 CL0135
FBpp0143497 183 323 183 324 PF02752.15 Arrestin_C Domain 1 137 138 58.5 6e-16 1 CL0135
FBpp0131987 60 280 51 280 PF00089.19 Trypsin Domain 14 219 219 127.7 3.7e-37 1 CL0124
to this format
FBpp0143497
5 151 Arrestin_N 1.1e-23
FBpp0143497
183 323 Arrestin_C 6e-16
I wrote the code below hoping it would work, but it does not. Please help!
file = open('/ddfs/user/data/k/ktrip_01/hmm.txt','r')
rec = file.read()
for line in rec :
    field = line.split("\t")
    print field
    print field[:]
    print '>',field[0]
    print field[1], field[2], field[6], field[12]
The hmm.txt file is:
FBpp0143497 5 151 5 157 PF00339.22 Arrestin_N Domain 1 135 149 83.4 1.1e-23 1 CL0135
FBpp0143497 183 323 183 324 PF02752.15 Arrestin_C Domain 1 137 138 58.5 6e-16 1 CL0135
FBpp0131987 60 280 51 280 PF00089.19 Trypsin Domain 14 219 219 127.7 3.7e-37 1 CL0124
To iterate over a file line by line, loop over the file object itself (for line in file:) rather than over the result of file.read().
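A minimal sketch of that loop applied to the data in the question. The function name summarize_hits and the temporary demo file are made up for illustration, the sample row is copied from the question, and the fields are assumed to be tab-separated. It is written for Python 3 (print as a function), whereas the question uses Python 2 print statements.

```python
import os
import tempfile


def summarize_hits(path):
    """Return (protein, start, end, domain, e-value) tuples, one per line."""
    hits = []
    with open(path) as handle:
        # Looping over the file object yields one line at a time.
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 13:
                continue  # skip blank or truncated lines
            hits.append((fields[0], fields[1], fields[2], fields[6], fields[12]))
    return hits


# Demo on a throwaway file holding the first row from the question,
# re-joined with tabs (assumption: the real file is tab-separated).
row = ("FBpp0143497 5 151 5 157 PF00339.22 Arrestin_N Domain "
       "1 135 149 83.4 1.1e-23 1 CL0135")
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("\t".join(row.split()) + "\n")
demo_path = tmp.name

hits = summarize_hits(demo_path)
os.remove(demo_path)

for protein, start, end, domain, evalue in hits:
    print(protein)
    print(start, end, domain, evalue)
```

For the sample row this prints FBpp0143497 on one line and 5 151 Arrestin_N 1.1e-23 on the next, which matches the layout asked for in the question.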
The line rec = file.read() reads your whole file into rec as a single string, line breaks and all, so the for loop then walks over it one character at a time instead of one line at a time. You probably want rec = file.readlines() instead, which returns a list of lines.

This is just one way to read lines from a file in Python. It's not always the best way, because it loads all the lines of the file into memory. If your input file contains, say, three million lines, it might be better to read and process each line one at a time.
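To make the difference concrete, here is a small comparison of the three approaches. It uses io.StringIO as a stand-in for an open file so it runs without the real hmm.txt. The shortened two-line sample is an assumption for illustration, not the actual file contents.

```python
import io

# Stand-in for the file contents (shortened, tab-separated rows).
sample = "FBpp0143497\t5\t151\nFBpp0131987\t60\t280\n"

# read() returns ONE string; looping over a string visits single characters.
chars = list(io.StringIO(sample).read())

# readlines() returns a list of lines, all held in memory at once.
# Each line keeps its trailing newline.
lines = io.StringIO(sample).readlines()

# Looping over the file object yields lines lazily, one at a time,
# which is gentler on memory for very large files.
lazy = [line.rstrip("\n") for line in io.StringIO(sample)]

print(chars[:4])
print(len(lines))
print(lazy[0].split("\t"))
```

The first print shows why the original loop misbehaves: each "line" is really a single character, so splitting it on tabs never produces the expected fields.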
Use the csv module to parse your tab-separated fields.
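One way this could look, as a sketch rather than the original answer's exact code: csv.reader with delimiter="\t" splits each row into a list of strings. The helper name iter_hits is hypothetical, the rows are the ones from the question re-joined with tabs, and io.StringIO stands in for the open file. Python 3 syntax is assumed.

```python
import csv
import io

# Rows copied from the question, re-joined with tabs
# (assumption: the real file is tab-separated).
ROWS = [
    "FBpp0143497 5 151 5 157 PF00339.22 Arrestin_N Domain 1 135 149 83.4 1.1e-23 1 CL0135",
    "FBpp0143497 183 323 183 324 PF02752.15 Arrestin_C Domain 1 137 138 58.5 6e-16 1 CL0135",
    "FBpp0131987 60 280 51 280 PF00089.19 Trypsin Domain 14 219 219 127.7 3.7e-37 1 CL0124",
]
SAMPLE = "\n".join("\t".join(row.split()) for row in ROWS) + "\n"


def iter_hits(handle):
    """Yield (protein, start, end, domain, e-value) for each non-empty row."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # csv.reader returns [] for blank lines
        yield row[0], row[1], row[2], row[6], row[12]


for protein, start, end, domain, evalue in iter_hits(io.StringIO(SAMPLE)):
    print(protein)
    print(start, end, domain, evalue)
```

With a real file, the handle would come from open(path, newline="") instead of io.StringIO, as the csv documentation recommends.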