Parsing and structuring of a text file

Posted 2019-09-04 09:42

I need help with Ruby. I have a text file with the following contents:

Head 1
a 10
b 14
c 15
d 16
e 17
f 88
Head 4
r 32
t 55
s 79
r 22
t 88
y 53
o 78
p 90
m 44
Head 53
y 22
b 33
Head 33
z 11
d 66
v 88
b 69
Head 32
n 88
m 89
b 88

I want to parse this file and extract the following data:

Head 1, f 88
Head 4, t 88
Head 33, v 88
Head 32, n 88
Head 32, b 88

Please tell me how I can write this in Ruby.

I think I first have to put all the lines into an array:

lines = Array.new
File.open('C:/file/file.txt', 'r').each { |line| lines << line }

but what should I do next?

Thanks!

Tags: ruby parsing
2 answers
Juvenile、少年°
#2 · 2019-09-04 10:03

I have written your data to the file 'temp'.

First, define a regular expression that extracts the lines of interest from the file.

r = /
    Head\s+\d+        # match 'Head', one or more spaces, one or more digits
    |                 # or
    [[:lower:]]+\s+88 # match one or more lowercase letters, one or more spaces, '88'
    /xm               # free-spacing mode (x); multiline mode (m) is harmless but not needed here
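
As a quick check of what the pattern picks up, here is a minimal sketch run against a made-up excerpt in the same shape as the question's file. The names PATTERN and interesting_lines are hypothetical, introduced only for this illustration, and the m flag is dropped because the pattern contains no '.'.

```ruby
# Same pattern as in the answer, without the unneeded m flag.
PATTERN = /
  Head\s+\d+         # 'Head', one or more spaces, one or more digits
  |                  # or
  [[:lower:]]+\s+88  # one or more lowercase letters, whitespace, '88'
/x

# Hypothetical helper: returns the substrings of text that the pattern matches.
def interesting_lines(text)
  text.scan(PATTERN)
end

# Made-up excerpt in the same shape as the question's file.
excerpt = "Head 1\na 10\nf 88\nHead 53\ny 22\n"
p interesting_lines(excerpt)
# prints ["Head 1", "f 88", "Head 53"]
```

Because the pattern has no capture groups, String#scan returns a flat array of matched strings rather than an array of arrays.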

Now perform the following operations on the file.

File.read('temp').scan(r).
                  slice_before { |line| line.start_with?('Head ') }.
                  reject { |a| a.size == 1 }.
                  flat_map { |head, *rest| [head].product(rest) }.
                  map { |a| "%s, %s" % a }
  #=> ["Head 1, f 88", "Head 4, t 88", "Head 33, v 88",
  #    "Head 32, n 88", "Head 32, b 88"]

The steps are as follows.

a = File.read('temp').scan(r)
  #=> ["Head 1", "f 88", "Head 4", "t 88", "Head 53", "Head 33",
  #    "v 88", "Head 32", "n 88", "b 88"]
b = a.slice_before { |line| line.start_with?('Head') }
  #=> #<Enumerator: #<Enumerator::Generator:0x007ffd218387b0>:each> 

We can see the elements that will be generated by the enumerator b by converting it to an array.

b.to_a
  #=> [["Head 1", "f 88"], ["Head 4", "t 88"], ["Head 53"],
  #    ["Head 33", "v 88"], ["Head 32", "n 88", "b 88"]]

Now remove all arrays of size 1 from b.

c = b.reject { |a| a.size == 1 }
  #=> [["Head 1", "f 88"], ["Head 4", "t 88"], ["Head 33", "v 88"],
  #    ["Head 32", "n 88", "b 88"]]

Next we use Enumerable#flat_map and Array#product to associate each "Head" with every following line (before the next "Head" or the end of the file) that ends with 88.

d = c.flat_map { |head, *rest| [head].product(rest) }
  #=> [["Head 1", "f 88"], ["Head 4", "t 88"], ["Head 33", "v 88"],
  #    ["Head 32", "n 88"], ["Head 32", "b 88"]]
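
To see what this step does in isolation, here is a small sketch on a hand-written array. The helper name pair_heads is hypothetical. It also shows that a group holding only a "Head" contributes nothing, since Array#product with an empty array returns an empty array, so the flat_map step would drop such groups even without the earlier reject.

```ruby
# Hypothetical helper: pairs the first element of each group with every remaining element.
def pair_heads(groups)
  groups.flat_map { |head, *rest| [head].product(rest) }
end

# Hand-written groups; the middle one has a head but no matching lines.
groups = [["Head 1", "f 88"], ["Head 53"], ["Head 32", "n 88", "b 88"]]
p pair_heads(groups)
# prints [["Head 1", "f 88"], ["Head 32", "n 88"], ["Head 32", "b 88"]]
```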

Lastly, convert each element of d to a string.

d.map { |a| "%s, %s" % a }
  #=> ["Head 1, f 88", "Head 4, t 88", "Head 33, v 88",
  #    "Head 32, n 88", "Head 32, b 88"] 
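
For anyone who wants to run the whole thing without creating 'temp' by hand, the following sketch writes the question's data to a temporary file and applies the same chain. INPUT, HEAD_OR_88 and head_pairs are names made up for this example, and the regex is written on one line instead of in free-spacing form.

```ruby
require 'tempfile'

# The question's data, embedded so the snippet is self-contained.
INPUT = <<~TEXT
  Head 1
  a 10
  b 14
  c 15
  d 16
  e 17
  f 88
  Head 4
  r 32
  t 55
  s 79
  r 22
  t 88
  y 53
  o 78
  p 90
  m 44
  Head 53
  y 22
  b 33
  Head 33
  z 11
  d 66
  v 88
  b 69
  Head 32
  n 88
  m 89
  b 88
TEXT

HEAD_OR_88 = /Head\s+\d+|[[:lower:]]+\s+88/

# Hypothetical helper wrapping the chain from this answer.
def head_pairs(path)
  File.read(path).scan(HEAD_OR_88)
      .slice_before { |s| s.start_with?('Head ') }
      .reject { |group| group.size == 1 }
      .flat_map { |head, *rest| [head].product(rest) }
      .map { |pair| "%s, %s" % pair }
end

# Tempfile.create removes the file when the block ends and returns the block's value.
result = Tempfile.create('temp') do |f|
  f.write(INPUT)
  f.flush
  head_pairs(f.path)
end
puts result
```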
够拽才男人
#3 · 2019-09-04 10:10

If the answer to @mudasobwa's question "Do you want to grab everything having 88 value?" is yes, this is a solution:

# Read all lines and strip line breaks; readlines also closes the file.
lines = File.readlines("file.txt", chomp: true)

current_head = ""
res = []

lines.each do |line|
  case line
  when /\AHead \d+\z/ # a header line such as "Head 32"
    current_head = line
  when /\A\w 88\z/    # a single word character, a space, then exactly 88
    res << "#{current_head}, #{line}"
  end
end

puts res
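
The same loop can also be packaged as a method that accepts any enumerable of lines, which makes it easy to try without a file. This is only a sketch: lines_with_88 is a made-up name, and the anchored patterns assume every data line is exactly one word character, a space and a number.

```ruby
# Hypothetical helper: same idea as the loop above, over any enumerable of lines.
def lines_with_88(lines)
  current_head = nil
  lines.each_with_object([]) do |raw, res|
    line = raw.chomp
    case line
    when /\AHead \d+\z/
      current_head = line
    when /\A\w 88\z/
      # Skip matches that appear before the first header.
      res << "#{current_head}, #{line}" if current_head
    end
  end
end

# Made-up excerpt in the same shape as the question's file.
sample = "Head 1\na 10\nf 88\nHead 53\ny 22\nHead 32\nn 88\nb 88\n"
puts lines_with_88(sample.each_line)
```

For a real file, lines_with_88(File.foreach("file.txt")) should work the same way, since File.foreach without a block returns an enumerator and reads the file line by line.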