Parsing text in Ruby

Posted 2020-07-24 03:18

I'm working on a script for importing component information into SketchUp. A very helpful individual on their help page assisted me in creating one that works on an "edited" line-by-line text file. Now I'm ready to take it to the next level - importing directly from the original file created by FreePCB.

The portion of the file I wish to use is below: "sample_1.txt"

[parts]

part: C1
  ref_text: 1270000 127000 0 -7620000 1270000 1
  package: "CAP-AX-10X18-7X"
  value: "4.7pF" 1270000 127000 0 1270000 1270000 1
  shape: "CAP-AX-10X18-7"
  pos: 10160000 10160000 0 0 0

part: IC1
  ref_text: 1270000 177800 270 2540000 2286000 1
  package: "DIP-8-3X"
  value: "JRC 4558" 1270000 177800 270 10668000 508000 0
  shape: "DIP-8-3"
  pos: 2540000 27940000 0 90 0

part: R1
  ref_text: 1270000 127000 0 3380000 -600000 1
  package: "RES-CF-1/4W-4X"
  value: "470" 1270000 127000 0 2180000 -2900000 0
  shape: "RES-CF-1/4W-4"
  pos: 15240000 20320000 0 270 0

The word [parts], in brackets, is just a section heading. The information I wish to extract is the reference designator, shape, position, and rotation. I already have code to do this from a reformatted text file, using IO.readlines(file).each{ |line| data = line.split(" "); to split each line into an array.

My current method uses a text file reformatted as follows: "sample_2.txt"

C1 CAP-AX-10X18-7 10160000 10160000 0 0 0
IC1 DIP-8-3 2540000 27940000 0 90 0
R1 RES-CF-1/4W-4 15240000 20320000 0 270 0

I then use the array to extract data[0], data[1], data[2], data[3], and data[5]. An additional step appends ".skp" to the end of the package name, so that the script inserts the component with the same name as the package.

I would like to extract the information from the 1st example without having to reformat the file, as is the case with the 2nd example. I know how to pull information from a single string split by spaces. How do I do it when the text for one array appears on more than one line?

Thanks in advance for any help ;-)

EDIT: Below is the full code to parse "sample_2.txt", which was reformatted prior to running the script.

    # import.rb - extracts component info from text file

    # Launch file browser
    file=UI.openpanel "Open Text File", "c:\\", "*.txt"

    # Do for each line, what appears in braces {}
    IO.readlines(file).each{ |line| data = line.split(" ");

    # Append second element in array "data[1]", with SketchUp file extension
    data[1] += ".skp"

    # Search for component with same name as data[1], and insert in component browser
    component_path = Sketchup.find_support_file data[1] ,"Components"
    component_def = Sketchup.active_model.definitions.load component_path

    # Create transformation from "origin" to point "location", convert data[] to float
    location = [data[2].to_f, data[3].to_f, 0]
    translation = Geom::Transformation.new location

    # Convert rotation "data[5]" to radians, and into float
    angle = data[5].to_f*Math::PI/180.to_f
    rotation = Geom::Transformation.rotation [0,0,0], [0,0,1], angle

    # Insert an instance of component in model, and apply transformation
    instance = Sketchup.active_model.entities.add_instance component_def, translation*rotation

    # Rename component 
    instance.name=data[0]

    # Ending brace for "IO.readlines(file).each{"
    }

Running "import.rb" to open "sample_2.txt" results in the following output.

    C1 CAP-AX-10X18-7 10160000 10160000 0
    IC1 DIP-8-3 2540000 27940000 90
    R1 RES-CF-1/4W-4 15240000 20320000 270

I am trying to get the same results from the unedited original file "sample_1.txt", without the extra step of removing information from the file with Notepad to produce "sample_2.txt". The keywords followed by a colon (part, shape, pos) only appear in this part of the document and nowhere else, but the document is rather lengthy, and I need the script to ignore everything that appears before and after the [parts] section.

2 Answers
The star
Reply #2 · 2020-07-24 04:06

Not sure exactly what you're asking, but hopefully this helps you get what you're looking for.

parts_text = <<EOS
[parts]

part: C1
  ref_text: 1270000 127000 0 -7620000 1270000 1
  package: "CAP-AX-10X18-7X"
  value: "4.7pF" 1270000 127000 0 1270000 1270000 1
  shape: "CAP-AX-10X18-7"
  pos: 10160000 10160000 0 0 0

part: IC1
  ref_text: 1270000 177800 270 2540000 2286000 1
  package: "DIP-8-3X"
  value: "JRC 4558" 1270000 177800 270 10668000 508000 0
  shape: "DIP-8-3"
  pos: 2540000 27940000 0 90 0

part: R1
  ref_text: 1270000 127000 0 3380000 -600000 1
  package: "RES-CF-1/4W-4X"
  value: "470" 1270000 127000 0 2180000 -2900000 0
  shape: "RES-CF-1/4W-4"
  pos: 15240000 20320000 0 270 0
EOS

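# Split the text into blank-line-separated blocks, one per part
parts = parts_text.split(/\n\n/)
# Split each block into its individual lines
split_parts = parts.map { |block| block.split(/\n/) }
split_parts.each do |part|
  # Remove the leading indentation from every line
  stripped = part.map { |line| line.strip }
  stripped.each do |line|
    # Print the line as an array of space-separated fields
    p line.split(" ")
  end
end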

This could be done much more efficiently with regular expressions, but I opted for methods that you might already be familiar with.
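
Building on the same split-based approach, here is a minimal sketch of how each block could be reduced to the same fields as a line of "sample_2.txt". The helper name part_fields is hypothetical, and the sketch assumes that parts are separated by blank lines and that every field sits on its own "key: value" line, as in the sample.

```ruby
# Hypothetical helper: turns one "part:" block into the same fields as a
# line of sample_2.txt (name, shape, then the five numbers from "pos:").
# Returns nil for blocks that are not a complete part, such as "[parts]".
def part_fields(block)
  fields = {}
  block.each_line do |line|
    # Split only at the first colon, so values may contain colons
    key, value = line.strip.split(":", 2)
    next if value.nil?
    fields[key] = value.strip
  end
  return nil unless fields["part"] && fields["shape"] && fields["pos"]

  # Drop the quotes around the shape name, then append the position numbers
  [fields["part"], fields["shape"].delete('"')] + fields["pos"].split(" ")
end

sample = <<EOS
[parts]

part: C1
  ref_text: 1270000 127000 0 -7620000 1270000 1
  package: "CAP-AX-10X18-7X"
  value: "4.7pF" 1270000 127000 0 1270000 1270000 1
  shape: "CAP-AX-10X18-7"
  pos: 10160000 10160000 0 0 0

part: R1
  ref_text: 1270000 127000 0 3380000 -600000 1
  package: "RES-CF-1/4W-4X"
  value: "470" 1270000 127000 0 2180000 -2900000 0
  shape: "RES-CF-1/4W-4"
  pos: 15240000 20320000 0 270 0
EOS

# One array per part, in the same order as the fields in sample_2.txt
rows = sample.split(/\n\s*\n/).map { |block| part_fields(block) }.compact
rows.each { |data| puts data.join(" ") }
```

Each element of rows can then be used the same way as data in the original script, so data[0], data[1], data[2], data[3], and data[5] keep their meaning.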

神经病院院长
Reply #3 · 2020-07-24 04:08

Your question is not clear, but this:

text.scan(/^\s+shape: "(.*?)"\s+pos: (\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/)

will give you:

[["CAP-AX-10X18-7", "10160000", "10160000", "0", "0", "0"],
 ["DIP-8-3", "2540000", "27940000", "0", "90", "0"],
 ["RES-CF-1/4W-4", "15240000", "20320000", "0", "270", "0"]]

Added after the question was changed

This:

text.scan(/^\s*part:\s*(.*?)$.*?\s+shape:\s*"(.*?)"\s+pos:\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/m)

will give you

[["C1", "CAP-AX-10X18-7", "10160000", "10160000", "0", "0", "0"],
 ["IC1", "DIP-8-3", "2540000", "27940000", "0", "90", "0"],
 ["R1", "RES-CF-1/4W-4", "15240000", "20320000", "0", "270", "0"]]
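
As a rough sketch of how these captures could feed the import script from the question, the strings can be converted into the values that script computes: the ".skp" file name, the location as floats, and the rotation in radians. The hash keys and variable names below are made up for illustration, and the two ignored captures are simply the "pos:" numbers the original script does not use.

```ruby
text = <<EOS
part: C1
  ref_text: 1270000 127000 0 -7620000 1270000 1
  package: "CAP-AX-10X18-7X"
  value: "4.7pF" 1270000 127000 0 1270000 1270000 1
  shape: "CAP-AX-10X18-7"
  pos: 10160000 10160000 0 0 0

part: IC1
  ref_text: 1270000 177800 270 2540000 2286000 1
  package: "DIP-8-3X"
  value: "JRC 4558" 1270000 177800 270 10668000 508000 0
  shape: "DIP-8-3"
  pos: 2540000 27940000 0 90 0

part: R1
  ref_text: 1270000 127000 0 3380000 -600000 1
  package: "RES-CF-1/4W-4X"
  value: "470" 1270000 127000 0 2180000 -2900000 0
  shape: "RES-CF-1/4W-4"
  pos: 15240000 20320000 0 270 0
EOS

# Same pattern as above: part name, shape, then the five "pos:" numbers
part_pattern = /^\s*part:\s*(.*?)$.*?\s+shape:\s*"(.*?)"\s+pos:\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/m

# Convert each match into the values the import script works with.
# The third and fifth "pos:" numbers are not used by that script.
placements = text.scan(part_pattern).map do |name, shape, x, y, _unused_a, degrees, _unused_b|
  {
    name: name,                              # instance name, e.g. "C1"
    file: shape + ".skp",                    # component file to load
    location: [x.to_f, y.to_f, 0],           # translation target
    angle: degrees.to_f * Math::PI / 180     # rotation in radians
  }
end

placements.each do |placement|
  puts "#{placement[:name]} #{placement[:file]} #{placement[:location].inspect}"
end
```

Inside SketchUp, each hash would take the place of the data array in the original loop; the SketchUp calls themselves are left out here so the sketch runs in plain Ruby.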

Added after the question was changed a second time

This:

text.scan(/^\s*part:\s*(.*?)$.*?\s+shape:\s*"(.*?)"\s+pos:\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)/m)

will let you capture numbers even if they are negative.
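
The question also asks for everything before and after the [parts] section to be ignored. One possible way to do that, sketched below, is to keep only the lines that follow the "[parts]" heading up to the next bracketed heading, and then run the scan on that text. The helper name parts_section is hypothetical, and the sketch assumes that every section of the file starts with a bracketed heading on its own line. The other section names and the part "X1" in the sample text are made up for the example.

```ruby
# Hypothetical helper: returns only the lines that belong to "[parts]".
# Assumes each section begins with a "[name]" heading on its own line.
def parts_section(text)
  inside = false
  kept = []
  text.each_line do |line|
    heading = line[/^\s*\[(.+?)\]\s*$/, 1]
    if heading
      # A new heading either opens or closes the section we want
      inside = (heading == "parts")
    elsif inside
      kept << line
    end
  end
  kept.join
end

text = <<EOS
[some_section]
placeholder: 1

[parts]

part: C1
  ref_text: 1270000 127000 0 -7620000 1270000 1
  package: "CAP-AX-10X18-7X"
  value: "4.7pF" 1270000 127000 0 1270000 1270000 1
  shape: "CAP-AX-10X18-7"
  pos: 10160000 10160000 0 0 0

part: X1
  shape: "MADE-UP"
  pos: -2540000 5080000 0 180 0

[another_section]
placeholder: 2
EOS

# Same pattern as above, including the optional minus signs
part_pattern = /^\s*part:\s*(.*?)$.*?\s+shape:\s*"(.*?)"\s+pos:\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)/m

results = parts_section(text).scan(part_pattern)
results.each { |fields| p fields }
```

Filtering by section first keeps the regular expression unchanged and avoids accidental matches if similar keywords ever appear elsewhere in the file.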
