Number crunching in Ruby (optimisation needed)

2019-07-22 05:22发布

Ruby may not be the optimal language for this but I'm sort of comfortable working with this in my terminal so that's what I'm going with.

I need to process the numbers from 1 to 666666 so I pin out all the numbers that contain 6 but doesn't contain 7, 8 or 9. The first number will be 6, the next 16, then 26 and so forth. Then I needed it printed like this (6=6) (16=6) (26=6) and when I have ranges like 60 to 66 I need it printed like (60 THRU 66=6) (SPSS syntax).

I have this code and it works but it's neither beautiful nor very efficient so how could I optimize it?

(silly code may follow)

class Array
  def to_ranges
    array = self.compact.uniq.sort
    ranges = []
    if !array.empty?
      # Initialize the left and right endpoints of the range
      left, right = array.first, nil
      array.each do |obj|
        # If the right endpoint is set and obj is not equal to right's successor 
        # then we need to create a range.
        if right && obj != right.succ
          ranges << Range.new(left,right)
          left = obj
        end
        right = obj
      end
      ranges << Range.new(left,right) unless left == right
    end
    ranges
  end
end

write = ""
numbers = (1..666666).to_a

# split each number in an array containing it's ciphers
numbers = numbers.map { |i| i.to_s.split(//) }

# delete the arrays that doesn't contain 6 and the ones that contains 6 but also 8, 7 and 9
numbers = numbers.delete_if { |i| !i.include?('6') }
numbers = numbers.delete_if { |i| i.include?('7') }
numbers = numbers.delete_if { |i| i.include?('8') }
numbers = numbers.delete_if { |i| i.include?('9') }

# join the ciphers back into the original numbers
numbers = numbers.map { |i| i.join }

numbers = numbers.map { |i| i = Integer(i) }

# rangify consecutive numbers
numbers = numbers.to_ranges

# edit the ranges that go from 1..1 into just 1
numbers = numbers.map do |i|
  if i.first == i.last
    i = i.first
  else
    i = i
  end
end

# string stuff
numbers = numbers.map { |i| i.to_s.gsub(".."," thru ") }

numbers = numbers.map { |i| "(" + i.to_s + "=6)"}

numbers.each { |i| write << " " + i }

File.open('numbers.txt','w') { |f| f.write(write) }

As I said it works for numbers even in the millions - but I'd like some advice on how to make prettier and more efficient.

10条回答
贼婆χ
2楼-- · 2019-07-22 06:09

(I didn't bother updating my C solution for formatting. Instead I went with x3ro's excellent ruby version and optimized that)

Undeleted:

I still am not sure whether the changed range-notation behaviour isn't actually what the OP wants: This version changes the behaviour of breaking up ranges that are actually contiguous modulo 6; I wouldn't be surprised the OP actually expected .

....
(555536=6)
(555546=6)
(555556 THRU 666666=6)

instead of

....
(666640 THRU 666646=6)
(666650 THRU 666656=6)
(666660 THRU 666666=6)

I'll let the OP decide, and here is the modified version, which runs in 18% of the time as x3ro's version (3.2s instead of 17.0s when generating up to 6666666 (7x6)).

def check(n)
    n.to_s(7).include?('6')
end

def spss(ranges)
  ranges.each do |range|
    if range.first === range.last
      puts "(" + range.first.to_s(7) + "=6)"
    else
      puts "(" + range.first.to_s(7) + " THRU " + range.last.to_s(7) + "=6)"
    end
  end
end

range = (1..117648)

range = range.select { |n| check(n) }

range = range.inject([0..0]) do |ranges, n|
  temp = ranges.last
  if temp.last + 1 === n
    ranges.pop
    ranges.push(temp.first..n)
  else
    ranges.push(n..n)
  end
end

spss(range)
查看更多
三岁会撩人
3楼-- · 2019-07-22 06:09

The killer here is

numbers = (1..666666).to_a

Range supports iterations so you would be better off by going over the whole range and accumulating numbers that include your segments in blocks. When one block is finished and supplanted by another you could write it out.

查看更多
叼着烟拽天下
4楼-- · 2019-07-22 06:13

I deleted my earlier attempt to parlez-vous-ruby? and made up for that. I know have an optimized version of x3ro's excellent example.

$,="\n"
puts ["(0=6)", "(6=6)", *(1.."66666".to_i(7)).collect {|i| i.to_s 7}.collect do |s|
    s.include?('6')? "(#{s}0 THRU #{s}6=6)" : "(#{s}6=6)"
end ]

Compared to x3ro's version

... It is down to three lines

... 204.2 x faster (to 66666666)

... has byte-identical output

It uses all my ideas for optimization

  • gen numbers based on modulo 7 digits (so base-7 numbers)
  • generate the last digit 'smart': this is what compresses the ranges

So... what are the timings? This was testing with 8 digits (to 66666666, or 823544 lines of output):

$ time ./x3ro.rb >  /dev/null

real    8m37.749s
user    8m36.700s
sys 0m0.976s

$ time ./my.rb >  /dev/null

real    0m2.535s
user    0m2.460s
sys 0m0.072s

Even though the performance is actually good, it isn't even close to the C optimized version I posted before: I couldn't run my.rb to 6666666666 (6x10) because of OutOfMemory. When running to 9 digits, this is the comparative result:

sehe@meerkat:/tmp$ time ./my.rb >  /dev/null

real    0m21.764s
user    0m21.289s
sys 0m0.476s

sehe@meerkat:/tmp$ time ./t2 > /dev/null

real    0m1.424s
user    0m1.408s
sys 0m0.012s

The C version is still some 15x faster... which is only fair considering that it runs on the bare metal.

Hope you enjoyed it, and can I please have your votes if only for learning Ruby for the purpose :)

(Can you tell I'm proud? This is my first encounter with ruby; I started the ruby koans 2 hours ago...)

Edit by @johndouthat:

Very nice! The use of base7 is very clever and this a great job for your first ruby trial :)

Here's a slight modification of your snippet that will let you test 10+ digits without getting an OutOfMemory error:

puts ["(0=6)", "(6=6)"]
(1.."66666666".to_i(7)).each do |i| 
  s = i.to_s(7)
  puts s.include?('6') ? "(#{s}0 THRU #{s}6=6)" : "(#{s}6=6)"
end

# before:
real    0m26.714s
user    0m23.368s
sys 0m2.865s
# after
real    0m15.894s
user    0m13.258s
sys 0m1.724s
查看更多
不美不萌又怎样
5楼-- · 2019-07-22 06:13
$range_start = -1
$range_end = -1
$f = File.open('numbers.txt','w')

def output_number(i)
  if $range_end ==  i-1
    $range_end = i
  elsif $range_start < $range_end
    $f.puts "(#{$range_start} thru #{$range_end})"
    $range_start = $range_end = i
  else
    $f.puts "(#{$range_start}=6)" if $range_start > 0 # no range, print out previous number
    $range_start = $range_end = i
  end
end

'1'.upto('666') do |n|
  next unless n =~ /6/ # keep only numbers that contain 6
  next if n =~ /[789]/ # remove nubmers that contain 7, 8 or 9
  output_number n.to_i
end
if $range_start < $range_end
  $f.puts "(#{$range_start} thru #{$range_end})"
end
$f.close

puts "Ruby is beautiful :)"
查看更多
登录 后发表回答