Pretty file size in Ruby?

Posted 2019-01-22 08:13

Question:

I'm trying to make a method that converts an integer that represents bytes to a string with a 'prettied up' format.

Here's my half-working attempt:

class Integer
  def to_filesize
    {
      'B'  => 1024,
      'KB' => 1024 * 1024,
      'MB' => 1024 * 1024 * 1024,
      'GB' => 1024 * 1024 * 1024 * 1024,
      'TB' => 1024 * 1024 * 1024 * 1024 * 1024
    }.each_pair { |e, s| return "#{s / self}#{e}" if self < s }
  end
end

What am I doing wrong?

Answer 1:

How about the Filesize gem? It appears to convert bytes (and other formats) into pretty-printed values:

Example:

Filesize.from("12502343 B").pretty      # => "11.92 MiB"

http://rubygems.org/gems/filesize



Answer 2:

If you are using Rails, what about the standard Rails number helper, number_to_human_size?

http://api.rubyonrails.org/classes/ActionView/Helpers/NumberHelper.html#method-i-number_to_human_size

number_to_human_size(number, options = {})



Answer 3:

I agree with @David that it's probably best to use an existing solution, but to answer your question about what you're doing wrong:

  1. The primary error is dividing s by self rather than the other way around.
  2. The divisor should be the previous unit's threshold, so divide by s / 1024 rather than by s.
  3. Doing integer arithmetic will give you confusing results, so convert to float.
  4. Perhaps round the answer.

So:

class Integer
  def to_filesize
    {
      'B'  => 1024,
      'KB' => 1024 * 1024,
      'MB' => 1024 * 1024 * 1024,
      'GB' => 1024 * 1024 * 1024 * 1024,
      'TB' => 1024 * 1024 * 1024 * 1024 * 1024
    }.each_pair { |e, s| return "#{(self.to_f / (s / 1024)).round(2)}#{e}" if self < s }
  end
end

This lets you do:

1.to_filesize
# => "1.0B"
1020.to_filesize
# => "1020.0B" 
1024.to_filesize
# => "1.0KB" 
1048576.to_filesize
# => "1.0MB"

Again, I don't recommend actually doing that, but it seems worth correcting the bugs.
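
To make points 1 to 3 concrete, here is a small sketch that evaluates the division three ways for a single size. The helper name division_variants is hypothetical and exists only for this illustration; limit stands for the hash value s of the matching entry.

```ruby
# Hypothetical helper (not part of the answer above): evaluates the three
# variants of the division for one size and one hash threshold.
def division_variants(size, limit)
  {
    original: limit / size,                          # s / self: wrong direction
    integer:  size / (limit / 1024),                 # right direction, but truncated
    float:    (size.to_f / (limit / 1024)).round(2)  # right direction, fraction kept
  }
end

# 1536 bytes falls under the 'KB' entry, whose threshold is 1024 * 1024.
p division_variants(1536, 1024 * 1024)
```

For 1536 bytes the original expression yields 682, the integer version truncates to 1, and only the float version gives the expected 1.5.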



Answer 4:

This is my solution:

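def filesize(size)
  units = ['B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB']

  return '0.0 B' if size == 0
  # The base-1024 logarithm, truncated, is the index of the unit to use.
  exp = (Math.log(size) / Math.log(1024)).to_i
  # Anything beyond EiB is still reported in EiB.
  exp = 6 if exp > 6

  '%.1f %s' % [size.to_f / 1024 ** exp, units[exp]]
end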

Compared to the other solutions, it is simpler, faster, and produces more conventional output.
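
The unit index in filesize comes from a floating-point logarithm. As a sketch of an alternative that keeps the same output format, the index can also be found with integer comparisons only. The name filesize_int is hypothetical and this is not the code that was benchmarked in this answer.

```ruby
# Hypothetical variant of filesize: same output format, but the unit index
# is found by repeated integer comparison instead of Math.log.
def filesize_int(size)
  units = %w[B KiB MiB GiB TiB PiB EiB]
  exp = 0
  threshold = 1024
  # Step up one unit at a time until the size fits or the units run out.
  while size >= threshold && exp < units.length - 1
    exp += 1
    threshold *= 1024
  end
  format('%.1f %s', size.to_f / 1024**exp, units[exp])
end

puts filesize_int(1536)     # 1.5 KiB
puts filesize_int(1024**7)  # 1024.0 EiB
```

Because the loop stops at the last unit, sizes beyond EiB are still reported in EiB, and zero needs no special case.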

Format

Both to_filesize and to_human fail on very large numbers (the ERROR entries below). format_mb switches units only at twice the threshold, so for example 1 MiB is printed as '1024.000 kb'; some people might want that, but I certainly don't.

    origin:       filesize    to_filesize      format_mb       to_human
       0 B:          0.0 B           0.0B            0 b         0.00 B
       1 B:          1.0 B           1.0B            1 b         1.00 B
      10 B:         10.0 B          10.0B           10 b        10.00 B
    1000 B:       1000.0 B        1000.0B         1000 b      1000.00 B
     1 KiB:        1.0 KiB          1.0KB         1024 b        1.00 KB
   1.5 KiB:        1.5 KiB          1.5KB       1536.0 b        1.50 KB
    10 KiB:       10.0 KiB         10.0KB      10.000 kb       10.00 KB
   100 KiB:      100.0 KiB        100.0KB     100.000 kb      100.00 KB
  1000 KiB:     1000.0 KiB       1000.0KB    1000.000 kb     1000.00 KB
     1 MiB:        1.0 MiB          1.0MB    1024.000 kb        1.00 MB
     1 GiB:        1.0 GiB          1.0GB    1024.000 mb        1.00 GB
     1 TiB:        1.0 TiB          1.0TB    1024.000 gb        1.00 TB
     1 PiB:        1.0 PiB          ERROR    1024.000 tb        1.00 PB
     1 EiB:        1.0 EiB          ERROR    1024.000 pb        1.00 EB
     1 ZiB:     1024.0 EiB          ERROR    1024.000 eb          ERROR
     1 YiB:  1048576.0 EiB          ERROR 1048576.000 eb          ERROR

Performance

Also, it has the best performance.

                      user     system      total        real
filesize:         2.740000   0.000000   2.740000 (  2.747873)
to_filesize:      3.560000   0.000000   3.560000 (  3.557808)
format_mb:        2.950000   0.000000   2.950000 (  2.949930)
to_human:         5.770000   0.000000   5.770000 (  5.783925)

I tested each implementation with a realistic random number generator:

def numbers
  Enumerator.new do |enum|
    1000000.times do
      exp = rand(5)
      num = rand(1024 ** exp)
      enum.yield num
    end
  end
end
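
The timing harness itself is not shown above. The following is only a sketch of how such a measurement could be taken with Ruby's standard Benchmark module; the helper name sample_numbers and the sample size of 10,000 are assumptions chosen to keep the run short, and filesize is repeated so the sketch runs on its own.

```ruby
require 'benchmark'

# Repeated from this answer so the sketch is self-contained.
def filesize(size)
  units = ['B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB']
  return '0.0 B' if size == 0
  exp = (Math.log(size) / Math.log(1024)).to_i
  exp = 6 if exp > 6
  '%.1f %s' % [size.to_f / 1024 ** exp, units[exp]]
end

# Hypothetical helper: same distribution as the generator above, but
# materialized as an array so generation time is excluded from the timing.
def sample_numbers(count)
  Array.new(count) { rand(1024**rand(5)) }
end

inputs = sample_numbers(10_000)
elapsed = Benchmark.realtime { inputs.each { |n| filesize(n) } }
puts format('filesize: %.6f s for %d calls', elapsed, inputs.size)
```

Absolute numbers will differ from the table above, since they depend on the machine, the Ruby version, and the sample size.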


Answer 5:

You get points for adding a method to Integer, but this seems more File specific, so I would suggest monkeying around with File, say by adding a method to File called .prettysize().

But here is an alternative solution that uses iteration, and avoids printing single bytes as float :-)

def format_mb(size)
  conv = ['b', 'kb', 'mb', 'gb', 'tb', 'pb', 'eb']
  scale = 1024

  # Below 2 * 1024, print the size unchanged so whole bytes are not shown as floats.
  return "#{size} #{conv[0]}" if size < 2 * scale

  size = size.to_f
  # Use the first unit whose threshold (2 * scale**ndx) exceeds the size.
  (2..7).each do |ndx|
    return "#{'%.3f' % (size / (scale**(ndx - 1)))} #{conv[ndx - 1]}" if size < 2 * (scale**ndx)
  end

  # Anything larger falls back to the largest unit.
  "#{'%.3f' % (size / (scale**6))} #{conv[6]}"
end


Answer 6:

@Darshan Computing's solution is only partial here. Hash keys are not guaranteed to be ordered in Ruby 1.8 (Ruby 1.9 and later preserve insertion order), so on older versions this approach will not work reliably. You could fix this by doing something like the following inside the to_filesize method:

 conv={
      1024=>'B',
      1024*1024=>'KB',
      ...
 }
 conv.keys.sort.each { |s|
     next if self >= s
     e=conv[s]
     return "#{(self.to_f / (s / 1024)).round(2)}#{e}"
 }

This is what I ended up doing for a similar method inside Float,

 class Float
   def to_human
     conv={
       1024=>'B',
       1024*1024=>'KB',
       1024*1024*1024=>'MB',
       1024*1024*1024*1024=>'GB',
       1024*1024*1024*1024*1024=>'TB',
       1024*1024*1024*1024*1024*1024=>'PB',
       1024*1024*1024*1024*1024*1024*1024=>'EB'
     }
     conv.keys.sort.each { |mult|
        next if self >= mult
        suffix=conv[mult]
        return "%.2f %s" % [ self / (mult / 1024), suffix ]
     }
   end
 end
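
As a sketch of the same idea without relying on hash ordering or on sorting keys, the units can be kept in an array of pairs. The method name human_size is hypothetical; it is written as a standalone method rather than a Float extension, and it reports sizes past the last unit in that unit instead of falling through the loop.

```ruby
# Hypothetical standalone variant of to_human: an array of pairs keeps the
# units in a fixed order on every Ruby version.
def human_size(bytes)
  units = [
    [1024**1, 'B'], [1024**2, 'KB'], [1024**3, 'MB'], [1024**4, 'GB'],
    [1024**5, 'TB'], [1024**6, 'PB'], [1024**7, 'EB']
  ]
  # First pair whose limit exceeds the size, or the largest unit as a fallback.
  limit, suffix = units.find { |max, _| bytes < max } || units.last
  format('%.2f %s', bytes.to_f / (limit / 1024), suffix)
end

puts human_size(1536)     # 1.50 KB
puts human_size(1024**7)  # 1024.00 EB
```

The fallback to units.last is a design choice for this sketch; raising an error for out-of-range sizes would be an equally reasonable option.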