How do I get the size of a file in megabytes using Perl?

Posted 2020-02-10 03:24

Question:

I want to get the size of a file on disk in megabytes. Using the -s operator gives me the size in bytes, but I'm going to assume that then dividing this by a magic number is a bad idea:

my $size_in_mb = (-s $fh) / (1024 * 1024);

Should I just use a read-only variable to define 1024, or is there a programmatic way to obtain the number of bytes in a kilobyte?

EDIT: Updated the incorrect calculation.

Answer 1:

If you'd like to avoid magic numbers, try the CPAN module Number::Bytes::Human.

use Number::Bytes::Human qw(format_bytes);
my $size = format_bytes(-s $file); # 4.5M


Answer 2:

You could of course create a function for calculating this. That is a better solution than creating constants in this instance.

sub size_in_mb {
    my $size_in_bytes = shift;
    return $size_in_bytes / (1024 * 1024);
}

No need for constants. Changing the 1024 to some kind of variable/constant won't make this code more readable.



Answer 3:

This is an old question and has already been answered correctly, but in case your program is restricted to core modules and you cannot use Number::Bytes::Human, here are several other options I have collected over time. I have kept them because each one uses a different Perl approach and is a nice example of TIMTOWTDI:

  • example 1: uses state so the variable is initialized only once instead of on every call (state is available from Perl 5.10 and must be enabled with use feature 'state', use v5.10 or later, or perl -E)

http://kba49.wordpress.com/2013/02/17/format-file-sizes-human-readable-in-perl/

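    use feature 'state';

    sub formatSize {
        my $size = shift;
        my $exp = 0;

        # Initialized once, on the first call.
        state $units = [qw(B KB MB GB TB PB)];

        # Stop at the largest unit so $exp never runs past the array.
        for (1 .. $#$units) {
            last if $size < 1024;
            $size /= 1024;
            $exp++;
        }

        return wantarray ? ($size, $units->[$exp]) : sprintf("%.2f %s", $size, $units->[$exp]);
    }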
  • example 2: using sort map

sub scaledbytes {
    # http://www.perlmonks.org/?node_id=378580
    # Format the size in every unit, then keep the shortest string.
    (sort { length $a <=> length $b }
     map  { sprintf '%.3g%s', $_[0] / 1024**$_->[1], $_->[0] }
     [" bytes" => 0], [KB => 1], [MB => 2], [GB => 3],
     [TB => 4], [PB => 5], [EB => 6]
    )[0]
}
  • example 3: takes advantage of the fact that 1 GB = 1024 MB, 1 MB = 1024 KB and 1024 = 2 ** 10, so dividing by 1024 is a right shift by 10 bits:

# http://www.perlmonks.org/?node_id=378544
my $kb = 1024 * 1024; # 1 GB expressed in KB

my $mb = $kb >> 10;
my $gb = $mb >> 10;

print "$kb kb = $mb mb = $gb gb\n";
__END__
1048576 kb = 1024 mb = 1 gb
  • example 4: use of ++$n and ... until .. to obtain an index for the array

#! perl -slw
# http://www.perlmonks.org/?node_id=378542
use strict;

sub scaleIt {
    my( $size, $n ) =( shift, 0 );
    ++$n and $size /= 1024 until $size < 1024;
    return sprintf "%.2f %s",
           $size, ( qw[ bytes KB MB GB ] )[ $n ];
}

my $size = -s $ARGV[ 0 ];

print "$ARGV[ 0 ]: ", scaleIt $size;  

Even if you cannot use Number::Bytes::Human, take a look at its source code to see all the things you need to be aware of.



Answer 4:

Well, there's not 1024 bytes in a meg, there's 1024 bytes in a K, and 1024 K in a meg...

That said, 1024 is a safe "magic" number that will never change in any system you can expect your program to work in.



Answer 5:

I would read this into a variable rather than use a magic number. Even if magic numbers are not going to change, like the number of bytes in a megabyte, using a well named constant is a good practice because it makes your code more readable. It makes it immediately apparent to everybody else what your intention is.
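A minimal sketch of that idea, using the core constant pragma. The names BYTES_PER_KB, BYTES_PER_MB and bytes_to_mb are made up for illustration and are not part of any module; the literal byte count stands in for the result of -s on a real file.

```perl
use strict;
use warnings;

# Named constants instead of bare numbers (illustrative names).
use constant BYTES_PER_KB => 1024;
use constant BYTES_PER_MB => 1024 * BYTES_PER_KB;

# Convert a byte count, e.g. the result of -s, to megabytes.
sub bytes_to_mb {
    my ($bytes) = @_;
    return $bytes / BYTES_PER_MB;
}

# 5 MB expressed in bytes, standing in for (-s $file).
printf "%.2f MB\n", bytes_to_mb(5 * 1024 * 1024);   # prints "5.00 MB"
```

With a real file, the call would be something like bytes_to_mb(-s $file).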



Answer 6:

1) You don't want 1024. That gives you kilobytes. You want 1024*1024, or 1048576.

2) Why would dividing by a magic number be a bad idea? It's not like the number of bytes in a megabyte will ever change. Don't overthink things too much.



Answer 7:

Don't get me wrong, but: I think that declaring 1024 as a Magic Variable goes a bit too far, that's a bit like "$ONE = 1; $TWO = 2;" etc.

A kilobyte has been falsely declared as 1024 bytes for more than 20 years, and I seriously doubt that the operating system manufacturers will ever correct that bug and change it to 1000.

What could make sense though is to declare non-obvious stuff, like "$megabyte = 1024 * 1024" since that is more readable than 1048576.
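To make the 1000 versus 1024 point concrete, here is a small sketch that names both divisors. The variable names and the sample byte count are assumptions for illustration; "mebibyte" (MiB) is the IEC name for the 1024 * 1024 unit.

```perl
use strict;
use warnings;

my $megabyte = 1000 * 1000;   # decimal (SI) megabyte
my $mebibyte = 1024 * 1024;   # binary unit, the divisor used in the question

my $bytes = 10_485_760;       # sample value standing in for (-s $file)

printf "%.2f MB (decimal)\n", $bytes / $megabyte;   # prints "10.49 MB (decimal)"
printf "%.2f MiB (binary)\n", $bytes / $mebibyte;   # prints "10.00 MiB (binary)"
```

Naming the divisor documents which convention the code follows, which a bare 1048576 does not.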



Answer 8:

Since the -s operator returns the file size in bytes you should probably be doing something like

my $size_in_mb = (-s $fh) / (1024 * 1024);

and use int() if you need a round figure. It's not like the size of a KB or MB is going to change anytime in the near future :)
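One thing to keep in mind is that int() truncates toward zero rather than rounding. A short sketch of the difference, with a literal byte count standing in for (-s $fh) and variable names chosen only for illustration:

```perl
use strict;
use warnings;

my $bytes      = 2_883_584;                 # 2.75 MB, stands in for (-s $fh)
my $size_in_mb = $bytes / (1024 * 1024);

my $truncated = int($size_in_mb);           # drops the fraction
my $rounded   = sprintf('%.0f', $size_in_mb);   # rounds to the nearest whole number
my $two_dp    = sprintf('%.2f', $size_in_mb);   # keeps two decimal places

print "int():   $truncated MB\n";   # prints "int():   2 MB"
print "rounded: $rounded MB\n";     # prints "rounded: 3 MB"
print "2 d.p.:  $two_dp MB\n";      # prints "2 d.p.:  2.75 MB"
```

If a file smaller than 1 MB should not be reported as 0 MB, sprintf with a couple of decimal places is probably the safer choice.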