Is there a Java equivalent of GetCompressedFileSiz

Posted 2019-02-27 04:07

Question:

I am looking to get accurate (i.e. the real size on disk and not the normal size that includes all the 0's) measurements of sparse files in Java.

In C++ on Windows one would use GetCompressedFileSize. I have yet to come across a way of doing that in Java.

If there isn't a direct equivalent, how would I go about measuring the data within a sparse file, as opposed to the size including all of the zeros?

For clarification, I am looking to run the sparse file measurements on both Linux and Windows; however, I don't mind coding two separate applications!

Answer 1:

If you only need this on Windows, you can write it with the Java Native Interface (JNI):

class NativeInterface{
   public static native long GetCompressedFileSize(String filename);
}

and in C/C++ file:

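#include <jni.h>
#include <windows.h>

extern "C"
JNIEXPORT jlong JNICALL Java_NativeInterface_GetCompressedFileSize
  (JNIEnv *env, jclass cls, jstring javaString)  // static native method: jclass, not jobject
{
    // Modified UTF-8 matches the ANSI API only for ASCII paths.
    const char *nativeString = env->GetStringUTFChars(javaString, NULL);
    if (nativeString == NULL) return -1;  // OutOfMemoryError is already pending

    // Pass the string directly instead of copying it into a fixed-size buffer,
    // and request the high DWORD so sizes of 4 GiB or more are not truncated.
    DWORD high = 0;
    DWORD low = GetCompressedFileSizeA(nativeString, &high);
    DWORD err = GetLastError();
    env->ReleaseStringUTFChars(javaString, nativeString);

    if (low == INVALID_FILE_SIZE && err != NO_ERROR) return -1;  // call failed
    return ((jlong) high << 32) | low;
}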


Answer 2:

If you want a pure Java solution, you can try jnr-posix. Here's an example implementation:

import jnr.posix.*;

final POSIX p = POSIXFactory.getPOSIX();
final int S_BLKSIZE = 512; // from sys/stat.h
final FileStat stat = p.stat("/path/to/file");
final long bytes = stat.blocks() * S_BLKSIZE;

However, this function currently doesn't work on Windows. Until that's fixed, you have to use platform-specific code like the options below:

  • On Linux use the stat64 system call

    The st_blocks field indicates the number of blocks allocated to the file, in 512-byte units. (This may be smaller than st_size/512 when the file has holes.)

    • You can also run the stat command. The number of allocated blocks can be seen in the Blocks field, or printed with the %b format specifier
    • Or use du command (without --apparent-size option)

      --apparent-size

      • print apparent sizes, rather than disk usage; although the apparent size is usually smaller, it may be larger due to holes in ('sparse') files, internal fragmentation, indirect blocks, and the like
  • On Windows you can call the GetCompressedFileSize API

    • Alternatively you can also run fsutil file layout with admin rights to get detailed information about a file. Find the $DATA stream.

      • If you see Resident | No clusters allocated in the flags like this then it's a resident file and size on disk would be 0.

        PS C:\Users>  fsutil file layout .\desktop.ini
        
        ********* File 0x000800000003dbde *********
        File reference number   : 0x000800000003dbde
        File attributes         : 0x00000026: Hidden | System | Archive
        File entry flags        : 0x00000000
        Link (ParentID: Name)   : 0x001f0000000238c8: HLINK Name   : \Users\desktop.ini
        ...
        Stream                  : 0x080  ::$DATA
            Attributes          : 0x00000000: *NONE*
            Flags               : 0x0000000c: Resident | No clusters allocated
            Size                : 174
            Allocated Size      : 176
        
      • If you don't see the Resident flag, check the Allocated Size field; it is the file's size on disk:

        PS D:\>  fsutil file layout .\nonresident.txt
        
        ********* File 0x000400000000084e *********
        File reference number   : 0x000400000000084e
        File attributes         : 0x00000020: Archive
        File entry flags        : 0x00000000
        Link (ParentID: Name)   : 0x0005000000000005: HLINK Name   : \nonresident.txt
        ...
        Stream                  : 0x080  ::$DATA
            Attributes          : 0x00000000: *NONE*
            Flags               : 0x00000000: *NONE*
            Size                : 1,520
            Allocated Size      : 4,096
            Extents             : 1 Extents
                                : 1: VCN: 0 Clusters: 1 LCN: 1,497,204
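
The two fsutil checks above can be automated once the command output has been captured as text. The following is a minimal sketch, not a definitive implementation: the class name FsutilLayoutParser and the method sizeOnDisk are hypothetical, and the parsing assumes the English output format shown in the samples above (field labels and digit grouping may differ on other Windows versions or locales).

```java
class FsutilLayoutParser {

    /**
     * Returns the size on disk of the unnamed $DATA stream found in the
     * text printed by "fsutil file layout", or -1 if it cannot be found.
     * Hypothetical helper: it only understands the layout shown above.
     */
    static long sizeOnDisk(String layoutOutput) {
        boolean inDataStream = false;
        boolean resident = false;
        for (String line : layoutOutput.split("\\R")) {
            String t = line.trim();
            if (t.startsWith("Stream")) {
                // The unnamed data stream is printed as "::$DATA".
                inDataStream = t.endsWith("::$DATA");
                resident = false;
                continue;
            }
            if (!inDataStream) {
                continue;
            }
            if (t.startsWith("Flags")) {
                // Resident data has no clusters of its own.
                resident = t.contains("Resident");
            } else if (t.startsWith("Allocated Size")) {
                if (resident) {
                    return 0L;
                }
                // Drop thousands separators such as the comma in "4,096".
                String digits = t.substring(t.indexOf(':') + 1).replaceAll("[^0-9]", "");
                return digits.isEmpty() ? -1L : Long.parseLong(digits);
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "Stream                  : 0x080  ::$DATA",
                "    Attributes          : 0x00000000: *NONE*",
                "    Flags               : 0x00000000: *NONE*",
                "    Size                : 1,520",
                "    Allocated Size      : 4,096");
        System.out.println(sizeOnDisk(sample));
    }
}
```

For the first sample above (the resident desktop.ini) sizeOnDisk returns 0, and for the second (nonresident.txt) it returns 4096. Capturing the output still means running fsutil with admin rights.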
        

For more information, see the questions below:

  • How do I query "Size on disk" file information?
  • Get size of file on disk


Answer 3:

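Since an answer was given for Windows, I will try to supply one for Linux.

I am not sure, but I think this will do the trick (C++). As far as I know, BLKGETSIZE64 reports the size of a block device, so the descriptor has to refer to a device rather than a regular file: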

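#include <sys/ioctl.h>  // ioctl()
#include <linux/fs.h>   // BLKGETSIZE64
unsigned long long file_size_in_bytes = 0;
ioctl(file, BLKGETSIZE64, &file_size_in_bytes);  // file: an open file descriptor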

This can be loaded through JNI in the same way as described in @Aniket's answer.