32 bit Windows and the 2GB file size limit (C with

Posted 2019-01-24 16:10

Question:

I am attempting to port a small data analysis program from a 64 bit UNIX to a 32 bit Windows XP system (don't ask :)). But now I am having problems with the 2GB file size limit (long not being 64 bit on this platform).

I have searched this website and others for possible solutions but cannot find any that translate directly to my problem. The problem is in the use of fseek and ftell.

Does anyone know of a modification to the following two functions that would make them work on 32 bit Windows XP for files larger than 2GB (actually on the order of 100GB)?

It is vital that the return type of nsamples is a 64 bit integer (possibly int64_t).

long nsamples(char* filename)
{
  FILE *fp;
  long n;

  /* Open file */
  fp = fopen(filename, "rb");

  /* Find end of file */
  fseek(fp, 0L, SEEK_END);

  /* Get number of samples */
  n = ftell(fp) / sizeof(short);

  /* Close file */
  fclose(fp);

  /* Return number of samples in file */
  return n;
}

and

void readdata(char* filename, short* data, long start, int n)
{
  FILE *fp;

  /* Open file */
  fp = fopen(filename, "rb");

  /* Skip to correct position */
  fseek(fp, start * sizeof(short), SEEK_SET);

  /* Read data */
  fread(data, sizeof(short), n, fp);

  /* Close file */
  fclose(fp);
}

I tried using _fseeki64 and _ftelli64 using the following to replace nsamples:

__int64 nsamples(char* filename)
{
  FILE *fp;
  __int64 n;
  int result;

  /* Open file */
  fp = fopen(filename, "rb");
  if (fp == NULL)
  {
    perror("Error: could not open file!\n");
    return -1;
  }

  /* Find end of file */
  result = _fseeki64(fp, (__int64)0, SEEK_END);
  if (result)
  {
    perror("Error: fseek failed!\n");
    return result;
  }

  /* Get number of samples */
  n = _ftelli64(fp) / sizeof(short);

  printf("%I64d\n", n);

  /* Close file */
  fclose(fp);

  /* Return number of samples in file */
  return n;
}

For a file of 4815060992 bytes I get 260046848 samples (i.e. _ftelli64 gives 520093696 bytes), which is strange.

Curiously when I leave out the (__int64) cast in the call to _fseeki64 I get a runtime error (invalid argument).

Any ideas?

Answer 1:

There are two functions called _fseeki64 and _ftelli64 that support longer file offsets even on 32 bit Windows:

int _fseeki64(FILE *stream, __int64 offset, int origin);

__int64 _ftelli64(FILE *stream);
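
A minimal sketch of how these two functions might be used to count samples. The names nsamples64, seek64 and tell64 are made up for this illustration, and the non-Windows branch (fseeko/ftello) is an assumption added only so the sketch can be compiled on other systems. It has not been checked against the asker's toolchain, where _ftelli64 reportedly returned a truncated value.

```c
#ifndef _WIN32
/* Assumption: on POSIX-like systems, ask for a 64-bit off_t and for the
   fseeko/ftello declarations. Must come before any #include. */
#define _FILE_OFFSET_BITS 64
#define _POSIX_C_SOURCE 200809L
#endif

#include <stdio.h>
#include <stdint.h>

#ifdef _WIN32
  /* Microsoft CRT: 64-bit stream positioning. */
  #define seek64(fp, off, whence) _fseeki64((fp), (__int64)(off), (whence))
  #define tell64(fp)              ((int64_t)_ftelli64(fp))
#else
  /* POSIX fallback, for trying the sketch outside Windows. */
  #include <sys/types.h>
  #define seek64(fp, off, whence) fseeko((fp), (off_t)(off), (whence))
  #define tell64(fp)              ((int64_t)ftello(fp))
#endif

/* Returns the number of short-sized samples in the file, or -1 on error. */
int64_t nsamples64(const char *filename)
{
    FILE *fp = fopen(filename, "rb");
    int64_t bytes;

    if (fp == NULL)
        return -1;

    /* Move to the end of the file using a 64-bit offset. */
    if (seek64(fp, 0, SEEK_END) != 0) {
        fclose(fp);
        return -1;
    }

    /* The position at the end of the file is its size in bytes. */
    bytes = tell64(fp);
    fclose(fp);

    if (bytes < 0)
        return -1;
    return bytes / (int64_t)sizeof(short);
}
```

When printing the result, the format specifier has to be 64-bit as well, for example PRId64 from <inttypes.h> where available.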


Answer 2:

Sorry for not posting sooner, but I have been preoccupied with other projects for a while. The following solution works:

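#include <io.h>     /* _open, _lseeki64, _close */
#include <fcntl.h>  /* _O_RDONLY, _O_BINARY */
#include <stdio.h>  /* SEEK_END */

__int64 nsamples(char* filename)
{
  int fh;
  __int64 n;

  /* Open file (read-only, binary mode) */
  fh = _open(filename, _O_RDONLY | _O_BINARY);
  if (fh == -1)
    return -1;

  /* Find end of file; _lseeki64 returns the new offset, or -1 on error */
  n = _lseeki64(fh, 0, SEEK_END);

  /* Close file */
  _close(fh);

  /* Return number of samples in file */
  return (n < 0) ? -1 : n / (__int64)sizeof(short);
}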

The trick was using _open instead of fopen to open the file. I still don't understand exactly why this has to be done, but at least this works now. Thanks to everyone for your suggestions which eventually pointed me in the right direction.
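
The question also asked about readdata, which is not covered above. A rough sketch along the same lines, using low-level descriptors instead of FILE streams, might look like this. The name readdata64 and the lf_* helper macros are hypothetical, the POSIX branch (open/lseek/read) is an assumption added so the sketch compiles outside Windows, and the return value (samples read, or -1) is a design choice of this sketch rather than part of the original function.

```c
#ifndef _WIN32
/* Assumption: request a 64-bit off_t on POSIX-like systems. */
#define _FILE_OFFSET_BITS 64
#define _POSIX_C_SOURCE 200809L
#endif

#include <stdio.h>   /* SEEK_SET */
#include <stdint.h>
#include <stddef.h>
#include <fcntl.h>

#ifdef _WIN32
  #include <io.h>
  #define lf_open(path)       _open((path), _O_RDONLY | _O_BINARY)
  #define lf_seek(fd, off)    ((int64_t)_lseeki64((fd), (__int64)(off), SEEK_SET))
  #define lf_read(fd, buf, n) ((long)_read((fd), (buf), (unsigned int)(n)))
  #define lf_close(fd)        _close(fd)
#else
  #include <unistd.h>
  #include <sys/types.h>
  #define lf_open(path)       open((path), O_RDONLY)
  #define lf_seek(fd, off)    ((int64_t)lseek((fd), (off_t)(off), SEEK_SET))
  #define lf_read(fd, buf, n) ((long)read((fd), (buf), (size_t)(n)))
  #define lf_close(fd)        close(fd)
#endif

/* Reads up to n samples starting at sample index start.
   Returns the number of samples actually read, or -1 on error. */
int readdata64(const char *filename, short *data, int64_t start, int n)
{
    const size_t max_chunk = (size_t)1 << 30;  /* keep each read below INT_MAX */
    char *dst = (char *)data;
    size_t want, got = 0;
    int fd;

    if (start < 0 || n < 0)
        return -1;

    fd = lf_open(filename);
    if (fd == -1)
        return -1;

    /* Byte offset is computed in 64 bits, so it can exceed 2GB. */
    if (lf_seek(fd, start * (int64_t)sizeof(short)) < 0) {
        lf_close(fd);
        return -1;
    }

    want = (size_t)n * sizeof(short);
    while (got < want) {
        size_t chunk = want - got;
        long r;

        if (chunk > max_chunk)
            chunk = max_chunk;
        r = lf_read(fd, dst + got, chunk);
        if (r < 0) {
            lf_close(fd);
            return -1;
        }
        if (r == 0)
            break;  /* end of file reached before n samples were read */
        got += (size_t)r;
    }

    lf_close(fd);
    return (int)(got / sizeof(short));
}
```

Reopening the file on every call mirrors the original readdata. For many small reads it would probably be cheaper to keep the descriptor open between calls.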



Answer 3:

My bc says:

520093696 + 4294967296 => 4815060992

I'm guessing that your print routine is 32-bit. Your offset returned is most likely correct but being chopped off somewhere.
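
The arithmetic above can be reproduced in a few lines. This is only an illustration of 32-bit truncation, and the function name low32 is made up for it. It does not identify where in the asker's program the truncation actually happens.

```c
#include <stdint.h>

/* Keeps only the low 32 bits of a 64-bit offset, which is what is left when
   the value passes through a 32-bit variable somewhere along the way. */
uint32_t low32(uint64_t offset)
{
    return (uint32_t)(offset & 0xFFFFFFFFu);
}
```

Calling low32(4815060992ULL) gives 520093696, and dividing that by sizeof(short) (2 on the platforms discussed here) gives the 260046848 samples reported in the question.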



Answer 4:

For gcc, see SO question 1035657, where the advice is to compile with the flag -D_FILE_OFFSET_BITS=64 so that the hidden variables of type off_t used by the file-positioning functions are 64 bits wide.

For MinGW: "Large-file support (LFS) has been implemented by redefining the stat and seek functions and types to their 64-bits equivalents. For fseek and ftell, separate LFS versions, fseeko and ftello, based on fsetpos and fgetpos, are provided in LibGw32C." (reference). In recent versions of gcc, fseeko and ftello are built-in and a separate library is not needed.
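
A minimal sketch of the gcc route, assuming a POSIX-like C library that provides fseeko and ftello. The function names off_t_bits and file_size_bytes are invented for this example, and the macro is defined in the source here instead of on the command line. Whether a given MinGW release honours _FILE_OFFSET_BITS is not something this sketch verifies.

```c
/* Equivalent to passing -D_FILE_OFFSET_BITS=64 to gcc.
   Both macros must be defined before any system header is included. */
#define _FILE_OFFSET_BITS 64
#define _POSIX_C_SOURCE 200809L  /* assumption: makes fseeko/ftello visible */

#include <stdio.h>
#include <stdint.h>
#include <sys/types.h>

/* Width of off_t in bits; 64 is expected when large-file support is active. */
int off_t_bits(void)
{
    return (int)(sizeof(off_t) * 8);
}

/* Returns the file size in bytes via fseeko/ftello, or -1 on error. */
int64_t file_size_bytes(const char *filename)
{
    FILE *fp = fopen(filename, "rb");
    off_t end;

    if (fp == NULL)
        return -1;

    if (fseeko(fp, 0, SEEK_END) != 0) {
        fclose(fp);
        return -1;
    }

    end = ftello(fp);  /* -1 on failure */
    fclose(fp);
    return (int64_t)end;
}
```

Checking off_t_bits() once at startup is a cheap way to confirm that the flag took effect before trusting any offset beyond 2GB.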