How to find number of characters in a file without

Posted 2019-06-17 04:21

In a project, I have to read a file and work with the number of characters in it. Is there a way to get the number of characters without reading the file character by character? Otherwise I will have to read the file twice, once just to count the characters.

Is it even possible?

5 answers
仙女界的扛把子
#2 · 2019-06-17 04:30

I think you are likely looking for a dynamic memory solution. What you actually asked is "is there a way to get the number of characters in a file without reading it?". The answer (assuming one byte per character) is yes, you can use the stat call to get the file size, and the file size in bytes is the number of characters. With UTF-8 the answer is no, but let's put that aside for the moment since just-learning computer scientists usually don't worry about internationalization.

I think the reason you want to know how many characters there are is so that you can have storage big enough to hold them all. You don't need to know how big the file is to store the whole thing.

If you have a std::vector<char>, it can start out able to hold ten characters, then grow to hold twenty, then ten thousand... And when you're done reading the file, it will hold them all, even though you never knew in advance how many there would be.
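A minimal sketch of that idea, assuming the file is read as raw bytes; `read_all` is a hypothetical name chosen for the example.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads a whole file in one pass without knowing its size.
// Returns an empty vector if the file cannot be opened.
std::vector<char> read_all(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::vector<char> data;
    char c;
    while (in.get(c)) {
        data.push_back(c);  // the vector enlarges its storage as needed
    }
    return data;
}
```

After the call, `data.size()` is the byte count, so the file is read only once.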

祖国的老花朵
#3 · 2019-06-17 04:33

Off the top of my head: look at the file size and divide it by the number of bytes a single character takes?

Problems arise when dealing with whitespace, line endings, and so on.

劳资没心,怎么记你
#4 · 2019-06-17 04:45

You can try this:

FILE *fp = ... /* open as usual */;
fseek(fp, 0L, SEEK_END);
long fileSize = ftell(fp);  /* ftell returns long; -1L signals an error */

However, this returns the number of bytes in the file, not the number of characters. It is not the same unless the encoding is known to be one byte per character (e.g. ASCII).

You'd need to "rewind" the file back to the beginning after you've learned the size:

fseek(fp, 0L, SEEK_SET);
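
To make the bytes-versus-characters distinction concrete, here is a small sketch that counts code points in UTF-8 data. It assumes the input is valid UTF-8, and `count_utf8_code_points` is a hypothetical helper. Note that it still has to look at every byte, which is why the file size alone is not enough for a multi-byte encoding.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in a buffer assumed to be valid UTF-8.
// Continuation bytes have the bit pattern 10xxxxxx; every other byte starts
// a new code point, so only those are counted.
std::size_t count_utf8_code_points(const std::string& bytes) {
    std::size_t count = 0;
    for (unsigned char b : bytes) {
        if ((b & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For example, the euro sign is three bytes in UTF-8 but counts as a single code point here.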
Deceive 欺骗
#5 · 2019-06-17 04:45

The simple answer is no. More precisely, it's system dependent: under Unix, it's possible (e.g. using stat); under Windows, it's not possible for a text file, but if you're reading the file in binary, there's a function GetFileSize which can be used.

Although not guaranteed, under all of the implementations I know (for these two platforms), seeking to the end of the file, then doing an ftell, will return something which, when converted to a sufficiently large integral type, will give the same results as the above (with the same restrictions).

Finally: why do you need this information? If it's just to allocate an appropriately sized buffer, even with a text file, GetFileSize (and ftell after seeking to the end) will return a value slightly larger than the number of bytes you can read. Your buffer will be slightly oversized, but this is generally not a problem.

做个烂人
#6 · 2019-06-17 04:52

Yes.

Seek to the end and get the position there; that position is the size.

FILE*  file = fopen("Plop", "rb");   // fopen needs a mode; binary keeps the byte count exact.
fseek(file, 0, SEEK_END);
long   size = ftell(file);       // This is the size of the file (-1L on error).
                                 // But note it is in bytes.
                                 // Also note that if you are reading it into memory, this
                                 // is the value you want unless you plan to dynamically
                                 // convert the character encoding as you read.

fseek(file, 0, SEEK_SET);        // Move the position back to the start.

In C++, streams have the same functionality:

std::ifstream   file("Plop", std::ios_base::binary);
file.seekg(0, std::ios_base::end);
std::streamoff size = file.tellg();   // -1 if the stream is in a failed state

file.seekg(0, std::ios_base::beg);