I am reading a file using Java and jcifs on Windows. I need to determine the size of the file in characters; it contains multi-byte as well as ASCII characters.
How can I achieve this efficiently? Is there an existing API in Java for it?
Thanks,
No doubt, to get the exact number of characters you have to read the file with the proper encoding. The question is how to read the file efficiently. Java NIO is the fastest way I know of to do that.
Reading into a byte buffer then runs at close to the maximum available speed (for me it was about 60 MB/s, while a disk speed test gives about 70-75 MB/s).
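A minimal sketch of the NIO approach might look like the following. It assumes a local file reachable through `java.nio.file.Path` (a jcifs share would need a different byte source), and the class and method names (`NioCharCount`, `countCharsNio`) are made up for illustration. Malformed input is replaced rather than reported, which is also an assumption.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: counts decoded chars (UTF-16 code units) in a local file.
final class NioCharCount {

    static long countCharsNio(Path file, Charset charset) throws IOException {
        // Assumption: replace bad byte sequences instead of failing on them.
        CharsetDecoder decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        ByteBuffer bytes = ByteBuffer.allocate(64 * 1024);
        CharBuffer chars = CharBuffer.allocate(64 * 1024);
        long count = 0;

        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            boolean eof = false;
            while (!eof) {
                eof = channel.read(bytes) == -1;
                bytes.flip();
                CoderResult result;
                do {
                    // Decode what we have; a multi-byte sequence split across
                    // two reads stays in 'bytes' until the rest arrives.
                    result = decoder.decode(bytes, chars, eof);
                    count += chars.position();
                    chars.clear();
                } while (result.isOverflow());
                bytes.compact(); // keep any undecoded tail for the next read
            }
            // Let the decoder emit anything it is still holding back.
            CoderResult result;
            do {
                result = decoder.flush(chars);
                count += chars.position();
                chars.clear();
            } while (result.isOverflow());
        }
        return count;
    }
}
```

The decoded characters are discarded as soon as they are counted, so memory use stays constant regardless of file size.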
To get the character count, you'll have to read the file. By specifying the correct file encoding, you ensure that Java correctly reads each character in your file.
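BufferedReader.read() returns the character read as an int in the range 0 to 65535, or -1 at the end of the stream. Each value is a UTF-16 code unit, so a character outside the Basic Multilingual Plane is counted as two. So the simple way to do it would be like this: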
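A minimal sketch of that simple loop, assuming the bytes arrive as an `InputStream` (with jcifs this would presumably be something like an `SmbFileInputStream`; that part is not shown and is an assumption). The names `SimpleCharCount` and `countCharsSimple` are illustrative only.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: counts chars one at a time with BufferedReader.read().
final class SimpleCharCount {

    static long countCharsSimple(InputStream in, Charset charset) throws IOException {
        long count = 0;
        // The charset must match the file's real encoding, or the count is wrong.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            while (reader.read() != -1) { // -1 signals end of stream
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Six chars: 'h', 'é' (two bytes in UTF-8), 'l', 'l', 'o', '\n'
        byte[] data = "h\u00e9llo\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(countCharsSimple(new ByteArrayInputStream(data), StandardCharsets.UTF_8));
    }
}
```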
You will get faster performance using Reader.read(char[]):
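A sketch of the bulk-read variant, under the same assumption that the input is available as an `InputStream`. The method name `countCharsBuffer` follows the name used in this answer; the class name `BufferedCharCount` and the 8192-char buffer size are arbitrary choices for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper: counts chars by reading them in blocks.
final class BufferedCharCount {

    static long countCharsBuffer(InputStream in, Charset charset) throws IOException {
        long count = 0;
        char[] buffer = new char[8192]; // assumed size; tune as needed
        try (Reader reader = new InputStreamReader(in, charset)) {
            int n;
            // read(char[]) returns the number of chars stored, or -1 at end of stream
            while ((n = reader.read(buffer)) != -1) {
                count += n;
            }
        }
        return count;
    }
}
```

Reading into a `char[]` avoids one method call per character, which is a plausible reason for it being faster than the single-character loop.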
Out of interest, I benchmarked these two and the NIO version suggested in Andrey's answer. I found the second example above (countCharsBuffer) to be the fastest.
(Note that all these examples include line separator characters in their counts.)