I am using Java's Arrays.sort()
function to sort a list of files by their last modified time. The sort for 245 files is taking around 5 seconds. This seems too long to me. I feel it shouldn't take more than 0.5 seconds. Is that a good assumption? What am I doing wrong? OR does this sound normal?
public static class LastModifiedComparator implements Comparator<File> {
@Override
public int compare(File f1, File f2) {
return (int)(f1.lastModified() - f2.lastModified());
}
}
File folder = new File( "C:\\Whatever\\" );
File[] filesInFolder = folder.listFiles();
logger.debug("Starting File Sort");
Arrays.sort(filesInFolder, new LastModifiedComparator());
logger.debug("Done File Sort");
Output in Log
2012-08-10 14:24:20,333 DEBUG http-8080-4 <ClassName>:73 - Starting File Sort
2012-08-10 14:24:25,915 DEBUG http-8080-4 <ClassName>:75 - Done File Sort
File.lastModified
has to go to the OS to query when the file was last modified -- it's not cached. You're doing that twice per comparison, and Arrays.sort uses a mergesort --O(n log n)
. Plugging in 245 forn
, That's about 580 comparisons, or 1100 calls to the OS to get the last modified time. That means you can get about 230 last-modified calls per second. That seems maybe a bit slow, but certainly more plausible than an in-JVM comparison taking that longAs Marko Topolnik abd NgSan points out above, the fix would be to first cache the last-modified times for all the Files. I would do it by creating a new class object that combines the File and that time, and then sort those objects. That way you'll have only 245 calls to
File.lastModified
, and the sort should take about 1/5th the time.You will need to improve your
Comparator
logic. You need to cache thelastModified()
values because the implementation of that method is quite slow. I suggest wrapping theFile
instances into a comparable object of your making that will cache the value:The sort implemented in java in
modified Quick Sorttuned Merge Sort which will have average running time complexity of O(nlogn).So we need to concentrate on your File operations like getting lastModifiedTime. Are you sure those files are local files or shared drive which takes the network latency?
Your compare operation
isn't just a getter but issues a call to get the information from file system so the high response time of sort is moreover due to the performance of
lastModified()
thancompare()
.I don't know for sure, but it sounds like it's doing disk I/O each time you read the Modified Time- thus the slowness. It might be faster to simply get the modified times in an object along with the File object, and then sort.