I am trying to retrieve files from 3 network drives using list.files
and it takes forever. When I use find
in the shell, it returns all results in less than 15 seconds.
system.time(
jnk <- list.files(c("/Volumes/massspec", "/Volumes/massspec2", "/Volumes/massspec3"),
pattern='_MA_.*_HeLa_',
recursive=TRUE))
# user system elapsed
# 1.567 6.381 309.500
Here is the equivalent shell command.
time find /Volumes/masssp* -name '*_MA_*_HeLa_*'
# real 0m13.776s
# user 0m0.361s
# sys 0m0.620s
I need a solution that works on both Windows and Unix systems. Does anyone have a good idea? The network drives hold about 120,000 files in total, but about 16 TB of data. So there are not many files, but they are very large.
Based on the comment, I wrote a little R function which should work on Windows and Unix...
It has not been tested much yet, but it works for my purpose.
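A minimal sketch of that idea, assuming the approach is to shell out to the platform's native tool (find on Unix, dir /s /b on Windows) rather than call list.files. The name fast_list_files and its arguments are hypothetical, and the Windows branch is an untested assumption. Note that it takes a shell glob rather than a regular expression and returns full paths.

```r
# Hypothetical helper, not the original function: list files recursively by
# calling the operating system's own search tool instead of list.files().
#   dirs: character vector of existing directories
#   glob: shell-style file name pattern, e.g. "*_MA_*_HeLa_*"
fast_list_files <- function(dirs, glob) {
  dirs <- normalizePath(dirs, mustWork = TRUE)
  if (.Platform$OS.type == "windows") {
    # Assumption (untested): "dir /s /b /a-d" prints bare full paths of
    # matching files, recursing into subdirectories and skipping directories.
    out <- unlist(lapply(dirs, function(d) {
      cmd <- paste("dir /s /b /a-d", shQuote(paste(d, glob, sep = "\\"), type = "cmd"))
      suppressWarnings(shell(cmd, intern = TRUE))
    }))
  } else {
    # Quote every argument so the shell does not expand the glob itself.
    out <- system2("find",
                   c(shQuote(dirs), "-type", "f", "-name", shQuote(glob)),
                   stdout = TRUE)
  }
  # Drop the "status" attribute that a non-zero exit code would attach.
  as.character(out)
}
```

For the drives above, the call would look like fast_list_files(c("/Volumes/massspec", "/Volumes/massspec2", "/Volumes/massspec3"), "*_MA_*_HeLa_*").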