How to automate subsetting multiple files using R

Posted 2019-09-21 01:56

Question:

Hi guys, I am new to R and I am comfortable creating subsets when I handle one file at a time, but I am having trouble automating that for multiple files. In my case, I want to automate subsetting multiple CSV files that sit in multiple subfolders of a given folder. I want to create subset files that contain, say, the first 100 rows of each file, write them to new files, and give each subset file the same name as the file it was taken from. Any help appreciated. Thanks!

Answer 1:

I created a couple of subfolders in my folder Temp. The code below assumes that the working directory is Temp and that each dataset has at least 100 rows.

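# Collect every CSV file below the working directory, including subfolders
files <- list.files(pattern="\\.csv$", recursive=TRUE, full.names=TRUE)
files
#[1] "./Temp1/file1.csv"   "./Temp2/file2_2.csv" "./Temp2/file2.csv" 

# Read each file and keep its first 100 rows.
# sep='' splits fields on whitespace; remove it if the files are comma-separated.
lst1 <- lapply(files, function(x) head(read.csv(x, sep=''), 100))
# Folder part of each path, e.g. "./Temp1"
Pref <- dirname(files)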

The subset files are then written to the same folders as the original files, with the prefix 'Subset' added to each file name.

# Write each subset next to its source file as "Subset<original name>"
invisible(lapply(seq_along(lst1), function(i)
    write.csv(lst1[[i]], file.path(Pref[i], paste0('Subset', basename(files[i]))),
              quote=FALSE, row.names=FALSE)))

list.files(recursive=TRUE, full.names=TRUE)
#[1] "./Temp1/file1.csv"         "./Temp1/Subsetfile1.csv"  
#[3] "./Temp2/file2_2.csv"       "./Temp2/file2.csv"        
#[5] "./Temp2/Subsetfile2_2.csv" "./Temp2/Subsetfile2.csv"  
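
If the subset files really need to keep exactly the same names as the originals, one option is to write them into a separate output folder that mirrors the subfolder layout. This is only a rough sketch, not part of the answer above: the helper name subset_csv_tree and its arguments in_dir, out_dir and n are made up for illustration, it assumes ordinary comma-separated files, and it assumes out_dir is not inside in_dir.

```r
# Hypothetical helper (not from the answer above): copy the first n rows of
# every CSV under in_dir into the same relative path under out_dir.
subset_csv_tree <- function(in_dir, out_dir, n = 100) {
  # Paths relative to in_dir, e.g. "Temp1/file1.csv"
  rel <- list.files(in_dir, pattern = "\\.csv$", recursive = TRUE)
  for (f in rel) {
    # nrows = n stops reading after n data rows; shorter files are kept whole.
    # check.names = FALSE keeps the column headers unchanged.
    dat <- read.csv(file.path(in_dir, f), nrows = n, check.names = FALSE)
    target <- file.path(out_dir, f)
    # Recreate the subfolder under out_dir before writing
    dir.create(dirname(target), recursive = TRUE, showWarnings = FALSE)
    write.csv(dat, target, row.names = FALSE)
  }
  # Return the paths that were written, without printing them
  invisible(file.path(out_dir, rel))
}

# Example run on throwaway data in a temporary directory
src <- file.path(tempdir(), "Temp")
dir.create(file.path(src, "Temp1"), recursive = TRUE, showWarnings = FALSE)
write.csv(data.frame(id = 1:250, x = 251:500),
          file.path(src, "Temp1", "file1.csv"), row.names = FALSE)
out <- file.path(tempdir(), "Temp_subset")
written <- subset_csv_tree(src, out)
```

Reading with nrows means R does not load the whole file just to keep the top of it, which may matter for large files. The originals stay untouched, and each subset ends up under out_dir with the same file name and subfolder as its source.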


Tags: r csv subset