R - split data frame and save to different files

Posted 2020-07-29 05:27

Question:

I have a data frame with monthly temperature data for several locations:

    > df4[1:36,]
       location    variable cut month year freq
1    Adamantina temperature  10   Jan 1981 21.0
646  Adamantina temperature  10   Feb 1981 20.5
1291 Adamantina temperature  10   Mar 1981 21.5
1936 Adamantina temperature  10   Apr 1981 21.5
2581 Adamantina temperature  10   May 1981 24.0
3226 Adamantina temperature  10   Jun 1981 21.5
3871 Adamantina temperature  10   Jul 1981 22.5
4516 Adamantina temperature  10   Aug 1981 23.5
5161 Adamantina temperature  10   Sep 1981 19.5
5806 Adamantina temperature  10   Oct 1981 21.5
6451 Adamantina temperature  10   Nov 1981 23.0
7096 Adamantina temperature  10   Dec 1981 19.0
2        Adolfo temperature  10   Jan 1981 24.0
647      Adolfo temperature  10   Feb 1981 20.0
1292     Adolfo temperature  10   Mar 1981 24.0
1937     Adolfo temperature  10   Apr 1981 23.0
2582     Adolfo temperature  10   May 1981 18.0
3227     Adolfo temperature  10   Jun 1981 21.0
3872     Adolfo temperature  10   Jul 1981 22.0
4517     Adolfo temperature  10   Aug 1981 19.0
5162     Adolfo temperature  10   Sep 1981 19.0
5807     Adolfo temperature  10   Oct 1981 24.0
6452     Adolfo temperature  10   Nov 1981 24.0
7097     Adolfo temperature  10   Dec 1981 24.0
3         Aguai temperature  10   Jan 1981 24.0
648       Aguai temperature  10   Feb 1981 20.0
1293      Aguai temperature  10   Mar 1981 22.0
1938      Aguai temperature  10   Apr 1981 20.0
2583      Aguai temperature  10   May 1981 21.5
3228      Aguai temperature  10   Jun 1981 20.5
3873      Aguai temperature  10   Jul 1981 24.0
4518      Aguai temperature  10   Aug 1981 23.5
5163      Aguai temperature  10   Sep 1981 18.5
5808      Aguai temperature  10   Oct 1981 21.0
6453      Aguai temperature  10   Nov 1981 22.0
7098      Aguai temperature  10   Dec 1981 23.5

What I need to do is to programmatically split this data frame by location and create a .Rdata file for every location.

In the example above, I would have three different files - Adamantina.Rdata, Adolfo.Rdata and Aguai.Rdata - containing all the columns but only the rows corresponding to those locations.

It needs to be efficient and programmatic, because in my actual data I have about 700 different locations and about 50 years of data for every location.

Thanks in advance.

Answer 1:

This borrows from a previous answer, but I don't believe that answer does what you want.

First, as they suggest, you want to split up your data set.

splitData <- split(df4, df4$location)

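Now go through this list and save each data set one at a time, using the names of the list to build the file names:

allNames <- names(splitData)
for (thisName in allNames) {
    # Build the file name from the location name, e.g. "Adolfo.Rdata"
    saveName <- paste0(thisName, '.Rdata')
    # saveRDS() writes a single object; read it back with readRDS(), not load()
    saveRDS(splitData[[thisName]], file = saveName)
}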
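If the files need to be readable with load() rather than readRDS(), a variation along these lines may work. This is only a sketch: the toy df4, the temporary output directory and the scratch environments are assumptions made for illustration, not part of the original answer.

```r
# Toy stand-in for df4 (hypothetical values, not the asker's data)
df4 <- data.frame(
  location = rep(c("Adamantina", "Adolfo", "Aguai"), each = 2),
  month    = rep(c("Jan", "Feb"), times = 3),
  year     = 1981,
  freq     = c(21.0, 20.5, 24.0, 20.0, 24.0, 20.0)
)

out_dir <- tempdir()  # assumption: write to a temporary directory
splitData <- split(df4, df4$location)

for (thisName in names(splitData)) {
  # Store the subset under the location's name in a scratch environment,
  # so that load() later restores an object called e.g. `Adolfo`
  env <- new.env()
  assign(thisName, splitData[[thisName]], envir = env)
  save(list = thisName, envir = env,
       file = file.path(out_dir, paste0(thisName, ".Rdata")))
}

# Read one file back into its own environment
restored <- new.env()
load(file.path(out_dir, "Adolfo.Rdata"), envir = restored)
adolfo <- get("Adolfo", envir = restored)
```

The trade-off is that load() restores the object under its saved name, whereas readRDS() lets the caller choose the name on assignment.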


Answer 2:

To split the data frame, use split(df4, df4$location). It returns a list of data frames whose elements are named Adamantina, Adolfo, Aguai, etc.

To save these data frames into a single locations.RData file, use save(Adamantina, Adolfo, Aguai, file="locations.RData"). This requires the data frames to exist as separate objects in the workspace, which list2env() can arrange from the list. save.image(file="filename.RData") will save everything in the current R session into filename.RData.

You can read more about save and save.image on their R help page (?save).
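As a rough sketch of the single-file approach, the list returned by split() can be unpacked into separate objects and saved without typing each name. The toy df4, the scratch environment and the temporary file path are assumptions for illustration only.

```r
# Toy stand-in for df4 (hypothetical values, not the asker's data)
df4 <- data.frame(
  location = rep(c("Adamantina", "Adolfo", "Aguai"), each = 2),
  month    = rep(c("Jan", "Feb"), times = 3),
  year     = 1981,
  freq     = c(21.0, 20.5, 24.0, 20.0, 24.0, 20.0)
)

# Unpack the list into a scratch environment, one object per location,
# so the global workspace is not cluttered
env <- new.env()
list2env(split(df4, df4$location), envir = env)

rdata_file <- file.path(tempdir(), "locations.RData")
# list = ls(env) passes every object name as a character vector
save(list = ls(env), envir = env, file = rdata_file)

# Loading restores one data frame per location
restored <- new.env()
load(rdata_file, envir = restored)
ls(restored)
```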

Edit:

If the number of splits is too large to list by hand, use this approach:

locations <- split(df4, df4$location)
save(locations, file = "locations.RData")

Calling load("locations.RData") will then restore locations as a list of data frames.
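A minimal round trip might look like the following. The toy df4 and the temporary file path are assumptions for illustration, not the asker's data.

```r
# Toy stand-in for df4 (hypothetical values, not the asker's data)
df4 <- data.frame(
  location = rep(c("Adamantina", "Adolfo", "Aguai"), each = 2),
  month    = rep(c("Jan", "Feb"), times = 3),
  year     = 1981,
  freq     = c(21.0, 20.5, 24.0, 20.0, 24.0, 20.0)
)

locations <- split(df4, df4$location)
rdata_file <- file.path(tempdir(), "locations.RData")
save(locations, file = rdata_file)

# Simulate a fresh session, then restore the list
rm(locations)
load(rdata_file)

# Each location is one element of the list
names(locations)
aguai <- locations[["Aguai"]]
```

Individual locations are then pulled out by name with [[ ]], so nothing depends on how many locations there are.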