Suppose we have a folder containing multiple data.csv files, each containing the same number of variables but each from different times. Is there a way in R to import them all simultaneously rather than having to import them all individually?
My problem is that I have around 2000 data files to import and having to import them individually just by using the code:
read.delim(file="filename", header=TRUE, sep="\t")
is not very efficient.
As well as using lapply or some other looping construct in R, you could merge your CSV files into one file.

In Unix, if the files had no headers, then it's as easy as:
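For example, a minimal sketch (the part*.csv names and the all.csv output are illustrative):

```shell
# Demo setup: two small header-less CSV files (names are illustrative)
printf '1,a\n2,b\n' > part1.csv
printf '3,c\n4,d\n' > part2.csv

# Concatenate them into a single file; with no headers there is nothing to strip
cat part*.csv > all.csv
```

Note that a bare cat *.csv > all.csv run inside the same directory can pick up the output file itself on a second run, so use a narrower glob or write the output elsewhere.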
or if there are headers, and you can find a string that matches headers and only headers (i.e., suppose the header lines all start with "Age"), you'd do:
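A sketch of that header-stripping version, assuming every header line (and only header lines) starts with "Age":

```shell
# Demo setup: two CSVs whose header lines both start with "Age" (illustrative)
printf 'Age,Name\n30,x\n' > part1.csv
printf 'Age,Name\n40,y\n' > part2.csv

# Keep one copy of the header, then append every line that is not a header
head -n 1 part1.csv > all.csv
cat part*.csv | grep -v '^Age' >> all.csv
```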
I think in Windows you could do this with COPY and SEARCH (or FIND or something) from the DOS command box, but why not install cygwin and get the power of the Unix command shell?

Something like the following should result in each data frame as a separate element in a single list:
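A minimal sketch of that idea (the temporary directory and sample files below exist only to make the example self-contained):

```r
# Demo setup: write two small CSVs into a temporary directory (illustrative only)
dir_path <- file.path(tempdir(), "csv_demo")
dir.create(dir_path, showWarnings = FALSE)
write.csv(data.frame(x = 1:2), file.path(dir_path, "a.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:4), file.path(dir_path, "b.csv"), row.names = FALSE)

# The pattern itself: list the files, then read each one into a list element
temp <- list.files(path = dir_path, pattern = "\\.csv$", full.names = TRUE)
myfiles <- lapply(temp, read.csv)
```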
This assumes that you have those CSVs in a single directory (your current working directory) and that all of them have the lower-case extension .csv.

If you then want to combine those data frames into a single data frame, see the solutions in other answers using things like
do.call(rbind, ...), dplyr::bind_rows() or data.table::rbindlist().

If you really want each data frame in a separate object, even though that's often inadvisable, you could do the following with assign:
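A sketch of the assign route (the sample files are illustrative):

```r
# Demo setup: two sample CSVs in a temporary directory (illustrative only)
dir_path <- file.path(tempdir(), "csv_assign_demo")
dir.create(dir_path, showWarnings = FALSE)
write.csv(data.frame(x = 1:2), file.path(dir_path, "a.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:4), file.path(dir_path, "b.csv"), row.names = FALSE)

# One object per file, named after the file itself
temp <- list.files(path = dir_path, pattern = "\\.csv$")
for (i in seq_along(temp)) {
  assign(temp[i], read.csv(file.path(dir_path, temp[i])))
}
```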
Or, without assign, and to demonstrate (1) how the file name can be cleaned up and (2) how to use list2env, you can try the following.

But again, it's often better to leave them in a single list.
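That name clean-up plus list2env combination might look like this sketch (file names are illustrative):

```r
# Demo setup: CSVs with spaces in their names (illustrative only)
dir_path <- file.path(tempdir(), "csv_env_demo")
dir.create(dir_path, showWarnings = FALSE)
write.csv(data.frame(x = 1:2), file.path(dir_path, "my data 1.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:4), file.path(dir_path, "my data 2.csv"), row.names = FALSE)

temp <- list.files(path = dir_path, pattern = "\\.csv$", full.names = TRUE)
# Clean the names: drop the path, the extension, and any spaces
clean_names <- gsub(" ", "", tools::file_path_sans_ext(basename(temp)))
# Read everything into a named list, then dump it into the global environment
my_list <- setNames(lapply(temp, read.csv), clean_names)
list2env(my_list, envir = .GlobalEnv)
```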
I use this successfully:
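One pattern along these lines (the files and names below are purely illustrative, not the original snippet):

```r
# Demo setup (illustrative only)
dir_path <- file.path(tempdir(), "csv_stack_demo")
dir.create(dir_path, showWarnings = FALSE)
write.csv(data.frame(x = 1:2), file.path(dir_path, "a.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:4), file.path(dir_path, "b.csv"), row.names = FALSE)

# Read every CSV into a named list, then stack them into one data frame
files <- list.files(path = dir_path, pattern = "\\.csv$", full.names = TRUE)
all_data <- setNames(lapply(files, read.csv), basename(files))
combined <- do.call(rbind, all_data)
```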
In my view, most of the other answers are obsoleted by rio::import_list, which is a succinct one-liner. Any extra arguments are passed to rio::import. rio can deal with almost any file format R can read, and it uses data.table's fread where possible, so it should be fast too.

Building on dnlbrk's comment, assign can be considerably faster than list2env for big files.
By setting the full.names argument to TRUE, you will get the full path to each file as a separate character string in your list of files, e.g., List_of_file_paths[1] will be something like "C:/Users/Anon/Documents/Folder_with_csv_files/file1.csv".
You could use the data.table package's fread or base R read.csv instead of read_csv. The file_name step lets you tidy up the name so that each data frame is not left with the full path to the file as its name. You could extend your loop to do further things to the data table before transferring it to the global environment, for example:
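A sketch of that loop with base read.csv standing in for fread (swap in data.table::fread for speed; paths and names are illustrative):

```r
# Demo setup (illustrative only)
dir_path <- file.path(tempdir(), "csv_loop_demo")
dir.create(dir_path, showWarnings = FALSE)
write.csv(data.frame(x = 1:2), file.path(dir_path, "file1.csv"), row.names = FALSE)

# full.names = TRUE returns complete paths, so the loop works from any directory
List_of_file_paths <- list.files(path = dir_path, pattern = "\\.csv$", full.names = TRUE)
for (file_path in List_of_file_paths) {
  # Tidy the name so the object is not called by its full path
  file_name <- tools::file_path_sans_ext(basename(file_path))
  df <- read.csv(file_path)
  # ...any per-file processing would go here...
  assign(file_name, df, envir = .GlobalEnv)
}
```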
This is the code I developed to read all the CSV files into R. It will create a data frame for each CSV file individually and title that data frame with the file's original name (removing spaces and the .csv extension). I hope you find it useful!
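A sketch matching that description (one data frame per file, named after the file with spaces and the .csv extension removed; the sample file is illustrative):

```r
# Demo setup: a CSV whose name contains a space (illustrative only)
dir_path <- file.path(tempdir(), "csv_name_demo")
dir.create(dir_path, showWarnings = FALSE)
write.csv(data.frame(x = 1:3), file.path(dir_path, "my file.csv"), row.names = FALSE)

csv_files <- list.files(path = dir_path, pattern = "\\.csv$", full.names = TRUE)
for (f in csv_files) {
  # Object name: the file name minus spaces and the .csv extension
  obj_name <- gsub(" ", "", sub("\\.csv$", "", basename(f)))
  assign(obj_name, read.csv(f), envir = .GlobalEnv)
}
```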