Can anyone please tell me how to read only the first 6 months (7 columns) for each year of the data below, for example by using read.table()?
Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2009 -41 -27 -25 -31 -31 -39 -25 -15 -30 -27 -21 -25
2010 -41 -27 -25 -31 -31 -39 -25 -15 -30 -27 -21 -25
2011 -21 -27 -2 -6 -10 -32 -13 -12 -27 -30 -38 -29
To read a specific set of columns from a dataset, there are several other options:
1) With fread from the data.table package:
You can specify the desired columns with the select parameter of fread from the data.table package, using a vector of either column names or column numbers. For the example dataset, that means the columns Year through Jun (columns 1 to 7).
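Alternatively, you can use the drop parameter to indicate which columns should not be read; for the example dataset, that means Jul through Dec (columns 8 to 13). Both approaches return the same seven columns, Year through Jun.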
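As a minimal sketch of the select and drop variants, assuming the data.table package is installed and the example data has been saved to a file (a temporary file stands in for data.txt here; the variable names are only illustrative):

```r
library(data.table)

# Write the example data from the question to a temporary file
# (assumption: this file plays the role of data.txt).
path <- tempfile(fileext = ".txt")
writeLines(c(
  "Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec",
  "2009 -41 -27 -25 -31 -31 -39 -25 -15 -30 -27 -21 -25",
  "2010 -41 -27 -25 -31 -31 -39 -25 -15 -30 -27 -21 -25",
  "2011 -21 -27 -2 -6 -10 -32 -13 -12 -27 -30 -38 -29"
), path)

# Keep columns by name or by position
by_name <- fread(path, select = c("Year", "Jan", "Feb", "Mar", "Apr", "May", "Jun"))
by_pos  <- fread(path, select = 1:7)

# Or drop the unwanted columns instead
drop_name <- fread(path, drop = c("Jul", "Aug", "Sep", "Oct", "Nov", "Dec"))
drop_pos  <- fread(path, drop = 8:13)

print(by_name)
```

Selecting by name is usually the more robust choice when the column order of the file might change.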
UPDATE: When you don't want fread to return a data.table, use the data.table = FALSE parameter, e.g.: fread("data.txt", select = c(1:7), data.table = FALSE)
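2) With read.csv.sql from the sqldf package:
Another alternative is the read.csv.sql function from the sqldf package. Its sql argument takes a SELECT statement that names the columns to keep; inside the query the input is referred to as file, and sep has to match the file's delimiter.
3) With the read_* functions from the readr package:
The col_types argument controls which columns are read. From the documentation, the characters used in a compact col_types string are: c = character, i = integer, n = number, d = double, l = logical, f = factor, D = date, T = date time, t = time, ? = guess, and _ or - to skip the column.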
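A minimal sketch of the readr approach, assuming the readr package is installed. read_delim is used here as one member of the read_* family; the temporary file and the variable names are assumptions standing in for data.txt and are not part of the original answer:

```r
library(readr)

# Write the example data from the question to a temporary file
# (assumption: this file plays the role of data.txt).
path <- tempfile(fileext = ".txt")
writeLines(c(
  "Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec",
  "2009 -41 -27 -25 -31 -31 -39 -25 -15 -30 -27 -21 -25",
  "2010 -41 -27 -25 -31 -31 -39 -25 -15 -30 -27 -21 -25",
  "2011 -21 -27 -2 -6 -10 -32 -13 -12 -27 -30 -38 -29"
), path)

# Compact string: 7 integer columns ("i"), then 6 skipped columns ("_")
compact <- read_delim(path, delim = " ", col_types = "iiiiiii______")

# Equivalent: name only the columns to keep; all others are not read
named <- read_delim(path, delim = " ", col_types = cols_only(
  Year = col_integer(), Jan = col_integer(), Feb = col_integer(),
  Mar = col_integer(), Apr = col_integer(), May = col_integer(),
  Jun = col_integer()
))

print(compact)
```

The compact string depends on column order, while cols_only depends only on the column names.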
Say the data are in file data.txt. You can use the colClasses argument of read.table() to skip columns. Here the data in the first 7 columns are "integer" and we set the remaining 6 columns to "NULL", indicating they should be skipped.
Change "integer" to one of the accepted types as detailed in ?read.table, depending on the real type of the data.
Here data.txt contains the example data from the question.
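A minimal sketch of the colClasses approach in base R. The temporary file stands in for data.txt, and the second variant, which uses count.fields to work out how many columns to skip instead of hard-coding 6, is an illustrative addition:

```r
# Write the example data from the question to a temporary file
# (assumption: this file plays the role of data.txt).
path <- tempfile(fileext = ".txt")
writeLines(c(
  "Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec",
  "2009 -41 -27 -25 -31 -31 -39 -25 -15 -30 -27 -21 -25",
  "2010 -41 -27 -25 -31 -31 -39 -25 -15 -30 -27 -21 -25",
  "2011 -21 -27 -2 -6 -10 -32 -13 -12 -27 -30 -38 -29"
), path)

# First 7 columns are read as integer; the remaining 6 are skipped via "NULL"
dat <- read.table(path, header = TRUE,
                  colClasses = c(rep("integer", 7), rep("NULL", 6)))

# Variant: derive the total number of columns from the file itself
n_cols <- max(count.fields(path))   # 13 for the example data
dat2 <- read.table(path, header = TRUE,
                   colClasses = c(rep("integer", 7), rep("NULL", n_cols - 7)))

print(dat)
```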
If the number of columns is not known beforehand, the utility function count.fields will read through the file and count the number of fields in each line.
You could also use JDBC to achieve this. Let's create a sample csv file.
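The original file-creation code is not shown, so the following is only a sketch of one way to produce such a file from the question's data. The file name sample.csv and its location in the session's temporary directory are assumptions:

```r
# Example data from the question, parsed from inline text
dat <- read.table(header = TRUE, text = "
Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2009 -41 -27 -25 -31 -31 -39 -25 -15 -30 -27 -21 -25
2010 -41 -27 -25 -31 -31 -39 -25 -15 -30 -27 -21 -25
2011 -21 -27 -2 -6 -10 -32 -13 -12 -27 -30 -38 -29
")

# Hypothetical output path; adjust to wherever the CSV should live
csv_path <- file.path(tempdir(), "sample.csv")

# Comma-separated, with a header row and no row names
write.csv(dat, csv_path, row.names = FALSE, quote = FALSE)
```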
Download and save the CSV JDBC driver from this link: http://sourceforge.net/projects/csvjdbc/files/latest/download