I am a beginner in R and can't find a solution to the following problem. Any help would be really appreciated!
I have a data.frame and want to replace certain values in a column with other, predefined values.
data.frame
date<-c("19921231","19931231","19941231","19941231","19931231","19941231")
variable<-c("a","a","a","b","b","b")
value<-c(1:6)
dataframe <- data.frame(date,variable,value)
attempt to solve problem
yearend<-c("19921231","19931231","19941231")
year<-c("1992","1993","1994")
map = setNames(yearend,year)
dataframe[] = map[dataframe]
error message
Error in map[dataframe] : invalid subscript type 'list'
The problem is obviously that the data.frame is a list, not a matrix or vector, so it cannot be used as a subscript. What is the most efficient way to solve this? It should also work if I want to replace arbitrary strings, e.g. "BGSFDS" with "BASF stock".
A nice function is mapvalues() from the plyr package:
require(plyr)
dataframe$newdate <- mapvalues(dataframe$date,
                               from = c("19921231", "19931231", "19941231"),
                               to = c("1992", "1993", "1994"))
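If you would rather avoid the extra package, the named-vector idea from the question can be made to work in base R. The sketch below assumes two things that differ from the original attempt: the names of the lookup vector are the old values (the question had them the other way round), and the vector is indexed with a single column converted to character rather than with the whole data.frame, which is what caused the 'invalid subscript type' error.

```r
# Rebuild the example data from the question.
date <- c("19921231", "19931231", "19941231", "19941231", "19931231", "19941231")
variable <- c("a", "a", "a", "b", "b", "b")
value <- 1:6
dataframe <- data.frame(date, variable, value)

# Lookup table as a named vector: names are the old values, elements the new ones.
yearend <- c("19921231", "19931231", "19941231")
year <- c("1992", "1993", "1994")
map <- setNames(year, yearend)

# Index with a character vector so the lookup goes by name.
# as.character() matters when the column is a factor (the default in R < 4.0),
# because a factor would otherwise index by its integer codes.
dataframe$newdate <- unname(map[as.character(dataframe$date)])
```

Values that have no entry in map come back as NA with this approach.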
merge() might also be of help.
yearend<-c("19921231","19931231","19941231")
year<-c("1992","1993","1994")
map = data.frame(yearend,year)
merge(dataframe,map,by.x='date',by.y='yearend')
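One thing to watch with merge(): by default it sorts the result by the key column, so the rows no longer come back in their original order. A possible workaround, sketched here with a hypothetical helper column row_id that is not part of the original data, is to number the rows first and restore the order afterwards. all.x = TRUE is added on the assumption that rows without a match in the lookup table should be kept (with NA) rather than dropped.

```r
# Rebuild the example data from the question.
date <- c("19921231", "19931231", "19941231", "19941231", "19931231", "19941231")
variable <- c("a", "a", "a", "b", "b", "b")
value <- 1:6
dataframe <- data.frame(date, variable, value)

# Lookup table as a two-column data.frame.
map <- data.frame(yearend = c("19921231", "19931231", "19941231"),
                  year = c("1992", "1993", "1994"))

# row_id is a hypothetical helper column used only to remember the row order.
dataframe$row_id <- seq_len(nrow(dataframe))

# Left join: keep every row of dataframe, even without a match in map.
merged <- merge(dataframe, map, by.x = "date", by.y = "yearend", all.x = TRUE)

# Restore the original order and drop the helper column.
merged <- merged[order(merged$row_id), ]
merged$row_id <- NULL
rownames(merged) <- NULL
```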
If you only want to extract the year from the date, you can do this with the following line of code:
dataframe$year <- substr(dataframe$date,1,4)
If you also want to assign a class to the new variable at the same time:
dataframe$year <- as.integer(substr(dataframe$date,1,4))
You can use match:
dataframe <- transform(dataframe, Year = year[match(date, yearend)])
      date variable value Year
1 19921231        a     1 1992
2 19931231        a     2 1993
3 19941231        a     3 1994
4 19941231        b     4 1994
5 19931231        b     5 1993
6 19941231        b     6 1994
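For the second part of the question (replacing arbitrary strings such as "BGSFDS"), the same match() idea can be wrapped in a small function. This is only a sketch: replace_values is a hypothetical helper name, and the ticker vector is made-up example data. Unlike plain indexing with match(), it leaves values without a match unchanged instead of turning them into NA.

```r
# Hypothetical helper: replace every value of x found in `from` with the
# corresponding entry of `to`; values without a match are left as they are.
replace_values <- function(x, from, to) {
  stopifnot(length(from) == length(to))
  x <- as.character(x)   # also handles factor columns
  idx <- match(x, from)  # NA where x has no entry in `from`
  hit <- !is.na(idx)
  x[hit] <- to[idx[hit]]
  x
}

# Made-up example data: only "BGSFDS" has a replacement defined.
ticker <- c("BGSFDS", "XYZ", "BGSFDS")
result <- replace_values(ticker, from = "BGSFDS", to = "BASF stock")
print(result)
```

Applied to the data above, replace_values(dataframe$date, yearend, year) would give the same years as the transform() call.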