I have an R dataframe and I'm trying to subtract one column from another. I extract the columns using the $
operator but the class of the columns is 'factor' and R won't perform arithmetic operations on factors. Are there special functions to do this?
You should double check how you're pulling in the data first. If these are truly numeric columns, R should recognize them as such (Excel sometimes gets in the way). Either way, the columns are probably being coerced to factors because they contain other, non-numeric entries. The responses that you've received so far haven't mentioned that as.numeric() applied directly to a factor only returns the level numbers. That means you won't be performing the operation on the actual numbers that were converted to factors, but rather on the integer codes associated with each level.
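A minimal sketch of both points, using made-up data (the CSV text and the "n/a" marker are assumptions for illustration). It assumes the import used stringsAsFactors = TRUE, which was the default before R 4.0.0:

```r
# Hypothetical input: the stray "n/a" prevents column a from being read as numeric.
csv_text <- "a,b\n10,1\n20,2\nn/a,3\n30,4"
dat <- read.csv(text = csv_text, stringsAsFactors = TRUE)

class(dat$a)       # "factor", because of the non-numeric entry
class(dat$b)       # "integer"

levels(dat$a)      # the distinct strings that were read in
as.numeric(dat$a)  # integer level codes, not the original values

# Declaring the stray marker as missing at import time keeps the column numeric.
fixed <- read.csv(text = csv_text, na.strings = "n/a")
class(fixed$a)     # "integer", with NA in place of "n/a"
```

Fixing the import this way avoids the factor conversion altogether, which is usually preferable to converting afterwards.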
You can define your own operators to do that; see ?Arith. Without group generics, you can define your own binary operators of the form %operator%.
If you really want the factor's internal level codes to be used, you're either doing something very wrong or being too clever for your own good.
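As a sketch of the custom binary operator idea (%operator%) mentioned above: the operator name %minus% and the helper to_num are hypothetical, chosen only for this illustration, and the operator converts factor operands through their labels before subtracting.

```r
# Hypothetical helper: turn a factor into numbers via its labels; pass other input through.
to_num <- function(x) if (is.factor(x)) as.numeric(as.character(x)) else x

# Hypothetical user-defined binary operator for subtracting factor-encoded numbers.
`%minus%` <- function(e1, e2) to_num(e1) - to_num(e2)

x <- factor(c("10", "20", "30"))
y <- factor(c("1", "2", "3"))

x %minus% y  # 9 18 27
```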
If what you have is a factor containing numbers stored in the levels of the factor, then you want to coerce it to numeric first using as.numeric(as.character(...)). Accessing the factor's indices and recovering the factor's contents give different results.
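A small illustration of that difference, with made-up values:

```r
# Numbers stored as factor labels; the levels are sorted as strings: "10", "3.5", "7".
f <- factor(c("3.5", "10", "3.5", "7"))

as.numeric(f)                # 2 1 2 3   (positions within levels(f))
as.numeric(as.character(f))  # 3.5 10.0 3.5 7.0   (the actual contents)
```

The first call silently produces plausible-looking numbers, which is why this mistake is easy to miss.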
Timing this against an alternative approach that converts only the levels shows that the alternative is faster when levels are not unique to each element.
Therefore, if length(levels(dat$f)) < length(dat$f), use as.numeric(levels(dat$f))[dat$f] for a substantial speed gain. If length(levels(dat$f)) is approximately equal to length(dat$f), there is no speed gain.
You'll need to convert the factors to numeric vectors.
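Putting this together for the original question, here is a sketch with a made-up data frame (the column names f and g and the new column diff are assumptions). It uses both conversions discussed above and checks that they agree:

```r
# Hypothetical data frame with numbers stored as factor labels.
dat <- data.frame(
  f = factor(c("5", "12", "5", "8")),
  g = factor(c("1", "2", "3", "4"))
)

# Conversion via the label of every element.
f_num <- as.numeric(as.character(dat$f))
# Conversion of the levels only, then indexing by the factor's codes.
g_num <- as.numeric(levels(dat$g))[dat$g]

dat$diff <- f_num - g_num
dat$diff  # 4 10 2 4

# Both conversions recover the same values.
identical(as.numeric(levels(dat$f))[dat$f], f_num)  # TRUE
```

To compare the speed of the two conversions on your own data, wrapping each in system.time() is one option; the result will depend on how many distinct levels the factor has.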