How do you convert a data frame column to a numeric type?
While your question is strictly about numeric conversion, many type conversions are difficult to understand when you are beginning R. I'll aim to cover methods that help. This question is similar to This Question.
Type conversion can be a pain in R because (1) factors can't be converted directly to numeric, they need to be converted to character class first, (2) dates are a special case that you typically need to deal with separately, and (3) looping across data frame columns can be tricky. Fortunately, the "tidyverse" has solved most of the issues.
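A minimal sketch of the kind of pipeline this answer goes on to describe, assuming the dplyr and lubridate packages are installed. The data frame `data_df`, its column names and its values are hypothetical. `mutate_all()` stands in for the older `mutate_each()`, and `ymd_hms()` is just one lubridate parser that fits this timestamp format.

```r
library(dplyr)
library(lubridate)

# Hypothetical example data: every column starts out as character.
data_df <- data.frame(
  id        = c("1", "2", "3"),
  value     = c("10.5", "20.1", "30.7"),
  label     = c("a", "b", "c"),
  timestamp = c("2016-01-01 10:00:00", "2016-01-02 11:30:00",
                "2016-01-03 12:45:00"),
  stringsAsFactors = FALSE
)

converted <- data_df %>%
  # Convert strings to numeric where possible; the rest become factors.
  mutate_all(type.convert, as.is = FALSE) %>%
  # Turn the factor columns back into character.
  mutate_if(is.factor, as.character) %>%
  # Parse the character timestamp into a date-time (POSIXct).
  mutate(timestamp = ymd_hms(timestamp))

str(converted)
```

Passing `as.is = TRUE` to `type.convert()` would skip the factor detour altogether; `as.is = FALSE` is used here only to mirror the steps in the text.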
This solution uses mutate_each() to apply a function to all columns in a data frame (later dplyr releases deprecate mutate_each() in favor of mutate_all() and across()). In this case, we want to apply the type.convert() function, which converts strings to numeric where it can. Because R loves factors (not sure why), character columns that should stay character get changed to factor. To fix this, the mutate_if() function is used to detect columns that are factors and change them back to character. Last, I wanted to show how lubridate can be used to change a timestamp in character class to date-time, because this is also often a sticking point for beginners.
On my PC (R v.3.2.3), apply or sapply give an error. lapply works well.
Since (still) nobody has got a check mark, I assume that you have some practical issue in mind, mostly because you haven't specified what type of vector you want to convert to numeric. I suggest that you apply the transform function to complete your task.
Now I'm about to demonstrate a certain "conversion anomaly":
Let us have a glance at the data.frame and let us run:
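A data frame matching the description in this answer might be built and inspected as follows. The names `char`, `fake_char` and `char_fac` come from the text; `fac`, `num` and all the values are assumptions made for this sketch.

```r
# Dummy data frame: two character columns, two factors, one numeric.
d <- data.frame(
  char      = letters[1:5],           # genuine text
  fake_char = as.character(1:5),      # numbers stored as text
  fac       = factor(1:5),            # factor with numeric-looking levels
  char_fac  = factor(letters[1:5]),   # factor with text levels
  num       = as.numeric(1:5),        # plain numeric
  stringsAsFactors = FALSE
)

d
modes   <- sapply(d, mode)    # storage mode of each column
classes <- sapply(d, class)   # class of each column
modes
classes
```

Note that both factor columns report mode "numeric", because a factor is stored as integer codes plus a set of levels.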
Now you probably ask yourself "Where's an anomaly?" Well, I've bumped into quite peculiar things in R, and this is not the most confounding thing, but it can confuse you, especially if you read this before rolling into bed.
Here goes: the first two columns are character. I've deliberately called the 2nd one fake_char. Spot the similarity of this character variable with the one that Dirk created in his reply. It's actually a numerical vector converted to character. The 3rd and 4th columns are factor, and the last one is "purely" numeric.
If you use the transform function, you can convert fake_char into numeric, but not the char variable itself.
But if you do the same thing on fake_char and char_fac, you'll be lucky and get away with no NAs:
If you save the transformed data.frame and check for mode and class, you'll get:
So, the conclusion is: yes, you can convert a character vector into a numeric one, but only if its elements are "convertible" to numeric. If there's even one non-numeric element in the vector, that element becomes NA (with a coercion warning) when you try to convert the vector to numeric.
And just to prove my point:
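A small sketch of that point, with values chosen for illustration. The second half replays the transform() calls described above on a cut-down version of the dummy data frame; `d`, `bad` and `good` are names assumed for this example.

```r
# A character vector in which only some elements look like numbers.
err <- c(1, "b", 3, 4, "e")
mode(err)    # "character": c() coerces everything to a common type

# Non-numeric elements become NA; R warns instead of stopping.
converted <- suppressWarnings(as.numeric(err))
converted    # 1 NA 3 4 NA

# The same effect through transform() on a data frame.
d <- data.frame(
  char      = letters[1:5],
  fake_char = as.character(1:5),
  char_fac  = factor(letters[1:5]),
  stringsAsFactors = FALSE
)

# Genuine text cannot be converted: the whole column turns into NA.
bad <- suppressWarnings(transform(d, char = as.numeric(char)))

# Numbers stored as text convert cleanly; the factor yields its codes.
good <- transform(d,
                  fake_char = as.numeric(fake_char),
                  char_fac  = as.numeric(char_fac))
sapply(good, class)
```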
And now, just for fun (or practice), try to guess the output of these commands:
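The commands were presumably along these lines; this is a reconstruction under that assumption, reusing the hypothetical `err` vector. Run it to check your guess. As a hint, factor levels are sorted, and as.numeric() on a factor returns the position of each element's level.

```r
err <- c(1, "b", 3, 4, "e")

fac <- as.factor(err)    # levels are the sorted unique strings
fac

num <- as.numeric(fac)   # integer codes of the levels, not the labels
num
```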
Kind regards to Patrick Burns! =)
With the following code you can convert all data frame columns to numeric (X is the data frame whose columns we want to convert):
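One way to write this, as a sketch and not necessarily the call originally posted. It uses lapply(), which another answer here reports as working where apply and sapply did not; the contents of `X` are hypothetical.

```r
# Hypothetical data frame mixing a factor and a character column.
X <- data.frame(
  a = factor(c("10", "20", "30")),
  b = c("1.5", "2.5", "3.5"),
  stringsAsFactors = FALSE
)

# Going through as.character() first protects factor columns from being
# replaced by their integer codes. X[] keeps the data frame structure.
X[] <- lapply(X, function(col) as.numeric(as.character(col)))

str(X)
```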
To convert a whole matrix to numeric, you have two options. Either:
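A sketch of one such option, assuming a character matrix `M` (a hypothetical name) whose entries all look like numbers:

```r
# Character matrix holding numbers as text.
M <- matrix(c("1", "2", "3", "4"), nrow = 2)

# Changing the mode converts every element and keeps the dimensions.
mode(M) <- "numeric"
M
```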
or:
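The second option did not survive in this copy, so the following is an assumption: `storage.mode<-` is a base R alternative that has the same effect on a hypothetical character matrix `M`.

```r
# Character matrix holding numbers as text.
M <- matrix(c("1", "2", "3", "4"), nrow = 2)

# Change the storage mode in place; dimensions are preserved.
storage.mode(M) <- "double"
M
```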
Alternatively, you can use the data.matrix function to convert everything into numeric, although be aware that factors might not get converted correctly (they are replaced by their integer codes), so it is safer to convert everything to character first:
I usually use this last one if I want to convert to matrix and numeric simultaneously.
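The difference can be seen in a small sketch. The data frame `df` is hypothetical, and the per-column as.numeric(as.character()) call is one way to do the "character first" step, not necessarily the exact call the author used.

```r
# Hypothetical data frame with a factor whose labels are numbers.
df <- data.frame(
  a = factor(c("10", "20", "30")),
  b = c(1.5, 2.5, 3.5)
)

# data.matrix() replaces the factor by its integer codes.
codes <- data.matrix(df)
codes[, "a"]     # 1 2 3, not 10 20 30

# Converting each column via character keeps the real values and still
# returns a numeric matrix.
values <- sapply(df, function(col) as.numeric(as.character(col)))
values[, "a"]    # 10 20 30
```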
If you run into problems with:
Take a look at your decimal marks. If they are "," instead of "." (e.g. "5,3"), the above won't work.
A potential solution is:
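A sketch of both the problem and one possible fix, assuming a character vector `x` (a hypothetical name) that uses commas as decimal marks:

```r
x <- c("5,3", "2,7")

# With "," as the decimal mark, the strings are not recognised as numbers.
bad <- suppressWarnings(as.numeric(x))
bad      # NA NA

# Replace the comma by a dot before converting.
fixed <- as.numeric(gsub(",", ".", x, fixed = TRUE))
fixed    # 5.3 2.7
```

If the data comes from a text file, the `dec = ","` argument of read.table() handles this while reading.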
I believe the comma decimal mark is quite common in some non-English-speaking countries.
Tim is correct, and Shane has an omission. Here are additional examples:
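An example along these lines, as a sketch; the column names `a`, `num` and `numchr` are assumptions made here:

```r
# A factor column whose labels are the numbers 10 to 15.
df <- data.frame(a = factor(10:15))

# Wrong: as.numeric() on a factor returns the integer codes 1..6.
df$num <- as.numeric(df$a)

# Right: go through character to recover the values 10..15.
df$numchr <- as.numeric(as.character(df$a))

df
summary(df)
```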
Our data.frame now has a summary of the factor column (counts), a numeric summary of the as.numeric() column (which is wrong, because it picked up the integer factor codes), and the (correct) summary of the as.numeric(as.character()) column.