Why does an empty dataframe fail an is.null() test

Posted 2019-03-12 05:32

Question:

Please excuse me if my question is quite basic. I created an empty data frame with df <- data.frame(), so I expected it to be NULL (empty). But when I check it with is.null(df), the result is FALSE. Is there a difference between NULL and empty in R? If the data frame is not NULL, what does the empty data frame contain, and when would it be NULL? Thanks.

Answer 1:

df is not NULL because it is a data frame and thus has some defined properties. For instance, it has a class. You can also get the number of rows with nrow(df), even if the result happens to be zero, so the number of rows is well-defined too.
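A minimal sketch of the difference, runnable in a fresh R session. The variable x is only there for comparison and is not part of the question.

```r
df <- data.frame()

is.null(df)    # FALSE: df is an object, just one without rows or columns
class(df)      # "data.frame"
nrow(df)       # 0
ncol(df)       # 0

x <- NULL
is.null(x)     # TRUE
length(x)      # 0
```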

As far as I know, there is no is.empty function in base R. What you could do is, e.g., the following:

is.data.frame(df) && nrow(df)==0

This gives TRUE for an empty data frame (that is, one with no rows) and FALSE otherwise.

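The reason for checking is.data.frame first is that nrow is not meaningful for every object: for something without dimensions, such as a plain vector or NULL, nrow returns NULL, so nrow(x)==0 would yield a zero-length logical instead of TRUE or FALSE, which causes an error when used as a condition. Thanks to &&, nrow(df) is only evaluated if df is a data frame.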
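If you need this check in several places, you could wrap it in a small function. This is only a sketch: the name is_empty_df is made up here and is not a base R function.

```r
# Hypothetical helper, not part of base R.
is_empty_df <- function(x) {
  # && short-circuits: nrow(x) is evaluated only when x is a data frame
  is.data.frame(x) && nrow(x) == 0
}

is_empty_df(data.frame())          # TRUE
is_empty_df(data.frame(a = 1:3))   # FALSE
is_empty_df(NULL)                  # FALSE
is_empty_df(1:3)                   # FALSE
```

Note that a data frame with columns but zero rows, such as data.frame(a = character(0)), also counts as empty under this definition.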



Answer 2:

data.frame() creates an object of class data.frame. Because the object exists, is.null returns FALSE. NULL, by contrast, has no class attribute and no contents.



Answer 3:

The answers above are correct: is.na and is.null do not detect empty values in R. If by empty you mean empty strings ('') inside the data frame 'df', this is what I would do to count them.

df[df == ''] <- NA # this replaces the empty strings in df with NA.

sum(is.na(df)) # counts the NA cells in 'df', i.e. the formerly empty values plus any NAs that were already there.
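Here is a small worked example of that idea. The data is made up for illustration, and the empty strings are also counted directly, without converting them to NA first.

```r
# Made-up example data, for illustration only
df <- data.frame(
  name = c("a", "", "c"),
  city = c("", "", "z"),
  stringsAsFactors = FALSE
)

# Count cells equal to the empty string; na.rm = TRUE skips existing NAs
n_empty <- sum(df == "", na.rm = TRUE)
n_empty                    # 3

df[df == ""] <- NA         # turn the empty strings into NA
n_na <- sum(is.na(df))
n_na                       # 3
```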

Hope this is helpful.