Question:

I am trying to combine two dataframes with different numbers of columns and different column headers. However, after I combine them using rbind.fill(), the resulting data frame has the empty cells filled with NA.

This is very inconvenient since one of the columns has data that is also represented as "NA" (for North America), so when I export it to a CSV, the spreadsheet can't tell them apart.

Is there a way for me to:

1. Use the rbind.fill function without having it populate the empty cells with NA

2. Change the column to replace the NA values*

*I've scoured the blogs, and have tried the two most popular solutions:

df$col[is.na(df$col)] <- 0  # it does not work
df$col = ifelse(is.na(df$col), "X", df$col)  # it changes all the characters to numbers, and ruins the column

Let me know if you have any advice! I (unfortunately) cannot share the df, but will be willing to answer any questions!

Answer 1:

NA is not the same as "NA" to R, but might be interpreted as such by your favourite spreadsheet program. NA is a special value in R just like NaN (not a number). If I understand correctly, one of your solutions is to replace the "NA" values in the column representing North America with something else, in which case you should just be able to do...

df$col[ df$col == "NA" ] <- "NorthAmerica"

This is assuming that your "NA" values are actually character strings. is.na() returns FALSE for the character string "NA", so df$col[ is.na(df$col) ] <- 0 leaves those strings untouched, which is why it appears not to work.

An example of the difference between NA and "NA":

x <- c( 1, 2, 3 , "NA" , 4 , 5 , NA )

> x[ !is.na(x) ]
[1] "1"  "2"  "3"  "NA" "4"  "5"

> x[ x == "NA" & !is.na(x) ]
[1] "NA"
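
The question also mentions that the ifelse() attempt "changes all the characters to numbers". One possible cause, and this is only an assumption since the data frame was not shared, is that the column is a factor rather than a character vector. The sketch below uses a made-up vector named col to show that behaviour and one way around it:

```r
# Hypothetical stand-in for the asker's column: a factor holding the
# string "NA" (North America), "SA", and one genuinely missing value.
col <- factor(c("NA", "SA", NA))

# ifelse() drops the factor attributes, so the labels of the non-missing
# entries are replaced by their internal integer codes.
broken <- ifelse(is.na(col), "X", col)

# Converting to character first keeps the labels intact, and only the
# real missing value is replaced.
fixed <- as.character(col)
fixed[is.na(fixed)] <- "X"
fixed
```

If the column is already a character vector, the as.character() call is harmless and the replacement works the same way.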

Method to resolve this

I think you want to leave "NA" and any NAs as they are in the first df, but make all NA in the second df formed from rbind.fill() change to something like "NotAvailable". You can accomplish this like so...

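library(plyr)  # provides rbind.fill()

# "NA" in df1$col stands for North America; it is not a missing value
df1 <- data.frame( col = rep( "NA" , 6 ) , x = 1:6 , z = rep( 1 , 6 ) )
df2 <- data.frame( col = rep( "SA" , 2 ) , x = 1:2 , y = 5:6 )
df <- rbind.fill( df1 , df2 )
# Keep the columns that exist in df2 and replace missing values only there
temp <- df [ (colnames(df) %in% colnames(df2)) ]
temp[ is.na( temp ) ] <- "NotAvailable"
# Reattach the remaining columns (here z), whose NA values are left as they are
res <- cbind( temp , df[ !( colnames(df) %in% colnames(df2) ) ] )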

#df has real NA values in column z and column y. We just want to get rid of y's
df

#     col x  z  y
#   1  NA 1  1 NA
#   2  NA 2  1 NA
#   3  NA 3  1 NA
#   4  NA 4  1 NA
#   5  NA 5  1 NA
#   6  NA 6  1 NA
#   7  SA 1 NA  5
#   8  SA 2 NA  6

#res has "NA" strings in col representing "North America" and NA values in z, whilst those in y have been removed
#More generally, any NA in df1 will be left 'as-is', whilst NA from df2 formed using rbind.fill will be converted to character string "NotAvailable"
res

#     col x            y  z
#   1  NA 1 NotAvailable  1
#   2  NA 2 NotAvailable  1
#   3  NA 3 NotAvailable  1
#   4  NA 4 NotAvailable  1
#   5  NA 5 NotAvailable  1
#   6  NA 6 NotAvailable  1
#   7  SA 1            5 NA
#   8  SA 2            6 NA
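
If the only goal is for the spreadsheet to tell the two apart, another option may be to leave the data frame alone and control how missing values are written. This is a minimal sketch, assuming the file is produced with write.csv(); the data frame regions and its columns are made up for illustration:

```r
# Hypothetical data: "NA" in region is a code for North America,
# while the NA in y is a real missing value.
regions <- data.frame(region = c("NA", "SA"), y = c(NA, 5L),
                      stringsAsFactors = FALSE)

# The `na` argument sets the text written for missing values. With
# na = "" they become empty fields, while the string "NA" is still
# written out as ordinary text.
path <- tempfile(fileext = ".csv")
write.csv(regions, path, na = "", row.names = FALSE)

# Read the raw lines back to inspect what was written.
lines <- readLines(path)
lines
```

Whether the spreadsheet then shows the remaining "NA" text as intended depends on its import settings, so it is worth checking the result in the program you actually use.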