Loading data issues

Posted 2019-03-05 11:04

Datalink: Data

Code:

 ccfsisims <- read.csv(file = "F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/GTAP-CGE/GTAP_NewAggDatabase/NewFiles/GTAP_ConsIndex.csv", header=TRUE, sep=",", na.string="NA", dec=".", strip.white=TRUE)
 ccfsirsts <- as.data.frame(ccfsisims)
 ccfsirsts[7:25] <- sapply(ccfsirsts[7:25],as.numeric)
 ccfsirsts <- droplevels(ccfsirsts)
 ccfsirsts <- transform(ccfsirsts,sres=factor(sres,levels=unique(sres)))
 ccfsirsts[1:5,]

Issue:

So, if you check the column "pSVIPM", the values displayed in the data frame "ccfsirsts" are different from what is actually saved in the .csv file. This problem occurred when loading a different set of data.

In the initial upload, i.e. "ccfsisims", everything seems to check out. It is afterward that the problem occurs.

Any thoughts on why this happens?

1 answer
劫难
#2 · 2019-03-05 11:19

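After you load ccfsisims, run str(ccfsisims) (get in the habit of doing this).

You will see that pSVIPM is a factor, so as.numeric does not convert the text of each value. It simply returns the integer codes that index the factor's levels, which is why the numbers in ccfsirsts no longer match the file.

The column is read as a factor because, if you look at your csv, it contains #DIV/0! strings, so R cannot treat the column as numeric.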
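To make the effect concrete, here is a minimal sketch using a made-up factor (the values are hypothetical and not taken from the original CSV). It compares as.numeric applied directly to a factor with the usual as.numeric(as.character(...)) idiom. Note that in R 4.0.0 and later read.csv no longer creates factors by default, so such a column would come in as character instead.

```r
# Hypothetical toy values standing in for the pSVIPM column.
x <- factor(c("10.5", "#DIV/0!", "2.25", "10.5"))

# Applied to a factor, as.numeric returns the level codes (1, 2, 3, ...),
# not the numbers spelled out in the labels.
codes <- as.numeric(x)

# Converting to character first recovers the real values;
# "#DIV/0!" cannot be parsed, so it becomes NA (R warns about this).
values <- suppressWarnings(as.numeric(as.character(x)))

print(levels(x))
print(codes)
print(values)
```

For a column already loaded as a factor, the second form is the one that preserves the original numbers.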

Try it yourself:

> length(ccfsisims$pSVIPM[ccfsisims$pSVIPM == "#DIV/0!"])
[1] 350
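One possible fix, sketched under the assumption that the #DIV/0! cells should be treated as missing, is to declare them as NA when reading the file. The inline CSV text below is a hypothetical stand-in for GTAP_ConsIndex.csv so that the example runs on its own; with the real file you would pass file = ... instead of text = ....

```r
# Hypothetical CSV content; only the column names sres and pSVIPM
# come from the original post, the values are invented.
csv_text <- "sres,pSVIPM
a,1.5
b,#DIV/0!
c,3.25"

# Listing "#DIV/0!" in na.strings makes read.csv treat it as missing,
# so the rest of the column is parsed as numeric.
fixed <- read.csv(text = csv_text, header = TRUE, sep = ",",
                  na.strings = c("NA", "#DIV/0!"),
                  dec = ".", strip.white = TRUE)

str(fixed)
```

With the column read as numeric from the start, the later sapply(..., as.numeric) step has nothing left to distort.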