Converting R Column names into id variables

Posted 2020-03-29 06:44

Question:

I'm quite confused and haven't even been able to search for what I'm looking for. I have a multi-year survey on different countries, which is currently like this:

        Question  Year  CountryA  CountryB  ...  CountryZ
               1  1999       Yes        No  ...        No
               2  1999       Yes       Yes  ...       Yes

That is, it's currently organized by question. I want to have the data arranged by country, year and question number as such:

Country  Year  Question  Answer
      A  1999         1     Yes
      A  1999         2     Yes
      B  1999         1      No
      B  1999         2     Yes

And so on. Is this even possible? I can't seem to find anything to guide me to the right answer.
Thanks in advance!

Answer 1:

The most direct approach is to use melt from "reshape2". Assuming your data.frame is called "mydf":

> library(reshape2)
> melt(mydf, id.vars=1:2)
  Question Year variable value
1        1 1999 CountryA   Yes
2        2 1999 CountryA   Yes
3        1 1999 CountryB    No
4        2 1999 CountryB   Yes
5        1 1999 CountryZ    No
6        2 1999 CountryZ   Yes

Update

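I haven't found a clean way to control the column names that base R's reshape produces, but you can insert a "." into the wide column names so that reshape can split them into a variable name and a country code, and then rename the result with setNames: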

names(mydf) <- sub("Country", "Country.", names(mydf))
setNames(
  reshape(mydf, direction="long", idvar=1:2, varying=3:ncol(mydf)),
  c("Question", "Year", "Country", "Answer"))
#          Question Year Country Answer
# 1.1999.A        1 1999       A    Yes
# 2.1999.A        2 1999       A    Yes
# 1.1999.B        1 1999       B     No
# 2.1999.B        2 1999       B    Yes
# 1.1999.Z        1 1999       Z     No
# 2.1999.Z        2 1999       Z    Yes

Where:

mydf <- structure(list(Question = 1:2, Year = c(1999L, 1999L), CountryA = c("Yes", 
  "Yes"), CountryB = c("No", "Yes"), CountryZ = c("No", "Yes")), .Names = c("Question", 
  "Year", "CountryA", "CountryB", "CountryZ"), class = "data.frame", row.names = c(NA, -2L))
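
If the renaming step feels awkward, one possible alternative is to tell reshape explicitly what the long columns should be called via v.names, timevar and times. This is only a sketch on the same sample data; the names country_cols and long are illustrative and not part of the answer above.

```r
# Same sample data as above, rebuilt so the example is self-contained
mydf <- data.frame(Question = 1:2, Year = c(1999L, 1999L),
                   CountryA = c("Yes", "Yes"),
                   CountryB = c("No", "Yes"),
                   CountryZ = c("No", "Yes"),
                   stringsAsFactors = FALSE)

# Every column after Question and Year holds one country's answers
country_cols <- names(mydf)[-(1:2)]

long <- reshape(mydf, direction = "long",
                varying = country_cols,                     # wide columns to stack
                v.names = "Answer",                         # name of the value column
                timevar = "Country",                        # name of the key column
                times   = sub("^Country", "", country_cols), # key values: A, B, Z
                idvar   = c("Question", "Year"))

# Columns come out as Question, Year, Country, Answer
long
```

With this form the original column names stay untouched, and no setNames call is needed afterwards.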


Answer 2:

Following @Ananda's method, but naming the output columns directly in the call to melt:

library(reshape2)  # provides melt()
# Sample data; read.table splits on whitespace by default
DF <- read.table(text="Question Year CountryA CountryB CountryZ
1 1999 Yes No No
2 1999 Yes Yes Yes", header=TRUE)

> DF
  Question Year CountryA CountryB CountryZ
1        1 1999      Yes       No       No
2        2 1999      Yes      Yes      Yes

DF <- melt(DF, id.vars=1:2, value.name="Answer", variable.name="Country")

> DF
  Question Year  Country Answer
1        1 1999 CountryA    Yes
2        2 1999 CountryA    Yes
3        1 1999 CountryB     No
4        2 1999 CountryB    Yes
5        1 1999 CountryZ     No
6        2 1999 CountryZ    Yes

Then it's just a matter of changing the levels of the Country column, for example stripping the "Country" prefix so that they read A, B and Z.
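
A minimal sketch of that last step, assuming Country is a factor as melt returns it. The melted data frame is rebuilt by hand here so the snippet runs without reshape2; the final reordering to match the layout asked for in the question is an addition for illustration, not part of the answer above.

```r
# Melted result from above, rebuilt by hand (Year is recycled to all rows)
DF <- data.frame(
  Question = c(1L, 2L, 1L, 2L, 1L, 2L),
  Year     = 1999L,
  Country  = factor(rep(c("CountryA", "CountryB", "CountryZ"), each = 2)),
  Answer   = c("Yes", "Yes", "No", "Yes", "No", "Yes")
)

# Drop the "Country" prefix from the factor levels: CountryA becomes A, and so on
levels(DF$Country) <- sub("^Country", "", levels(DF$Country))

# Sort by country, year and question, and put the columns in the requested order
DF <- DF[order(DF$Country, DF$Year, DF$Question),
         c("Country", "Year", "Question", "Answer")]
DF
```

If Country were a character column instead, the same sub() call could be applied to the column itself rather than to its levels.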



Tags: r reshape