Let's say I have a data table called YC
that looks like this:
Categories: colsums: tillTF:
ID: cat NA 0
MA NA 0
spayed NA 0
declawed NA 0
black NA 0
3 NA 0
no 57 1
claws NA 0
calico NA 0
4 NA 0
no 42 1
striped NA 0
0.5 NA 0
yes 84 1
not fixed NA 0
declawed NA 0
black NA 0
0.2 NA 0
yes 19 1
0.2 NA 0
yes 104 1
NH NA 0
spayed NA 0
claws NA 0
striped NA 0
12 NA 0
no 17 1
black NA 0
4 NA 0
yes 65 1
ID: DOG NA 0
MA NA 0
...
Only it's 1) not actually pivot table, it's inconsistently formatted to look like one and 2) the data is much more complicated, and was entered inconstantly over the course of a few decades. The only assumption that can be safely made about the data is that there are 12 variables associated with each record, and they are always entered in the same order.
My goal is to parse this data so that each attribute and associated numeric record are in in appropriate columns in a single row, like this:
Cat MA spayed declawed black 3 no 57
Cat MA spayed claws calico 0.5 no 42
Cat MA not fixed declawed black 0.2 yes 19
Cat MA not fixed declawed black 0.2 yes 104
Cat NH spayed claws striped 12 no 17
Cat NH spayed claws black 4 yes 65
Dog MA ....
I've written a for loop which identifies a "record" and then re-writes values in an array by reading backwards up the column in the data table until another "record" is reached. I'm new to R, and so wrote out my ideal loop without knowing whether it was possible.
array<-rep(0, length(7))
for (i in 1:7)
if(YC$tillTF[i]==1){
array[7]<-(YC$colsums[i])
array[6]<-(YC$Categories[i])
array[5]<-(YC$Categories[i-1])
array[4]<-(YC$Categories[i-2])
array[3]<-(YC$Categories[i-3])
array[2]<-(YC$Categories[i-4])
array[1]<-(YC$Categories[i-5])
}
YC_NT<-rbind(array)
Once array
is filled in, I want to loop through YC
and create a new row in YC_NT
for each unique record:
for (i in 8:length(YC$tillTF))
if (YC$tillTF[i]==1){
array[8]<-(YC$colsums[i])
array[7]<-(YC$Categories[i])
if (YC$tillTF[i-1]==0){
array[6]<-YC$Categories[i-1]
}else{
rbind(array, YC_NT)}
if (YC$tillTF[i-2]==0){
array[5]<-YC$Categories[i-2]
}else{
rbind(array, YC_NT)}
if(YC$tillTF[i-3]==0){
array[4]<-YC$Categories[i-3]
}else{
rbind(array, YC_NT)}
if(YC$tillTF[i-4]==0){
array[3]<-YC$Categories[i-4]
}else{
rbind(array, YC_NT)}
if(YC$tillTF[i-5]==0){
array[2]<-YC$Categories[i-5]
}else{
rbind(array, YC_NT)}
if(YC$tillTF[i-6]==0){
array[1]<-YC$Categories[i-6]
}else{
rbind(array, YC_NT)}
}else{
array<-array}
When I run this loop within a function on my data, I'm getting my YC_NT
data table back containing a single row. After spending a few days searching, I don't know that there is an R function which would be able to add the vector array
to last row of a data table without giving it a unique name every time. My questions:
1) Is there a function that would add a vector called array
to the end of a data table without re-writing a previous row called array
?
2) If no such function exists, how could I create a new name for array
every time my for loop reached a new numeric record?
Thanks for your help,