Merging data - Error in fix.by(by.x, x)

Posted 2019-06-08 14:31

I am trying to merge data in R, as suggested in an answer to my other post, but I get an error.

First, let me explain what I am trying to do. I have 100 files, each with an x_i and a y_i column, and I want to merge them like this:

from:

x1; y1  ; x2 ; y2
1 ; 100 ; 1  ; 150
4 ; 90  ; 2  ; 85
7 ; 85  ; 10 ; 60
10; 80  ;

to

x1; y1  ; x2 ; y2
1 ; 100 ; 1  ; 150
2 ; 100 ; 2  ; 85
4 ; 90  ; 4  ; 85
7 ; 85  ; 7  ; 85
10; 80  ;10 ; 60

The simple script works fine on the toy example:

xx <- read.table(text='x1; y1  ; x2 ; y2
1 ; 100 ; 1  ; 150
4 ; 90  ; 2  ; 85
7 ; 85  ; 10 ; 60
10; 80  ;',sep=';',fill=TRUE,header=TRUE)

dm <- merge(xx[,1:2],xx[,3:4],by=1,all=T)
dm <- dm[!is.na(dm$x1),]
dm$y1 <- zoo::na.locf(dm$y1)
dm$y2 <- zoo::na.locf(dm$y2)
dm
  x1  y1  y2
1  1 100 150
2  2 100  85
3  4  90  85
4  7  85  85
5 10  80  60

Now for my real data. I modified the script as follows, but I get an error:

library(zoo)

data1 = read.table("rundata1", sep=" ", col.names=c("tm1","score1","current1"))
data2 = read.table("rundata2", sep=" ", col.names=c("tm2","score2","current2"))
data3 = read.table("rundata3", sep=" ", col.names=c("tm3","score3","current3"))
data4 = read.table("rundata4", sep=" ", col.names=c("tm4","score4","current4"))
data5 = read.table("rundata5", sep=" ", col.names=c("tm5","score5","current5"))
data6 = read.table("rundata6", sep=" ", col.names=c("tm6","score6","current6"))
data7 = read.table("rundata7", sep=" ", col.names=c("tm7","score7","current7"))
data8 = read.table("rundata8", sep=" ", col.names=c("tm8","score8","current8"))
data9 = read.table("rundata9", sep=" ", col.names=c("tm9","score9","current9"))
data10 = read.table("rundata10", sep=" ", col.names=c("tm10","score10","current10"))
data11 = read.table("rundata11", sep=" ", col.names=c("tm11","score11","current11"))
data12 = read.table("rundata12", sep=" ", col.names=c("tm12","score12","current12"))
data13 = read.table("rundata13", sep=" ", col.names=c("tm13","score13","current13"))
data14 = read.table("rundata14", sep=" ", col.names=c("tm14","score14","current14"))
data15 = read.table("rundata15", sep=" ", col.names=c("tm15","score15","current15"))
data16 = read.table("rundata16", sep=" ", col.names=c("tm16","score16","current16"))
data17 = read.table("rundata17", sep=" ", col.names=c("tm17","score17","current17"))
data18 = read.table("rundata18", sep=" ", col.names=c("tm18","score18","current18"))
data19 = read.table("rundata19", sep=" ", col.names=c("tm19","score19","current19"))
data20 = read.table("rundata20", sep=" ", col.names=c("tm20","score20","current20"))
data21 = read.table("rundata21", sep=" ", col.names=c("tm21","score21","current21"))
data22 = read.table("rundata22", sep=" ", col.names=c("tm22","score22","current22"))
data23 = read.table("rundata23", sep=" ", col.names=c("tm23","score23","current23"))
data24 = read.table("rundata24", sep=" ", col.names=c("tm24","score24","current24"))
data25 = read.table("rundata25", sep=" ", col.names=c("tm25","score25","current25"))
data26 = read.table("rundata26", sep=" ", col.names=c("tm26","score26","current26"))
data27 = read.table("rundata27", sep=" ", col.names=c("tm27","score27","current27"))
data28 = read.table("rundata28", sep=" ", col.names=c("tm28","score28","current28"))
data29 = read.table("rundata29", sep=" ", col.names=c("tm29","score29","current29"))
data30 = read.table("rundata30", sep=" ", col.names=c("tm30","score30","current30"))
data31 = read.table("rundata31", sep=" ", col.names=c("tm31","score31","current31"))
data32 = read.table("rundata32", sep=" ", col.names=c("tm32","score32","current32"))
data33 = read.table("rundata33", sep=" ", col.names=c("tm33","score33","current33"))
data34 = read.table("rundata34", sep=" ", col.names=c("tm34","score34","current34"))
data35 = read.table("rundata35", sep=" ", col.names=c("tm35","score35","current35"))
data36 = read.table("rundata36", sep=" ", col.names=c("tm36","score36","current36"))
data37 = read.table("rundata37", sep=" ", col.names=c("tm37","score37","current37"))
data38 = read.table("rundata38", sep=" ", col.names=c("tm38","score38","current38"))
data39 = read.table("rundata39", sep=" ", col.names=c("tm39","score39","current39"))
data40 = read.table("rundata40", sep=" ", col.names=c("tm40","score40","current40"))
data41 = read.table("rundata41", sep=" ", col.names=c("tm41","score41","current41"))
data42 = read.table("rundata42", sep=" ", col.names=c("tm42","score42","current42"))
data43 = read.table("rundata43", sep=" ", col.names=c("tm43","score43","current43"))
data44 = read.table("rundata44", sep=" ", col.names=c("tm44","score44","current44"))
data45 = read.table("rundata45", sep=" ", col.names=c("tm45","score45","current45"))
data46 = read.table("rundata46", sep=" ", col.names=c("tm46","score46","current46"))
data47 = read.table("rundata47", sep=" ", col.names=c("tm47","score47","current47"))
data48 = read.table("rundata48", sep=" ", col.names=c("tm48","score48","current48"))
data49 = read.table("rundata49", sep=" ", col.names=c("tm49","score49","current49"))
data50 = read.table("rundata50", sep=" ", col.names=c("tm50","score50","current50"))
data51 = read.table("rundata51", sep=" ", col.names=c("tm51","score51","current51"))
data52 = read.table("rundata52", sep=" ", col.names=c("tm52","score52","current52"))
data53 = read.table("rundata53", sep=" ", col.names=c("tm53","score53","current53"))
data54 = read.table("rundata54", sep=" ", col.names=c("tm54","score54","current54"))
data55 = read.table("rundata55", sep=" ", col.names=c("tm55","score55","current55"))
data56 = read.table("rundata56", sep=" ", col.names=c("tm56","score56","current56"))
data57 = read.table("rundata57", sep=" ", col.names=c("tm57","score57","current57"))
data58 = read.table("rundata58", sep=" ", col.names=c("tm58","score58","current58"))
data59 = read.table("rundata59", sep=" ", col.names=c("tm59","score59","current59"))
data60 = read.table("rundata60", sep=" ", col.names=c("tm60","score60","current60"))
data61 = read.table("rundata61", sep=" ", col.names=c("tm61","score61","current61"))
data62 = read.table("rundata62", sep=" ", col.names=c("tm62","score62","current62"))
data63 = read.table("rundata63", sep=" ", col.names=c("tm63","score63","current63"))
data64 = read.table("rundata64", sep=" ", col.names=c("tm64","score64","current64"))
data65 = read.table("rundata65", sep=" ", col.names=c("tm65","score65","current65"))
data66 = read.table("rundata66", sep=" ", col.names=c("tm66","score66","current66"))
data67 = read.table("rundata67", sep=" ", col.names=c("tm67","score67","current67"))
data68 = read.table("rundata68", sep=" ", col.names=c("tm68","score68","current68"))
data69 = read.table("rundata69", sep=" ", col.names=c("tm69","score69","current69"))
data70 = read.table("rundata70", sep=" ", col.names=c("tm70","score70","current70"))
data71 = read.table("rundata71", sep=" ", col.names=c("tm71","score71","current71"))
data72 = read.table("rundata72", sep=" ", col.names=c("tm72","score72","current72"))
data73 = read.table("rundata73", sep=" ", col.names=c("tm73","score73","current73"))
data74 = read.table("rundata74", sep=" ", col.names=c("tm74","score74","current74"))
data75 = read.table("rundata75", sep=" ", col.names=c("tm75","score75","current75"))
data76 = read.table("rundata76", sep=" ", col.names=c("tm76","score76","current76"))
data77 = read.table("rundata77", sep=" ", col.names=c("tm77","score77","current77"))
data78 = read.table("rundata78", sep=" ", col.names=c("tm78","score78","current78"))
data79 = read.table("rundata79", sep=" ", col.names=c("tm79","score79","current79"))
data80 = read.table("rundata80", sep=" ", col.names=c("tm80","score80","current80"))
data81 = read.table("rundata81", sep=" ", col.names=c("tm81","score81","current81"))
data82 = read.table("rundata82", sep=" ", col.names=c("tm82","score82","current82"))
data83 = read.table("rundata83", sep=" ", col.names=c("tm83","score83","current83"))
data84 = read.table("rundata84", sep=" ", col.names=c("tm84","score84","current84"))
data85 = read.table("rundata85", sep=" ", col.names=c("tm85","score85","current85"))
data86 = read.table("rundata86", sep=" ", col.names=c("tm86","score86","current86"))
data87 = read.table("rundata87", sep=" ", col.names=c("tm87","score87","current87"))
data88 = read.table("rundata88", sep=" ", col.names=c("tm88","score88","current88"))
data89 = read.table("rundata89", sep=" ", col.names=c("tm89","score89","current89"))
data90 = read.table("rundata90", sep=" ", col.names=c("tm90","score90","current90"))
data91 = read.table("rundata91", sep=" ", col.names=c("tm91","score91","current91"))
data92 = read.table("rundata92", sep=" ", col.names=c("tm92","score92","current92"))
data93 = read.table("rundata93", sep=" ", col.names=c("tm93","score93","current93"))
data94 = read.table("rundata94", sep=" ", col.names=c("tm94","score94","current94"))
data95 = read.table("rundata95", sep=" ", col.names=c("tm95","score95","current95"))
data96 = read.table("rundata96", sep=" ", col.names=c("tm96","score96","current96"))
data97 = read.table("rundata97", sep=" ", col.names=c("tm97","score97","current97"))
data98 = read.table("rundata98", sep=" ", col.names=c("tm98","score98","current98"))
data99 = read.table("rundata99", sep=" ", col.names=c("tm99","score99","current99"))
data100 = read.table("rundata100", sep=" ", col.names=c("tm100","score100","current100"))

-> works fine

newdata<- merge(    
data1   [,1:2],
data2   [,1:2],
data3   [,1:2],
data4   [,1:2],
data5   [,1:2],
data6   [,1:2],
data7   [,1:2],
data8   [,1:2],
data9   [,1:2],
data10  [,1:2],
data11  [,1:2],
data12  [,1:2],
data13  [,1:2],
data14  [,1:2],
data15  [,1:2],
data16  [,1:2],
data17  [,1:2],
data18  [,1:2],
data19  [,1:2],
data20  [,1:2],
data21  [,1:2],
data22  [,1:2],
data23  [,1:2],
data24  [,1:2],
data25  [,1:2],
data26  [,1:2],
data27  [,1:2],
data28  [,1:2],
data29  [,1:2],
data30  [,1:2],
data31  [,1:2],
data32  [,1:2],
data33  [,1:2],
data34  [,1:2],
data35  [,1:2],
data36  [,1:2],
data37  [,1:2],
data38  [,1:2],
data39  [,1:2],
data40  [,1:2],
data41  [,1:2],
data42  [,1:2],
data43  [,1:2],
data44  [,1:2],
data45  [,1:2],
data46  [,1:2],
data47  [,1:2],
data48  [,1:2],
data49  [,1:2],
data50  [,1:2],
data51  [,1:2],
data52  [,1:2],
data53  [,1:2],
data54  [,1:2],
data55  [,1:2],
data56  [,1:2],
data57  [,1:2],
data58  [,1:2],
data59  [,1:2],
data60  [,1:2],
data61  [,1:2],
data62  [,1:2],
data63  [,1:2],
data64  [,1:2],
data65  [,1:2],
data66  [,1:2],
data67  [,1:2],
data68  [,1:2],
data69  [,1:2],
data70  [,1:2],
data71  [,1:2],
data72  [,1:2],
data73  [,1:2],
data74  [,1:2],
data75  [,1:2],
data76  [,1:2],
data77  [,1:2],
data78  [,1:2],
data79  [,1:2],
data80  [,1:2],
data81  [,1:2],
data82  [,1:2],
data83  [,1:2],
data84  [,1:2],
data85  [,1:2],
data86  [,1:2],
data87  [,1:2],
data88  [,1:2],
data89  [,1:2],
data90  [,1:2],
data91  [,1:2],
data92  [,1:2],
data93  [,1:2],
data94  [,1:2],
data95  [,1:2],
data96  [,1:2],
data97  [,1:2],
data98  [,1:2],
data99  [,1:2],
data100 [,1:2],
by=1,all=T) 

-> gives an error:

Error in fix.by(by.x, x) : 
  'by' must specify one or more columns as numbers, names or logical

I don't understand this error, since I do specify by=1, don't I?

Rest of the script (remains to be tested on 100 inputs after I fix the first error)

newdata <- newdata[!is.na(newdata$tm1   ),]
newdata <- newdata[!is.na(newdata$tm2   ),]
newdata <- newdata[!is.na(newdata$tm3   ),]
newdata <- newdata[!is.na(newdata$tm4   ),]
newdata <- newdata[!is.na(newdata$tm5   ),]
newdata <- newdata[!is.na(newdata$tm6   ),]
newdata <- newdata[!is.na(newdata$tm7   ),]
newdata <- newdata[!is.na(newdata$tm8   ),]
newdata <- newdata[!is.na(newdata$tm9   ),]
newdata <- newdata[!is.na(newdata$tm10  ),]
newdata <- newdata[!is.na(newdata$tm11  ),]
newdata <- newdata[!is.na(newdata$tm12  ),]
newdata <- newdata[!is.na(newdata$tm13  ),]
newdata <- newdata[!is.na(newdata$tm14  ),]
newdata <- newdata[!is.na(newdata$tm15  ),]
newdata <- newdata[!is.na(newdata$tm16  ),]
newdata <- newdata[!is.na(newdata$tm17  ),]
newdata <- newdata[!is.na(newdata$tm18  ),]
newdata <- newdata[!is.na(newdata$tm19  ),]
newdata <- newdata[!is.na(newdata$tm20  ),]
newdata <- newdata[!is.na(newdata$tm21  ),]
newdata <- newdata[!is.na(newdata$tm22  ),]
newdata <- newdata[!is.na(newdata$tm23  ),]
newdata <- newdata[!is.na(newdata$tm24  ),]
newdata <- newdata[!is.na(newdata$tm25  ),]
newdata <- newdata[!is.na(newdata$tm26  ),]
newdata <- newdata[!is.na(newdata$tm27  ),]
newdata <- newdata[!is.na(newdata$tm28  ),]
newdata <- newdata[!is.na(newdata$tm29  ),]
newdata <- newdata[!is.na(newdata$tm30  ),]
newdata <- newdata[!is.na(newdata$tm31  ),]
newdata <- newdata[!is.na(newdata$tm32  ),]
newdata <- newdata[!is.na(newdata$tm33  ),]
newdata <- newdata[!is.na(newdata$tm34  ),]
newdata <- newdata[!is.na(newdata$tm35  ),]
newdata <- newdata[!is.na(newdata$tm36  ),]
newdata <- newdata[!is.na(newdata$tm37  ),]
newdata <- newdata[!is.na(newdata$tm38  ),]
newdata <- newdata[!is.na(newdata$tm39  ),]
newdata <- newdata[!is.na(newdata$tm40  ),]
newdata <- newdata[!is.na(newdata$tm41  ),]
newdata <- newdata[!is.na(newdata$tm42  ),]
newdata <- newdata[!is.na(newdata$tm43  ),]
newdata <- newdata[!is.na(newdata$tm44  ),]
newdata <- newdata[!is.na(newdata$tm45  ),]
newdata <- newdata[!is.na(newdata$tm46  ),]
newdata <- newdata[!is.na(newdata$tm47  ),]
newdata <- newdata[!is.na(newdata$tm48  ),]
newdata <- newdata[!is.na(newdata$tm49  ),]
newdata <- newdata[!is.na(newdata$tm50  ),]
newdata <- newdata[!is.na(newdata$tm51  ),]
newdata <- newdata[!is.na(newdata$tm52  ),]
newdata <- newdata[!is.na(newdata$tm53  ),]
newdata <- newdata[!is.na(newdata$tm54  ),]
newdata <- newdata[!is.na(newdata$tm55  ),]
newdata <- newdata[!is.na(newdata$tm56  ),]
newdata <- newdata[!is.na(newdata$tm57  ),]
newdata <- newdata[!is.na(newdata$tm58  ),]
newdata <- newdata[!is.na(newdata$tm59  ),]
newdata <- newdata[!is.na(newdata$tm60  ),]
newdata <- newdata[!is.na(newdata$tm61  ),]
newdata <- newdata[!is.na(newdata$tm62  ),]
newdata <- newdata[!is.na(newdata$tm63  ),]
newdata <- newdata[!is.na(newdata$tm64  ),]
newdata <- newdata[!is.na(newdata$tm65  ),]
newdata <- newdata[!is.na(newdata$tm66  ),]
newdata <- newdata[!is.na(newdata$tm67  ),]
newdata <- newdata[!is.na(newdata$tm68  ),]
newdata <- newdata[!is.na(newdata$tm69  ),]
newdata <- newdata[!is.na(newdata$tm70  ),]
newdata <- newdata[!is.na(newdata$tm71  ),]
newdata <- newdata[!is.na(newdata$tm72  ),]
newdata <- newdata[!is.na(newdata$tm73  ),]
newdata <- newdata[!is.na(newdata$tm74  ),]
newdata <- newdata[!is.na(newdata$tm75  ),]
newdata <- newdata[!is.na(newdata$tm76  ),]
newdata <- newdata[!is.na(newdata$tm77  ),]
newdata <- newdata[!is.na(newdata$tm78  ),]
newdata <- newdata[!is.na(newdata$tm79  ),]
newdata <- newdata[!is.na(newdata$tm80  ),]
newdata <- newdata[!is.na(newdata$tm81  ),]
newdata <- newdata[!is.na(newdata$tm82  ),]
newdata <- newdata[!is.na(newdata$tm83  ),]
newdata <- newdata[!is.na(newdata$tm84  ),]
newdata <- newdata[!is.na(newdata$tm85  ),]
newdata <- newdata[!is.na(newdata$tm86  ),]
newdata <- newdata[!is.na(newdata$tm87  ),]
newdata <- newdata[!is.na(newdata$tm88  ),]
newdata <- newdata[!is.na(newdata$tm89  ),]
newdata <- newdata[!is.na(newdata$tm90  ),]
newdata <- newdata[!is.na(newdata$tm91  ),]
newdata <- newdata[!is.na(newdata$tm92  ),]
newdata <- newdata[!is.na(newdata$tm93  ),]
newdata <- newdata[!is.na(newdata$tm94  ),]
newdata <- newdata[!is.na(newdata$tm95  ),]
newdata <- newdata[!is.na(newdata$tm96  ),]
newdata <- newdata[!is.na(newdata$tm97  ),]
newdata <- newdata[!is.na(newdata$tm98  ),]
newdata <- newdata[!is.na(newdata$tm99  ),]
newdata <- newdata[!is.na(newdata$tm100 ),]



newdata$score1   <- zoo::na.locf(newdata$score1 )
newdata$score2   <- zoo::na.locf(newdata$score2 )
newdata$score3   <- zoo::na.locf(newdata$score3 )
newdata$score4   <- zoo::na.locf(newdata$score4 )
newdata$score5   <- zoo::na.locf(newdata$score5 )
newdata$score6   <- zoo::na.locf(newdata$score6 )
newdata$score7   <- zoo::na.locf(newdata$score7 )
newdata$score8   <- zoo::na.locf(newdata$score8 )
newdata$score9   <- zoo::na.locf(newdata$score9 )
newdata$score10  <- zoo::na.locf(newdata$score10    )
newdata$score11  <- zoo::na.locf(newdata$score11    )
newdata$score12  <- zoo::na.locf(newdata$score12    )
newdata$score13  <- zoo::na.locf(newdata$score13    )
newdata$score14  <- zoo::na.locf(newdata$score14    )
newdata$score15  <- zoo::na.locf(newdata$score15    )
newdata$score16  <- zoo::na.locf(newdata$score16    )
newdata$score17  <- zoo::na.locf(newdata$score17    )
newdata$score18  <- zoo::na.locf(newdata$score18    )
newdata$score19  <- zoo::na.locf(newdata$score19    )
newdata$score20  <- zoo::na.locf(newdata$score20    )
newdata$score21  <- zoo::na.locf(newdata$score21    )
newdata$score22  <- zoo::na.locf(newdata$score22    )
newdata$score23  <- zoo::na.locf(newdata$score23    )
newdata$score24  <- zoo::na.locf(newdata$score24    )
newdata$score25  <- zoo::na.locf(newdata$score25    )
newdata$score26  <- zoo::na.locf(newdata$score26    )
newdata$score27  <- zoo::na.locf(newdata$score27    )
newdata$score28  <- zoo::na.locf(newdata$score28    )
newdata$score29  <- zoo::na.locf(newdata$score29    )
newdata$score30  <- zoo::na.locf(newdata$score30    )
newdata$score31  <- zoo::na.locf(newdata$score31    )
newdata$score32  <- zoo::na.locf(newdata$score32    )
newdata$score33  <- zoo::na.locf(newdata$score33    )
newdata$score34  <- zoo::na.locf(newdata$score34    )
newdata$score35  <- zoo::na.locf(newdata$score35    )
newdata$score36  <- zoo::na.locf(newdata$score36    )
newdata$score37  <- zoo::na.locf(newdata$score37    )
newdata$score38  <- zoo::na.locf(newdata$score38    )
newdata$score39  <- zoo::na.locf(newdata$score39    )
newdata$score40  <- zoo::na.locf(newdata$score40    )
newdata$score41  <- zoo::na.locf(newdata$score41    )
newdata$score42  <- zoo::na.locf(newdata$score42    )
newdata$score43  <- zoo::na.locf(newdata$score43    )
newdata$score44  <- zoo::na.locf(newdata$score44    )
newdata$score45  <- zoo::na.locf(newdata$score45    )
newdata$score46  <- zoo::na.locf(newdata$score46    )
newdata$score47  <- zoo::na.locf(newdata$score47    )
newdata$score48  <- zoo::na.locf(newdata$score48    )
newdata$score49  <- zoo::na.locf(newdata$score49    )
newdata$score50  <- zoo::na.locf(newdata$score50    )
newdata$score51  <- zoo::na.locf(newdata$score51    )
newdata$score52  <- zoo::na.locf(newdata$score52    )
newdata$score53  <- zoo::na.locf(newdata$score53    )
newdata$score54  <- zoo::na.locf(newdata$score54    )
newdata$score55  <- zoo::na.locf(newdata$score55    )
newdata$score56  <- zoo::na.locf(newdata$score56    )
newdata$score57  <- zoo::na.locf(newdata$score57    )
newdata$score58  <- zoo::na.locf(newdata$score58    )
newdata$score59  <- zoo::na.locf(newdata$score59    )
newdata$score60  <- zoo::na.locf(newdata$score60    )
newdata$score61  <- zoo::na.locf(newdata$score61    )
newdata$score62  <- zoo::na.locf(newdata$score62    )
newdata$score63  <- zoo::na.locf(newdata$score63    )
newdata$score64  <- zoo::na.locf(newdata$score64    )
newdata$score65  <- zoo::na.locf(newdata$score65    )
newdata$score66  <- zoo::na.locf(newdata$score66    )
newdata$score67  <- zoo::na.locf(newdata$score67    )
newdata$score68  <- zoo::na.locf(newdata$score68    )
newdata$score69  <- zoo::na.locf(newdata$score69    )
newdata$score70  <- zoo::na.locf(newdata$score70    )
newdata$score71  <- zoo::na.locf(newdata$score71    )
newdata$score72  <- zoo::na.locf(newdata$score72    )
newdata$score73  <- zoo::na.locf(newdata$score73    )
newdata$score74  <- zoo::na.locf(newdata$score74    )
newdata$score75  <- zoo::na.locf(newdata$score75    )
newdata$score76  <- zoo::na.locf(newdata$score76    )
newdata$score77  <- zoo::na.locf(newdata$score77    )
newdata$score78  <- zoo::na.locf(newdata$score78    )
newdata$score79  <- zoo::na.locf(newdata$score79    )
newdata$score80  <- zoo::na.locf(newdata$score80    )
newdata$score81  <- zoo::na.locf(newdata$score81    )
newdata$score82  <- zoo::na.locf(newdata$score82    )
newdata$score83  <- zoo::na.locf(newdata$score83    )
newdata$score84  <- zoo::na.locf(newdata$score84    )
newdata$score85  <- zoo::na.locf(newdata$score85    )
newdata$score86  <- zoo::na.locf(newdata$score86    )
newdata$score87  <- zoo::na.locf(newdata$score87    )
newdata$score88  <- zoo::na.locf(newdata$score88    )
newdata$score89  <- zoo::na.locf(newdata$score89    )
newdata$score90  <- zoo::na.locf(newdata$score90    )
newdata$score91  <- zoo::na.locf(newdata$score91    )
newdata$score92  <- zoo::na.locf(newdata$score92    )
newdata$score93  <- zoo::na.locf(newdata$score93    )
newdata$score94  <- zoo::na.locf(newdata$score94    )
newdata$score95  <- zoo::na.locf(newdata$score95    )
newdata$score96  <- zoo::na.locf(newdata$score96    )
newdata$score97  <- zoo::na.locf(newdata$score97    )
newdata$score98  <- zoo::na.locf(newdata$score98    )
newdata$score99  <- zoo::na.locf(newdata$score99    )
newdata$score100     <- zoo::na.locf(newdata$score100   )

write.table(newdata, "outputR")

I would be grateful if anybody could help me fix the "by=1" error. Here are my datafiles. You might also notice that I am not used to writing loops in R, so I just copied everything 100 times, which is probably not the easiest way.

--UPDATE

Oh, I now think this is because merge only accepts two data frames at a time, so I should use:

newdata<- merge(data1[,1:2],data2[,1:2],by=1,all=TRUE)
newdata<- merge(newdata[,1:3],data3[,1:2],by=1,all=TRUE)

for each of the elements...

2 Answers
一夜七次
#2 · 2019-06-08 14:53

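The merge function only takes two data frames as input (x and y). Any further data frames are matched positionally to its remaining arguments, so the third one ends up as by.x, which is what fix.by(by.x, x) rejects. You therefore have to merge them one at a time: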

newdata<- merge(data1[,1:2],data2[,1:2],by=1,all=TRUE)
newdata<- merge(newdata [,1:3   ],data3 [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:4   ],data4 [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:5   ],data5 [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:6   ],data6 [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:7   ],data7 [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:8   ],data8 [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:9   ],data9 [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:10  ],data10    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:11  ],data11    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:12  ],data12    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:13  ],data13    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:14  ],data14    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:15  ],data15    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:16  ],data16    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:17  ],data17    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:18  ],data18    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:19  ],data19    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:20  ],data20    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:21  ],data21    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:22  ],data22    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:23  ],data23    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:24  ],data24    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:25  ],data25    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:26  ],data26    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:27  ],data27    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:28  ],data28    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:29  ],data29    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:30  ],data30    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:31  ],data31    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:32  ],data32    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:33  ],data33    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:34  ],data34    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:35  ],data35    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:36  ],data36    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:37  ],data37    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:38  ],data38    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:39  ],data39    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:40  ],data40    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:41  ],data41    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:42  ],data42    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:43  ],data43    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:44  ],data44    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:45  ],data45    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:46  ],data46    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:47  ],data47    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:48  ],data48    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:49  ],data49    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:50  ],data50    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:51  ],data51    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:52  ],data52    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:53  ],data53    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:54  ],data54    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:55  ],data55    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:56  ],data56    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:57  ],data57    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:58  ],data58    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:59  ],data59    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:60  ],data60    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:61  ],data61    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:62  ],data62    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:63  ],data63    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:64  ],data64    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:65  ],data65    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:66  ],data66    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:67  ],data67    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:68  ],data68    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:69  ],data69    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:70  ],data70    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:71  ],data71    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:72  ],data72    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:73  ],data73    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:74  ],data74    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:75  ],data75    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:76  ],data76    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:77  ],data77    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:78  ],data78    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:79  ],data79    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:80  ],data80    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:81  ],data81    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:82  ],data82    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:83  ],data83    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:84  ],data84    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:85  ],data85    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:86  ],data86    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:87  ],data87    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:88  ],data88    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:89  ],data89    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:90  ],data90    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:91  ],data91    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:92  ],data92    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:93  ],data93    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:94  ],data94    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:95  ],data95    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:96  ],data96    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:97  ],data97    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:98  ],data98    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:99  ],data99    [,1:2],by=1,all=TRUE        )
newdata<- merge(newdata [,1:100 ],data100   [,1:2],by=1,all=TRUE        )
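
To see the cause of the error on a small scale, here is a minimal sketch with three made-up data frames (d1, d2 and d3 are hypothetical stand-ins for data1, data2 and data3, not the real files). It assumes only base R and shows that passing a third data frame fails, while merging two at a time succeeds:

```r
# Toy data frames; names and values are invented for illustration only.
d1 <- data.frame(tm = c(1, 4),  score1 = c(100, 90))
d2 <- data.frame(tm = c(1, 2),  score2 = c(150, 85))
d3 <- data.frame(tm = c(2, 10), score3 = c(70, 60))

# A third data frame is matched positionally to by.x, so the call fails.
msg <- tryCatch(
  {
    merge(d1, d2, d3, by = 1, all = TRUE)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Merging two data frames at a time works.
d12  <- merge(d1, d2, by = 1, all = TRUE)
d123 <- merge(d12, d3, by = 1, all = TRUE)
print(d123)
```

The second part is the same pattern as the long listing above, just with three inputs instead of 100.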
劫难
#3 · 2019-06-08 14:53

Google returns this page as the first result for the R error "Error in fix.by(by.x, x)", and I found it interesting that no answer attempted a loop solution for the OP's very long list of commands.

For future readers who need to merge many data frames: consider building a list of the individual data frames with lapply(), running any needed calculations on each one, and then calling Reduce() with merge to combine every element of the list into one wide data frame. The code below processes and merges all 100 files from the original post:

library(zoo)

# Read each file, clean it, and suffix the column names with the file index
dfList <- lapply(1:100, function(i) {
   df <- read.table(paste0("rundata", i), sep=" ", col.names=c("tm","score","current"))
   df <- df[!is.na(df$tm),]
   df$score <- zoo::na.locf(df$score)
   colnames(df) <- paste0(colnames(df), i)
   return(df)
})

# Merge the list pairwise on the first column, keeping all rows
newdata <- Reduce(function(...) merge(..., by=1, all=TRUE), dfList)

write.table(newdata, "outputR")
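
Note that an outer merge introduces new NAs wherever a time value is missing from one of the inputs, so the carry-forward step from the question may still be needed after merging. The following is a rough sketch of that step on made-up data; the list dfs and the helper locf() are hypothetical and written in base R for self-containment (zoo::na.locf could be used instead, but by default it drops leading NAs):

```r
# Toy inputs standing in for the list of data frames; values are invented.
dfs <- list(
  data.frame(tm = c(1, 4, 7, 10), score1 = c(100, 90, 85, 80)),
  data.frame(tm = c(1, 2, 10),    score2 = c(150, 85, 60)),
  data.frame(tm = c(4, 7),        score3 = c(70, 65))
)

# Outer-join all data frames on the first column, two at a time.
wide <- Reduce(function(x, y) merge(x, y, by = 1, all = TRUE), dfs)

# Illustrative helper (not part of zoo): carry the last non-NA value
# forward and leave leading NAs untouched.
locf <- function(v) {
  # Index of the most recent non-NA element so far (0 if none yet).
  idx <- cummax(ifelse(is.na(v), 0L, seq_along(v)))
  c(NA, v)[idx + 1]
}

# Fill every score column after the merge.
score_cols <- grep("^score", names(wide))
wide[score_cols] <- lapply(wide[score_cols], locf)
print(wide)
```

With these inputs, the first two score columns reproduce the toy result from the question, and score3 keeps NA in the rows before its first observation.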