I have three data frames: barometre2013, barometre2016 and barometre2018.
I've already merged barometre2016 and barometre2018 like this:
baro1618 <- merge(barometre2016, barometre2018, all = TRUE)
This worked: I get all the rows of both data frames, and columns that have the same name in both are merged into a single column. Exactly what I wanted.
The merged table looks like this:
names(baro1618)
[1] "q0qc" "regio" "sexe" "age" "langu" "q1a_1" "q1a_2" "q1a_3" "q1a_4" "q1a_5"
[11] "q1a_6" "q1a_7" "q1a_8" "q1a_9" "q1a_10" "q1b_1" "q1b_2" "q1b_3" "q1b_4" "q1b_5"
[21] "q1b_6" "q1b_7" "q1b_8" "q1b_9" "q1b_10"
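For reference, here is a minimal sketch of that behaviour on toy data. The data frames `a` and `b` and their `id` column are made up for illustration and are not my real data. With no `by` argument, `merge()` joins on every column name the two data frames share, and `all = TRUE` keeps the rows of both:

```r
# Toy stand-ins for barometre2016 / barometre2018 (hypothetical values)
a <- data.frame(id = c(1, 2), sexe = c("F", "M"), q1a_1 = c(3, 4))
b <- data.frame(id = c(3, 4), sexe = c("M", "F"), q1b_1 = c(5, 6))

# No `by` given: merge() uses intersect(names(a), names(b)), here "id" and "sexe"
m <- merge(a, b, all = TRUE)

# All rows of both inputs are kept; columns present in only one input are NA elsewhere
print(names(m))
print(nrow(m))
```

The shared columns appear once in the result, and the non-shared ones (`q1a_1`, `q1b_1`) are both kept.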
Now my problem starts here.
I want to merge baro1618 with barometre2013. Before doing that I have to convert all the column names of barometre2013 to lower case: when I tried merging without this step, the upper-case columns of barometre2013 were not merged with the columns of baro1618 that have the same name in lower case.
The df barometre2013 looks like this:
names(barometre2013)
[229] "POND" "Q1A_1" "Q1A_2" "Q1A_3" "Q1A_4" "Q1A_5" "Q1A_6" "Q1A_7" "Q1A_8" "Q1A_9" "Q1A_10" "Q1B_1"
[241] "Q1B_2" "Q1B_3" "Q1B_4" "Q1B_5" "Q1B_6" "Q1B_7" "Q1B_8" "Q1B_9" "Q1B_10" "Q5A_1" "Q5A_2" "Q5A_3"
So I tried these two ways of lower-casing the names (both work):
barometre2013 <- setnames(barometre2013, tolower(names(barometre2013)))
colnames(barometre2013) <- tolower(colnames(barometre2013))
The result:
[229] "pond" "q1a_1" "q1a_2" "q1a_3" "q1a_4" "q1a_5" "q1a_6" "q1a_7" "q1a_8" "q1a_9" "q1a_10" "q1b_1"
[241] "q1b_2" "q1b_3" "q1b_4" "q1b_5" "q1b_6" "q1b_7" "q1b_8" "q1b_9" "q1b_10" "q5a_1" "q5a_2" "q5a_3"
But when I tried to merge like this:
baro1118 <- merge(baro1618, barometre2013, all = TRUE)
it gives me this error:
Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column
I don't understand why the first merge worked and this second one does not. I can't list the columns by hand, because too many column names match and many others do not.
It should be possible to merge without specifying them, right?
Also, I want to keep all the columns of both data frames: the ones that match and the ones that don't.
Sorry for the long explanation, but I really need an answer, and I've read a lot of Q&As on SO without finding one.
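One thing that may be worth checking, as an assumption that I have not confirmed on the real data: lower-casing could have created duplicate column names in barometre2013 (for example if it contained both `Q1` and `q1`), which would make the default `by` ambiguous. The sketch below reproduces that situation on made-up data; `toy2013`, `toy1618` and their columns are hypothetical:

```r
# Hypothetical stand-in for barometre2013: "Q1" and "q1" collide once lower-cased
toy2013 <- data.frame(Q1 = 1:2, q1 = 3:4, POND = c(0.5, 1.5))
names(toy2013) <- tolower(names(toy2013))

# Column names that now occur more than once
dup <- unique(names(toy2013)[duplicated(names(toy2013))])
print(dup)

# Hypothetical stand-in for baro1618
toy1618 <- data.frame(q1 = 1:2, sexe = c("F", "M"))

# The default `by` is the set of shared names; "q1" matches two columns in toy2013,
# so merge() stops. tryCatch() captures the message instead of aborting the script.
res <- tryCatch(
  merge(toy1618, toy2013, all = TRUE),
  error = function(e) conditionMessage(e)
)
print(res)
```

Running the same `duplicated(names(...))` check on barometre2013 and on baro1618 would show whether this is what happens here. If it returns any names, those columns would presumably have to be renamed or dropped before merging.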