I have df1 like this:
df1 <- data.frame(A=c("x01","x02","y03","z02","x04"), B=c("A01BB01","A02BB02","C02AA05","B04CC10","C01GX02"))
    A       B
1 x01 A01BB01
2 x02 A02BB02
3 y03 C02AA05
4 z02 B04CC10
5 x04 C01GX02
I have df2 like this.
  X     Y
1 a A01BB
2 b   A02
3 c  C02A
4 d   B04
5 e C01GX
df2 <- data.frame(X=c("a","b","c","d","e"), Y=c("A01BB","A02","C02A","B04","C01GX"))
I want to match the first few letters/numbers in df1$B against the values in df2$Y, and then merge the two data frames based on the best match. The expected result is a data frame like this:
    A       B X     Y
1 x01 A01BB01 a A01BB
2 x02 A02BB02 b   A02
3 y03 C02AA05 c  C02A
4 z02 B04CC10 d   B04
5 x04 C01GX02 e C01GX
Could you show me how to do this? Thanks.
The match must occur only at the start of the string: the matched portion must not appear in the middle or at the end of the values in df1$B. Is there an efficient way of doing this in R?
You can use pmatch for this kind of matching:
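A minimal sketch of the pmatch approach, assuming the df1 and df2 from the question. The names idx and result are only illustrative. pmatch treats each value of df2$Y as a prefix and looks it up in df1$B, so a match can only start at the beginning of the string. It returns NA when a prefix matches no entry or more than one entry, so this assumes each prefix identifies exactly one row of df1.

```r
df1 <- data.frame(A = c("x01", "x02", "y03", "z02", "x04"),
                  B = c("A01BB01", "A02BB02", "C02AA05", "B04CC10", "C01GX02"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(X = c("a", "b", "c", "d", "e"),
                  Y = c("A01BB", "A02", "C02A", "B04", "C01GX"),
                  stringsAsFactors = FALSE)

# For each prefix in df2$Y, the position of the unique element of df1$B
# that starts with it (NA if there is none or the prefix is ambiguous).
idx <- pmatch(df2$Y, df1$B)

# Put the matched rows of df1 side by side with df2 (rows follow df2's order).
result <- cbind(df1[idx, ], df2)
result
```

If several rows of df1 can share the same prefix, pmatch will report NA for that prefix, and a different approach (for example, testing each value with startsWith) would be needed.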