Is there an easy way to identify a common pattern that two strings share? Here is a small example to make clear what I mean:
I have two variables, each containing a string. Both include the same pattern ("ABC") along with some "noise".
a <- "xxxxxxxxxxxABCxxxxxxxxxxxx"
b <- "yyyyyyyyyyyyyyyyyyyyyyyABC"
Let's say I don't know the common pattern and I want R to find out that both strings contain "ABC". How can I do this?
*edit
The first example was maybe a bit too simplistic. Here is an example from my real data.
a <- "DUISBURG-HAMBORNS"
b <- "DUISBURG (-31.7.29)S"
Both strings contain "DUISBURG", which is what I want the function to identify.
*edit
I tried the solution proposed in the link posted in the comments, but it still does not give me exactly what I want.
library(qualV)
LCS(strsplit(a[1], '')[[1]],strsplit(b[1], '')[[1]])$LCS
[1] "D" "U" "I" "S" "B" "U" "R" "G" "-" " " " " "S"
If the function is looking for the longest common subsequence of the two vectors, why does it not stop after "D" "U" "I" "S" "B" "U" "R" "G"?
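To make the distinction concrete: a subsequence may skip over characters, whereas what I am after seems to be the longest contiguous run (a common substring). Below is a minimal base-R sketch of that contiguous version, assuming a simple dynamic-programming approach is acceptable for short strings. The name `longest_common_substring` is a hypothetical helper of my own, not a function from qualV.

```r
# Hypothetical helper (not part of qualV): returns the longest contiguous
# run of characters shared by two strings, using dynamic programming.
longest_common_substring <- function(a, b) {
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  if (length(x) == 0 || length(y) == 0) return("")

  # m[i + 1, j + 1] holds the length of the common run ending at x[i] and y[j]
  m <- matrix(0L, nrow = length(x) + 1, ncol = length(y) + 1)
  best_len <- 0L  # length of the longest run found so far
  best_end <- 0L  # index in x where that run ends

  for (i in seq_along(x)) {
    for (j in seq_along(y)) {
      if (x[i] == y[j]) {
        m[i + 1, j + 1] <- m[i, j] + 1L
        if (m[i + 1, j + 1] > best_len) {
          best_len <- m[i + 1, j + 1]
          best_end <- i
        }
      }
    }
  }

  if (best_len == 0L) return("")
  paste(x[(best_end - best_len + 1):best_end], collapse = "")
}

a <- "DUISBURG-HAMBORNS"
b <- "DUISBURG (-31.7.29)S"
longest_common_substring(a, b)
```

If several runs tie for the longest length, this sketch keeps the first one it encounters in `a`.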