I'm using Elasticsearch to query data that was originally exported from several relational databases containing a lot of redundancies. I now want to run queries where a primary attribute and one or more secondary attributes should match. I tried a bool query with a must clause and a should clause, but that doesn't seem to work for my case, which looks like this:
Example:
I have a document with the fullname and street name of a user, and I want to search for similar users in different indices. So the best match for my query should be the best match on the fullname field and the best match on the streetname field. But since the original data has a lot of redundancies and inconsistencies, the fullname field (which I built manually from the fields name1, name2, name3) may contain the same name multiple times, and it seems that Elasticsearch ranks a double match in a must field higher than a match in a should attribute.
That means I want to query for John Doe and Back Street against the following sample data:
{
"fullname" : "John Doe John and Jane",
"street" : "Main Street"
}
{
"fullname" : "John Doe",
"street" : "Back Street"
}
Long story short, I want to query for the main attribute fullname (John Doe) and the secondary attribute street (Back Street), and I want the second document to be the best match, not the first one just because it contains John multiple times.
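For reference, here is a rough sketch of the kind of bool query described above, written as a Python dict so the request body can be printed or passed to a client. It is an approximation rather than my exact query: the helper name build_query is made up for illustration, I'm assuming match queries inside the must and should clauses, and the field names fullname and street are taken from the sample data.

```python
import json


def build_query(fullname, street):
    """Build a bool query body (hypothetical helper, for illustration only).

    Assumption: the primary attribute goes in `must` and the secondary
    attribute in `should`, both as `match` queries.
    """
    return {
        "query": {
            "bool": {
                # Primary attribute: documents have to match on fullname.
                "must": [{"match": {"fullname": fullname}}],
                # Secondary attribute: a match on street only adds to the score.
                "should": [{"match": {"street": street}}],
            }
        }
    }


body = build_query("John Doe", "Back Street")
print(json.dumps(body, indent=2))
```

With a query shaped like this, both sample documents satisfy the must clause, so the ranking depends on how the fullname score and the street score add up.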