I have a document collection with the following structure:
uid, name
with an index:
db.Collection.createIndex({name: "text"})
It contains the following data:
1, iphone
2, iphóne
3, iphonë
4, iphónë
When I do a text search for iphone, I get only two records, which is unexpected.
Actual output
--------------
1, iphone
2, iphóne
If I search for iphonë:
db.Collection.find( { $text: { $search: "iphonë"} } );
I get:
---------------------
3, iphonë
4, iphónë
But I actually expect the following output for both of these queries:
db.Collection.find( { $text: { $search: "iphone"} } );
db.Collection.find( { $text: { $search: "iphónë"} } );
Expected output
------------------
1, iphone
2, iphóne
3, iphonë
4, iphónë
Am I missing something here?
How can I get the expected output above when searching for iphone or iphónë?
Since MongoDB 3.2, text indexes are diacritic insensitive, so the query from the question, db.Collection.find( { $text: { $search: "iphone"} } ), should in principle return all four documents.
However, there appears to be a bug with the diaeresis ( ¨ ), even though it is categorized as a diacritic in the Unicode 8.0 list (JIRA issue: SERVER-29918).
Solution
Since MongoDB 3.4 you can use a collation, which allows you to perform this kind of query.
For example, to get your expected output, run the following query:
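A minimal sketch of such a query for the mongo shell, assuming the collection from the question and an "en" locale (the locale is an assumption; any locale whose primary strength ignores diacritics should behave the same way). Note that it filters on the name field directly rather than using $text. The typeof guard is only there so the snippet is harmless when loaded outside the shell.

```javascript
// Filter on the plain field value instead of the $text operator.
var filter = { name: "iphone" };

// Assumed collation: English locale, strength 1 (base characters only,
// so diacritics and case are ignored in the comparison).
var collation = { locale: "en", strength: 1 };

// `db` only exists inside the mongo shell; skip the query elsewhere.
if (typeof db !== "undefined") {
  db.Collection.find(filter).collation(collation).forEach(printjson);
}
```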
This returns all four documents from the question (iphone, iphóne, iphonë, iphónë).
In the collation, strength is the level of comparison to perform; strength 1 compares base characters only, ignoring diacritics and case.
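To see what the comparison levels mean without a database, here is a rough illustration in plain JavaScript. It assumes that Intl.Collator (ICU-backed in Node.js) is an acceptable stand-in for MongoDB's collation, with sensitivity "base" playing the role of strength 1 and "accent" the role of strength 2. The matches helper is hypothetical and exists only for this example.

```javascript
// The four names from the question.
const names = ["iphone", "iphóne", "iphonë", "iphónë"];

// Hypothetical helper: return the names that compare equal to `query`
// at the given Intl.Collator sensitivity.
function matches(query, sensitivity) {
  const collator = new Intl.Collator("en", { sensitivity });
  return names.filter((name) => collator.compare(name, query) === 0);
}

// "base": only base letters matter, so accents are ignored.
const primary = matches("iphone", "base");

// "accent": accents are significant, so only the exact spelling matches.
const secondary = matches("iphone", "accent");

console.log(primary);
console.log(secondary);
```

At the base level all four names compare equal to "iphone", while at the accent level only the unaccented spelling does, which is why strength 1 is the level to use here.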