I'am searching for the technical word "AN-XYZ99". So I use
SELECT *
FROM foo
WHERE CONTAINS(bar, 'AN{-}XYZ99') > 0
but I get also results like "FO-XYZ99" or "BAR-XYZ99". What can I do to ensure the expected result?
I used
BEGIN
CTX_DDL.CREATE_PREFERENCE('FOO','BASIC_LEXER');
CTX_DDL.SET_ATTRIBUTE('FOO', 'ALTERNATE_SPELLING', 'GERMAN');
CTX_DDL.SET_ATTRIBUTE('FOO', 'COMPOSITE', 'GERMAN');
CTX_DDL.SET_ATTRIBUTE('FOO', 'MIXED_CASE', 'NO');
END;
Sample data from column "bar" (VARCHAR2(4000)):
"unbekannt Stadt Text: AN-XYZ99 << foobar Straße 31.12.2017 Datum Host 20160101 foo"
"unbekannt Stadt Text: FO-XYZ99 << foobar Straße 31.12.2017 Datum Host 20160101 bar"
"unbekannt Stadt Text: BAR-XYZ99 << foobar Straße 31.12.2017 Datum Host 20160101 bla"
With the Statement above I would like the first row as output but I get the second and third row as well.
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
First you must define hyphen as a
printjoin
in your lexer.check it with
Then you may (after re-creating index with this lexer) validate that the tokens are as expected: (your table would vary based on the index name; check all tables like 'DR$%$I')
Now you may query for the search string.
Aparently you must escape the hyphen as
BAR-XYZ99
will find rows withBAR
not containingXYZ99
; although the documentation ofhyphen with no space
is a bit different.For some reason (I'm on 11.2.0.2.0) the escaping with curly braces doesn't work (returns no match), but using backslash is fine.