SOLR - Regex in Filter query

Posted 2019-08-02 13:43

Question:

I want to use a regex in an fq (filter query) but have never done so before.

I have the below value in a property and the fieldtype is "lowercase": Prop=company1@city1@state1@country1@senior analytical chemist, chicago

I want to filter the results based on a regex. The regex should match the value above when it starts with "company1@city1@state1@country1@" and the words chicago and analytical appear anywhere after the last @ symbol.

My requirement is to match the exact values before the last @ and then use a regex to match the remaining string, because I want to do free-text search only on the last part. I can't split the data into multiple columns because it's a multi-valued field.

I tried the regex below in my code to match the string after the last @. It works fine in code, but I'm not sure how to implement the same thing in Solr:

/([^@]+(?=.*IL)(?=.*chicago)(?=.*analytical))/ig 

Can someone please tell me how to use the regex above with Solr?

Answer 1:

Regular expressions in Solr are available by searching with q=field:/regex/ (the same syntax works in fq). This assumes that the field in question is a string field (or at least a field with a KeywordTokenizer), because matching happens at the token level and the regex has to match a whole token: if the field is analyzed, the value may be split into separate tokens and the regex won't match.
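To make the token-level point concrete, here is a minimal Python sketch. It is only an illustration: Python's re module is not Lucene's regex engine, and the two tokenizer functions are hypothetical stand-ins for an analyzed text field and for KeywordTokenizer plus lowercasing. re.fullmatch is used to mimic the fact that a regex query has to match an entire indexed token.

```python
import re

def whitespace_tokens(value):
    # Rough stand-in for an analyzed text field: lowercase, split on whitespace.
    return value.lower().split()

def keyword_tokens(value):
    # Rough stand-in for KeywordTokenizer + lowercasing: one single token.
    return [value.lower()]

def any_token_matches(pattern, tokens):
    # The regex must match a whole token, hence fullmatch rather than search.
    return any(re.fullmatch(pattern, tok) for tok in tokens)

VALUE = "company1@city1@state1@country1@senior analytical chemist, chicago"
PATTERN = r"company1@city1@.*chicago.*"

split_hit = any_token_matches(PATTERN, whitespace_tokens(VALUE))
whole_hit = any_token_matches(PATTERN, keyword_tokens(VALUE))
print(split_hit, whole_hit)
```

With the value split into several tokens, no single token contains both the prefix and "chicago", so nothing matches; with the value kept as one token, the same pattern matches.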

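Something like q=field:/([^@]+(?=.*IL)(?=.*chicago)(?=.*analytical))/ is the literal translation, but it needs a few changes. Solr has no /i or /g flags; since the i flag indicates that you don't care about casing, I'd use a field with a KeywordTokenizer and a LowerCaseFilter, and then search with a lowercase regex:

<analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory" />
</analyzer>

As far as I know, Lucene's regex syntax also does not support lookaheads such as (?=...), the pattern has to match the whole token (including everything before the last @), and a bare @ outside a character class can be read as "any string", so it is safer to escape it. With those adjustments, and using the optional & (intersection) operator in place of the lookaheads where your Lucene version enables it, the query would look something like:

q=field:/company1\@city1\@state1\@country1\@(([^@]*il[^@]*)&([^@]*chicago[^@]*)&([^@]*analytical[^@]*))/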
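Since the question asks about fq specifically, here is a hedged Python sketch of how such a filter could be assembled and URL-encoded. It rests on several assumptions: the lookaheads are replaced by Lucene's optional & (intersection) operator, because Lucene's regex syntax has no lookahead as far as I know; @ is escaped outside character classes; the search words are plain alphanumerics that need no regex escaping; and build_fq and the field name "field" are purely illustrative names.

```python
from urllib.parse import urlencode, parse_qs

PREFIX = "company1@city1@state1@country1@"

def build_fq(field, prefix, words):
    # Escape '@' in the literal prefix; a bare '@' may mean "any string".
    escaped_prefix = prefix.replace("@", r"\@")
    # One "[^@]*word[^@]*" group per word, joined with '&' (intersection),
    # so every word must occur somewhere after the last '@'.
    parts = "&".join("([^@]*%s[^@]*)" % w.lower() for w in words)
    return "%s:/%s(%s)/" % (field, escaped_prefix, parts)

fq = build_fq("field", PREFIX, ["chicago", "analytical"])

# urlencode percent-encodes '&', '@' and '\' so the regex survives the URL.
query_string = urlencode({"q": "*:*", "fq": fq})
print(fq)
print(query_string)
```

The important detail is the encoding step: an unencoded & inside the regex would be treated as a parameter separator in the request URL and would cut the filter short.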