How to use regex for querying in Solr 4

Posted 2019-02-18 14:52

Question:

I've reached the point of desperation, so I'm asking for help. I'm trying to query results from a Solr 4 engine using regex.

Let's assume the document I want to query is:

<str name="text">description: best company; name: roca mola</str>

And I want to query using this regex:

description:(.*)?company(.*)?;

I read in some forums that using regex in Solr 4 was as easy as adding slashes, like:

localhost:8080/solr/q=text:/description\:(.*)?company(.*)?;/

but it isn't working. And this one doesn't work either:

localhost:8080/solr/q=text:/description(.*)?company(.*)?;/

I don't want a simple query like:

localhost:8080/solr/q=text:*company*

Since that would also match documents I don't want, like:

<str name="text">description: my home; name: mother company"</str>

If I'm not clear please let me know.

Cheers from Chile :D

NOTE: I was using text_general fields in my schema. As @arun pointed out, string fields can handle the type of regex I'm using.

Answer 1:

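Instead of running the regex against a text field type, try it on a string field type, since your regex spans more than one word. A text field is tokenized, and the regex is matched against each indexed term separately, so a pattern that crosses word boundaries cannot match. (If your regex only needs to match a single word, you can use a text field.)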
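To see why the field type matters, here is a rough Python simulation of the matching, not Solr itself. It assumes a regex query has to match a whole indexed term, that a string field indexes the entire value as one term, and that a text field is split into lowercase word tokens. The `rough_tokens` helper is a hypothetical stand-in for a real analyzer such as the one behind text_general, and Python's regex dialect is only an approximation of Lucene's.

```python
import re

# Pattern from the question, with a trailing .* because the regex
# has to match the entire term, not just the start of it.
PATTERN = re.compile(r"description:.*?company.*?;.*")

def rough_tokens(value):
    """Hypothetical stand-in for a text analyzer: lowercase, split on non-word characters."""
    return [t for t in re.split(r"\W+", value.lower()) if t]

def matches_string_field(value):
    # A string field is indexed as a single term: the whole value.
    return PATTERN.fullmatch(value) is not None

def matches_text_field(value):
    # A text field is indexed as many terms; the regex is tried on each one.
    return any(PATTERN.fullmatch(t) for t in rough_tokens(value))

doc = "description: best company; name: roca mola"
other = "description: my home; name: mother company"

print(matches_string_field(doc))    # wanted document, string field
print(matches_string_field(other))  # unwanted document, string field
print(matches_text_field(doc))      # wanted document, tokenized field
```

Under these assumptions, only the first call matches: the unwanted document has no ";" after "company", and no single token of the tokenized field contains the ":" and ";" the pattern requires.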

Also percent-encode the special characters, just to make sure they are not the cause of the failed matches.

q=strfield:/description%3A(.*?)company(.*?)%3B.*/

Update: I just tried it on a string field, and the above regex works. It also works without the percent encoding, i.e.:

q=strfield:/description:.*?company.*?;.*/
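If you build the request from code, the encoding can be left to a library. Below is a minimal Python sketch that wraps the regex in slashes and percent-encodes the whole q parameter. The `build_regex_query_url` helper is hypothetical, and the base URL (including the `/select` handler path) is an assumption about your setup; `strfield` is the field name used above.

```python
from urllib.parse import urlencode

def build_regex_query_url(base_url, field, regex):
    """Return a query URL whose q parameter is field:/regex/, percent-encoded."""
    # Solr regex queries are wrapped in forward slashes.
    params = {"q": f"{field}:/{regex}/"}
    # urlencode percent-encodes ':', ';', '*', '?' and '/' in the value.
    return f"{base_url}?{urlencode(params)}"

# Assumed base URL; adjust host, port, core and handler to your installation.
url = build_regex_query_url(
    "http://localhost:8080/solr/select",
    "strfield",
    "description:.*?company.*?;.*",
)
print(url)
```

This encodes more characters than the hand-written query above (the slashes and wildcards too), which is harmless because the server decodes the parameter before parsing the query.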


Tags: regex solr