Sanitizing a Solr query

Posted 2019-01-23 09:43

Question:

I am using Solr with Ruby on Rails. It's all working well; I just need to know whether there is any existing code to sanitize user input, such as a query starting with ? or *.

Answer 1:

I don't know of any code that does this, but in theory it could be done by reading the query-parsing code in Lucene and searching for throw new ParseException (only 16 matches!).

In practice, I think you're better off just catching any Solr exceptions in your code and showing an "invalid query" message or something similar.
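A minimal Ruby sketch of that catch-and-report approach follows. The names SolrQueryError, safe_search and FAKE_SEARCHER are hypothetical stand-ins, not part of any Solr client; in a real Rails app you would rescue whatever exception class your client library actually raises.

```ruby
# Hypothetical error type standing in for whatever the Solr client raises
# on a malformed query (an assumption, not a real library class).
class SolrQueryError < StandardError; end

# Runs the search through the given callable and turns a query failure
# into a user-facing message instead of letting the exception propagate.
def safe_search(query, searcher)
  { results: searcher.call(query), error: nil }
rescue SolrQueryError
  { results: [], error: "Invalid query" }
end

# Stand-in searcher that rejects a leading wildcard, as in the question.
FAKE_SEARCHER = lambda do |q|
  raise SolrQueryError, "cannot parse '#{q}'" if q.start_with?("*", "?")
  ["doc matching #{q}"]
end

safe_search("rails", FAKE_SEARCHER)   # results present, error is nil
safe_search("*rails", FAKE_SEARCHER)  # results empty, error is "Invalid query"
```

The controller can then render the error string next to the search box rather than returning a 500 page.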

EDIT: Here are a couple of "sanitizers":

  • http://pivotallabs.com/users/zach/blog/articles/937-sanitizing-solr-requests
  • http://github.com/jvoorhis/lucene_query
  • http://e-mats.org/2010/01/escaping-characters-in-a-solr-query-solr-url/


Answer 2:

The Solr Security and Solr Query Syntax wiki pages may be relevant.



Answer 3:

If you are using Solarium with PHP, you can use the Solarium_Escape::term() method:

/**
 * Escape a term
 *
 * A term is a single word.
 * All characters that have a special meaning in a Solr query are escaped.
 *
 * If you want to use the input as a phrase please use the {@link phrase()}
 * method, because a phrase requires much less escaping.
 *
 * @link http://lucene.apache.org/java/docs/queryparsersyntax.html#Escaping%20Special%20Characters
 *
 * @param string $input
 * @return string
 */
static public function term($input)
{
    $pattern = '/(\+|-|&&|\|\||!|\(|\)|\{|}|\[|]|\^|"|~|\*|\?|:|\\\)/';

    return preg_replace($pattern, '\\\$1', $input);
}
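
Since the question is about Ruby on Rails, here is a rough Ruby port of the same idea. The names SOLR_SPECIAL, escape_term and escape_phrase are my own, not from any gem, and the phrase variant (escape only quotes and backslashes, then wrap in quotes) is an assumption based on the docblock's remark that phrases need less escaping. The character list mirrors the PHP pattern above, so it does not cover `/`, which newer Solr versions also treat as special.

```ruby
# Characters and operators with special meaning in the Lucene/Solr query
# syntax; same list as the PHP pattern above.
SOLR_SPECIAL = /(\+|-|&&|\|\||!|\(|\)|\{|\}|\[|\]|\^|"|~|\*|\?|:|\\)/

# Escapes a single term by prefixing each special token with a backslash.
def escape_term(input)
  input.gsub(SOLR_SPECIAL) { |match| "\\#{match}" }
end

# Escapes a phrase: only double quotes and backslashes are escaped,
# then the whole input is wrapped in double quotes.
def escape_phrase(input)
  '"' + input.gsub(/("|\\)/) { |match| "\\#{match}" } + '"'
end

puts escape_term("*foo?")        # prints \*foo\?
puts escape_phrase('say "hi"')   # prints "say \"hi\""
```

Escaping makes the input match literally, so a leading * or ? stops being a wildcard instead of triggering a parse error.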