I'm trying to perform a SQLite FTS query with untrusted user input. I do not want to give the user access to the query syntax; that is, they should not be able to perform a match query like foo OR bar AND cats. If they tried to query with that string, I would want to interpret it as something more like foo \OR bar \AND cats.
There doesn't seem to be anything built in to SQLite for this, so I'll probably end up building my own escaping function, but this seems dangerous and error-prone. Is there a preferred way to do this?
OK, I've investigated further, and with some heavy magic you can access the actual tokenizer used by SQLite's FTS. The "simple" tokenizer takes your string, splits it on any character that is not in [A-Za-z0-9], and lowercases what remains. If you perform this same operation you will get a nicely "escaped" string suitable for FTS.
You can write your own, but you can access SQLite's internal one as well. See this question for details on that: Automatic OR queries using SQLite FTS4
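If you do write your own, a minimal sketch of the split-and-lowercase step described above might look like this in Python. The function names are made up for illustration, and this follows only the ASCII rule given in this answer; as far as I know the real "simple" tokenizer also keeps characters with code points of 128 and above, which this sketch drops.

```python
import re

def simple_tokenize(text):
    """Split on anything outside [A-Za-z0-9] and lowercase each piece.

    Hypothetical helper: an approximation of the "simple" tokenizer as
    described in this answer, not SQLite's actual implementation.
    """
    return [piece.lower() for piece in re.split(r"[^A-Za-z0-9]+", text) if piece]

def to_fts_terms(text):
    """Join the tokens with spaces so no operator or punctuation survives."""
    return " ".join(simple_tokenize(text))

if __name__ == "__main__":
    print(to_fts_terms("foo OR bar AND cats"))
```

Because the operators are only recognized in upper case, the lowercased "or" and "and" end up as ordinary search terms.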
The FTS MATCH syntax is its own little language. For FTS5, verbatim string literals are well defined:
Within an FTS expression a string may be specified in one of two ways:
It turns out that correctly escaping a string for an FTS query is simple enough to implement completely and reliably: replace " with "" and enclose the result in " on both ends.
In my case it then works perfectly when I put it into a prepared statement such as SELECT stuff FROM fts_table WHERE fts_table MATCH ? and then .bind(fts_escape(user_input)), where fts_escape is the function I described above.
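For reference, here is a rough sketch of that approach using Python's sqlite3 module. The search helper and the stuff column are placeholders taken from the example statement, not a fixed API, and it assumes your SQLite build includes FTS5.

```python
import sqlite3

def fts_escape(user_input: str) -> str:
    """Wrap arbitrary text in an FTS5 string literal.

    Embedded double quotes are doubled, then the whole thing is enclosed
    in double quotes, as described above.
    """
    return '"' + user_input.replace('"', '""') + '"'

def search(conn: sqlite3.Connection, user_input: str) -> list:
    """Run the MATCH query with the escaped string bound as a parameter.

    fts_table and stuff are hypothetical names from the example statement.
    """
    sql = "SELECT stuff FROM fts_table WHERE fts_table MATCH ?"
    return [row[0] for row in conn.execute(sql, (fts_escape(user_input),))]
```

Note that a quoted string is treated as a phrase, so foo OR bar AND cats only matches rows containing those five tokens in that order. If you want each word to match independently, you could escape the words one at a time and join them with spaces instead.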