I have keywords like "some-or-other" where the hyphens matter in the search through my mysql database. I'm currently using the fulltext function.
Is there a way to escape the hyphen character?
I know that one option is to comment out #define HYPHEN_IS_DELIM
in the myisam/ftdefs.h file, but unfortunately my host does not allow this. Is there another option out there?
Edit 3-8-11 Here's the code I have right now:
$search_input = $_GET['search_input'];
$keyword_safe = mysql_real_escape_string($search_input);
$keyword_safe_fix = "*'\"" . $keyword_safe . "\"'*";
$sql = "
SELECT *,
MATCH(coln1, coln2, coln3) AGAINST('$keyword_safe_fix') AS score
FROM table_name
WHERE MATCH(coln1, coln2, coln3) AGAINST('$keyword_safe_fix')
ORDER BY score DESC
";
Some people would suggest to use the following query:
But by that you need many variants depending on the used fulltext operators. Task: Realize a query like
+well-known +(>35-hour <39-hour) working week*
. Too complex!And do not forget the default len of
ft_min_word_len
so a search forup-to-date
returns onlydate
in your results.Trick
Because of that I prefer a trick so constructions with
HAVING
etc aren't needed at all:Instead of adding the following text to your database table:
copy the hyphen words without hypens to the end of the text inside a comment:If the users searches for
up-to-date
remove the hyphen in the sql query:MATCH(text) AGAINST('uptodate ' IN BOOLEAN MODE)
By that you're user can find
up-to-date
as one word instead of getting all results that contain onlydate
(becauseft_min_word_len
killsup
andto
).Of course before you
echo
the texts you should remove the<!-- ... -->
comments.Advantages
-well-known +science
MySQL treats that asnot include *well*, could include *known* and must include *science*
. This isn't what the user expected. The trick solves that, too (as the sql query searches for-wellknown +science
)Maybe simpler to use the
Binary
operator.http://dev.mysql.com/doc/refman/5.0/en/cast-functions.html#operator_binary
The
BINARY
operator casts the string following it to a binary string. This is an easy way to force a column comparison to be done byte by byte rather than character by character. This causes the comparison to be case sensitive even if the column is not defined asBINARY
orBLOB
.BINARY
also causes trailing spaces to be significant.This might sound off, but after struggling with this for a while, I realised I get the results I wish for by removing the hyphen from the search expression. For instance, if I search for 'word-separated'
returns instances of 'word-separated' as needed. This also returns other instances of separated and word, but adding the
+
operator to each word achieves the hyphen search.From here http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html
One solution to find a word with a dashes or hyphens in is to use FULL TEXT SEARCH IN BOOLEAN MODE, and to enclose the word with the hyphen / dash in double quotes.
Or from here http://bugs.mysql.com/bug.php?id=2095
There is another workaround. It was recently added to the manual: " Modify a character set file: This requires no recompilation. The true_word_char() macro uses a “character type” table to distinguish letters and numbers from other characters. . You can edit the contents in one of the character set XML files to specify that '-' is a “letter.” Then use the given character set for your FULLTEXT indexes. "
Have not tried it on my own.
Edit: Here is some more additional info from here http://dev.mysql.com/doc/refman/5.0/en/fulltext-boolean.html
A phrase that is enclosed within double quote (“"”) characters matches only rows that contain the phrase literally, as it was typed. The full-text engine splits the phrase into words and performs a search in the FULLTEXT index for the words. Prior to MySQL 5.0.3, the engine then performed a substring search for the phrase in the records that were found, so the match must include nonword characters in the phrase. As of MySQL 5.0.3, nonword characters need not be matched exactly: Phrase searching requires only that matches contain exactly the same words as the phrase and in the same order. For example, "test phrase" matches "test, phrase" in MySQL 5.0.3, but not before.
If the phrase contains no words that are in the index, the result is empty. For example, if all words are either stopwords or shorter than the minimum length of indexed words, the result is empty.