I am using Match (Col1) Against (Val) in mysql.
select match(body) against(body_var) from articles;
now in case of completely match i am getting result as a number (for example 14.43).
what does this number mean? and the main question is can i get the result in percentage form (for example 0.94)
thanks for your help
There is probably a MUCH easier way to do this.. Somehow i fell down the rabbit hole on this one.. But its tested and works (returns percentage of results)
SELECT (mthCount / ttlCount) AS mPercent
FROM (
SELECT COUNT( * ) AS mthCount
FROM articles WHERE (
MATCH(body) AGAINST(body_var)
)
) AS MCount JOIN (
SELECT COUNT( * ) AS ttlCount
FROM articles
) AS TCount;
it returns one record/result with the column mPercent
You could also have it round to two decimal places...
SELECT FORMAT((mthCount / ttlCount),2) AS mPercent
FROM (
SELECT COUNT( * ) AS mthCount
FROM articles WHERE (
MATCH(body) AGAINST(body_var)
)
) AS MCount JOIN (
SELECT COUNT( * ) AS ttlCount
FROM articles
) AS TCount;
As i said.. I tested it against 358 rows with 50 matches
50/350 = 0.1396648... (for first result)
0.14 for rounded result
If you are looking to convert the relevance value to a percent for a single result - it isnt really going to happen...
The relevance value from the MATCH/AGAINST is not a good indicator of percent match.. This is covered in depth throught the internet.. Search for "Convert Relevance value to percent" ...
If you wanted to order your results by relevence match percent, with the first result always having 100% relevence, you can do that...
As for trying to get a value like PHP's similar_text - you are better to offload that work to the client...
Full-text search relevance is measured in?
http://forums.mysql.com/read.php?107,125239,146610#msg-146610
http://seminex.blogspot.com/2005/06/mysql-relevance-in-fulltext-search.html
A work around i came up with is to find the best matches, and use them to have percentage values relative to these maxes, this may not be fully useful in this case, but it sure gives you an idea.
I use this method to look for duplicates, first i insert the row and then i run this query, the best match of course is the same row.
First i have to select the best match :
SELECT
MAX(MATCH (table.col1) AGAINST ('text 1' IN NATURAL LANGUAGE MODE)) AS bscore_col1 ,
MAX(MATCH (table.col2) AGAINST ('text 2'
IN NATURAL LANGUAGE MODE)) AS bscore_col2
FROM table
ORDER BY bscore_name col1 DESC, bscore_col2 DESC) AS bests
You can of course add multiple columns, but you must create the appropriate full text search indexes first.
The full query takes the result from the first query and use them as references, you can change the ratio of 0.5, 0.5 means that the score obtained must be >50% of the best score, if you want to get all the results, remove the comparison expression.
SELECT *,
MATCH (table.col1) AGAINST ('text 1' IN NATURAL LANGUAGE MODE)/bests.bscore_col1 AS score_col1 ,
MATCH (table.col2) AGAINST ('text 2' IN NATURAL LANGUAGE MODE)/bests.bscore_col2 AS score_col2
FROM (table,
(SELECT
MAX(MATCH (table.col1) AGAINST ('text 1' IN NATURAL LANGUAGE MODE)) AS bscore_col1 ,
MAX(MATCH (table.col2) AGAINST ('text 2' IN NATURAL LANGUAGE MODE)) AS bscore_col2
FROM table
ORDER BY bscore_col2 DESC, bscore_col1 DESC) AS bests)
WHERE
MATCH (table.col1) AGAINST ('text 1' IN NATURAL LANGUAGE MODE)/bests.bscore_col1 > 0.5 AND
MATCH (table.col2) AGAINST ('text 2'IN NATURAL LANGUAGE MODE)/bests.bscore_col2 > 0.5
ORDER BY score_col2 DESC, score_col1 DESC
I don't believe this is the best solution, but it does well in my case.