Longest Prefix Match

2019-07-26 23:03发布

What's the best way to get accurate and fast queries in PostgreSQL for a longest prefix match?

Is it:

A.) select * from table where column in (subselect) ;

B.) select * from table where strpos(column,column2) = 1
    order by length(column2) desc limit 1 ;

C.) select * from table where column ~ column2
    order by length(column2) desc limit 1

I'm planning to use in an update. Any ideas?

1条回答
Animai°情兽
2楼-- · 2019-07-26 23:15

I wouldn't know of a function doing this out of the box in PostgreSQL.
A recursive CTE would be the key element for a rather elegant solution (available in PostgreSQL 8.4 or later).

I am assuming a table filter to hold the filter strings:

CREATE TABLE filter (f_id int, string text);

And a table tbl to be search for the longest match:

CREATE TABLE tbl(t_id int, col text);

Query

WITH RECURSIVE
     f AS (SELECT f_id, string, length(string) AS flen FROM filter)
    ,t AS (SELECT t_id, col, length(col) AS tlen FROM tbl)
    ,x AS (
    SELECT t.t_id, f.f_id, t.col, f.string
          ,2 AS match, LEAST(flen, tlen) AS len
    FROM   t
    JOIN   f ON left(t.col, 1) = left(f.string, 1)

    UNION ALL
    SELECT t_id, f_id, col, string, match + 1, len
    FROM   x
    WHERE  left(col, match) = left(string, match)
    AND    match <= len
    )
SELECT DISTINCT
       f_id
      ,string
      ,first_value(col) OVER w AS col
      ,first_value(t_id) OVER w AS t_id
      ,(first_value(match) OVER w -1) AS longest_match
FROM   x
WINDOW w AS (PARTITION BY f_id ORDER BY match DESC)
ORDER  BY 2,1,3,4;

Detailed explanation how the final SELECT works in this related answer.
Working demo on sqlfiddle.

You did not define which match to pick from a set of equally long matches. I am picking one arbitrary winner from ties.

I'm planning to use in an update.

PostgreSQL 9.1 introduced data modifying CTEs, so you can use this in an UPDATE statement directly.

查看更多
登录 后发表回答