Way to try multiple SELECTs till a result is available

Posted 2020-01-25 02:44

What if I want to search for a single row in a table with decreasing precision, e.g. like this:

SELECT * FROM image WHERE name LIKE 'text' AND group_id = 10 LIMIT 1

When this gives me no result, try this one:

SELECT * FROM image WHERE name LIKE 'text' LIMIT 1

And when this gives me no result, try this one:

SELECT * FROM image WHERE group_id = 10 LIMIT 1

Is it possible to do that with just one expression?

Also there arises a problem when I have not two but e.g. three or more search parameters. Is there a generic solution for that? Of course it would come in handy when the search result is sorted by its relevance.

4 answers
一夜七次
#2 · 2020-01-25 03:20

LIKE without a wildcard character is equivalent to =, so I am assuming you actually meant name = 'text'.

Indexes are the key to performance.

Test setup

CREATE TABLE image (
  image_id serial PRIMARY KEY
, group_id int NOT NULL
, name     text NOT NULL
);

Ideally, you create two indexes (in addition to the primary key):

CREATE INDEX image_name_grp_idx ON image (name, group_id);
CREATE INDEX image_grp_idx ON image (group_id);

The second may not be necessary, depending on data distribution and other details.

Query

This should be the fastest possible query for your case:

SELECT * FROM image WHERE name = 'name105' AND group_id = 10
UNION ALL
SELECT * FROM image WHERE name = 'name105'
UNION ALL
SELECT * FROM image WHERE group_id = 10
LIMIT  1;

SQL Fiddle.

The LIMIT clause applies to the whole query. Postgres is smart enough not to execute later legs of the UNION ALL as soon as it has found enough rows to satisfy the LIMIT. Consequently, for a match in the first SELECT of the query, the output of EXPLAIN ANALYZE looks like this:

Limit  (cost=0.00..0.86 rows=1 width=40) (actual time=0.045..0.046 rows=1 loops=1)
  Buffers: local hit=4
  ->  Result  (cost=0.00..866.59 rows=1002 width=40) (actual time=0.042..0.042 rows=1 loops=1)
        Buffers: local hit=4
        ->  Append  (cost=0.00..866.59 rows=1002 width=40) (actual time=0.039..0.039 rows=1 loops=1)
              Buffers: local hit=4
              ->  Index Scan using image_name_grp_idx on image  (cost=0.00..3.76 rows=2 width=40) (actual time=0.035..0.035 rows=1 loops=1)
                    Index Cond: ((name = 'name105'::text) AND (group_id = 10))
                    Buffers: local hit=4
              ->  Index Scan using image_name_grp_idx on image  (cost=0.00..406.36 rows=500 width=40) (never executed)
                    Index Cond: (name = 'name105'::text)
              ->  Index Scan using image_grp_idx on image  (cost=0.00..406.36 rows=500 width=40) (never executed)
                    Index Cond: (group_id = 10)
Total runtime: 0.087 ms

Note the two later legs marked (never executed).

Do not add an ORDER BY clause; that would void the effect, because Postgres would then have to consider all rows before returning the top row.
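To try the fallback order without a Postgres server, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. SQLite, the sample rows and the first_match helper are all assumptions made for illustration. SQLite happens to run the legs of a UNION ALL from left to right, but it does not reproduce the Postgres plan shown above.

```python
import sqlite3

# In-memory SQLite database standing in for Postgres (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE image ("
    " image_id INTEGER PRIMARY KEY,"
    " group_id INTEGER NOT NULL,"
    " name TEXT NOT NULL)"
)
# Made-up sample rows.
conn.executemany(
    "INSERT INTO image (image_id, group_id, name) VALUES (?, ?, ?)",
    [(1, 10, "other"), (2, 20, "name105"), (3, 10, "name105")],
)

# Same shape as the query above, with bound parameters instead of literals.
FALLBACK_SQL = """
SELECT * FROM image WHERE name = :name AND group_id = :group_id
UNION ALL
SELECT * FROM image WHERE name = :name
UNION ALL
SELECT * FROM image WHERE group_id = :group_id
LIMIT 1
"""

def first_match(name, group_id):
    """Return the first row produced by the fallback query, or None."""
    params = {"name": name, "group_id": group_id}
    return conn.execute(FALLBACK_SQL, params).fetchone()

exact = first_match("name105", 10)     # the first leg matches row 3
by_name = first_match("name105", 99)   # only the name leg can match
by_group = first_match("missing", 10)  # only the group leg can match
nothing = first_match("missing", 99)   # no leg matches
```

Each call returns at most one row, taken from the most specific leg that has a match.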

Final questions

Is there a generic solution for that?

This is the generic solution. Add as many SELECT statements as you want.
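For more than two search parameters, the legs can be generated instead of written by hand. The following is a rough sketch, not a definitive implementation: build_fallback_query is a hypothetical helper, the extra kind column is invented for the example, and SQLite (via Python's sqlite3 module) again stands in for Postgres.

```python
import sqlite3

def build_fallback_query(table, levels):
    """Build one UNION ALL query from `levels`, most specific first.

    `levels` is a list of dicts mapping column name -> value. Table and
    column names are interpolated into the SQL text, so they must come
    from trusted code, never from user input. Values are bound as
    positional parameters.
    """
    legs, params = [], []
    for conditions in levels:
        where = " AND ".join(f"{column} = ?" for column in conditions)
        legs.append(f"SELECT * FROM {table} WHERE {where}")
        params.extend(conditions.values())
    return "\nUNION ALL\n".join(legs) + "\nLIMIT 1", params

# Demo table with a third, hypothetical search column `kind`.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE image ("
    " image_id INTEGER PRIMARY KEY,"
    " group_id INTEGER NOT NULL,"
    " name TEXT NOT NULL,"
    " kind TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO image VALUES (?, ?, ?, ?)",
    [(1, 10, "a", "png"), (2, 20, "name105", "jpg"), (3, 10, "name105", "gif")],
)

# Search levels in decreasing precision; the caller decides the order.
levels = [
    {"name": "name105", "group_id": 10, "kind": "png"},
    {"name": "name105", "group_id": 10},
    {"name": "name105"},
    {"group_id": 10},
]
sql, params = build_fallback_query("image", levels)
row = conn.execute(sql, params).fetchone()
```

Here no row satisfies all three conditions, so the result comes from the second leg. Which combinations to try, and in which order, is a design decision the helper leaves to the caller.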

Of course it would come in handy when the search result is sorted by its relevance.

There is only one row in the result with LIMIT 1. Kind of voids sorting.

够拽才男人
#3 · 2020-01-25 03:23

It's late and I don't feel like writing out a full solution, but if I needed this I would probably create a custom function that returned a custom type, a record or a table (depending on what your needs are). The advantage of this would be that once you found your record, you could stop.

Making the number of parameters dynamic will make it a bit more challenging. Depending on your version of PostgreSQL (and the extensions available to you), you might be able to pass in an hstore or json value and build the query dynamically.

Maybe not the greatest SO answer, but it's more than a comment and hopefully some food for thought.

forever°为你锁心
#4 · 2020-01-25 03:30
SELECT *
FROM  (
   SELECT *
        , CASE WHEN name LIKE 'text' AND group_id = 10 THEN 1
               WHEN name LIKE 'text' THEN 2
               WHEN group_id = 10 THEN 3
               ELSE 4
          END AS ImageRank
   FROM   image
   ) ranked
-- The alias ImageRank is not visible in the WHERE clause of the SELECT that defines it, hence the subquery.
WHERE  ImageRank <> 4
ORDER  BY ImageRank
LIMIT  1;

This is only a sketch of the approach; I'm not entirely sure the syntax fits your scenario.
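The ranking idea can be tried out with a small sketch that returns every matching row sorted by relevance rather than just the top one. It assumes SQLite through Python's sqlite3 module as a stand-in for Postgres, uses made-up rows, and replaces LIKE 'text' with = :name, which matches the same rows when the pattern has no wildcards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE image ("
    " image_id INTEGER PRIMARY KEY,"
    " group_id INTEGER NOT NULL,"
    " name TEXT NOT NULL)"
)
# Made-up sample rows; row 4 matches neither condition.
conn.executemany(
    "INSERT INTO image VALUES (?, ?, ?)",
    [(1, 10, "other"), (2, 20, "text"), (3, 10, "text"), (4, 30, "unrelated")],
)

# The rank is computed in a subquery so the outer WHERE can filter on it.
RANKED_SQL = """
SELECT image_id, image_rank
FROM (
    SELECT image_id,
           CASE WHEN name = :name AND group_id = :group_id THEN 1
                WHEN name = :name THEN 2
                WHEN group_id = :group_id THEN 3
                ELSE 4
           END AS image_rank
    FROM image
) AS ranked
WHERE image_rank <> 4
ORDER BY image_rank, image_id
"""

ranked = conn.execute(RANKED_SQL, {"name": "text", "group_id": 10}).fetchall()
```

Because of the ORDER BY, every candidate row has to be ranked before the first one can be returned, so this trades the early exit of the UNION ALL form for a relevance-sorted result.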

Bombasti
#5 · 2020-01-25 03:33

I don't think there is anything wrong with running these queries separately until you find the result you want. While there are ways to combine these into one query, those end up being more complicated and slower, which isn't what you wanted.

You should consider running all of the queries in one transaction, probably best at the repeatable-read isolation level, so you get consistent results and also avoid the overhead of setting up repeated transactions. If in addition you make judicious use of prepared statements, you will have almost the same overhead as running all three queries in one combined statement.
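As a rough illustration of the separate-queries approach, the sketch below tries each query in turn inside a single transaction and stops at the first hit. It is written against Python's sqlite3 module as a stand-in, which is an assumption: SQLite has no REPEATABLE READ setting, so with a Postgres driver you would instead open the transaction with BEGIN ISOLATION LEVEL REPEATABLE READ. The find_image helper and the sample rows are hypothetical.

```python
import sqlite3

# Queries in decreasing precision, each limited to one row.
QUERIES = [
    "SELECT * FROM image WHERE name = :name AND group_id = :group_id LIMIT 1",
    "SELECT * FROM image WHERE name = :name LIMIT 1",
    "SELECT * FROM image WHERE group_id = :group_id LIMIT 1",
]

def find_image(conn, name, group_id):
    """Run the queries in order inside one transaction.

    Returns (step, row) for the first query that yields a row,
    or (None, None) if none of them does.
    """
    params = {"name": name, "group_id": group_id}
    conn.execute("BEGIN")
    try:
        for step, sql in enumerate(QUERIES, start=1):
            rows = conn.execute(sql, params).fetchall()
            if rows:
                return step, rows[0]
        return None, None
    finally:
        # Read-only work, so ending the transaction with COMMIT is enough.
        conn.execute("COMMIT")

# isolation_level=None leaves transaction control to the explicit BEGIN/COMMIT.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE image ("
    " image_id INTEGER PRIMARY KEY,"
    " group_id INTEGER NOT NULL,"
    " name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO image VALUES (?, ?, ?)",
    [(1, 10, "other"), (2, 20, "name105")],
)

hit_by_name = find_image(conn, "name105", 10)
hit_by_group = find_image(conn, "zzz", 10)
no_hit = find_image(conn, "zzz", 99)
```

The returned step number tells the caller how precise the match was, which a single combined statement does not report by itself.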
