PGSQL - Joining two tables on complicated conditio

I got stuck during database migration on PostgreSQL and need your help.

I have two tables that I need to join: drzewa_mateczne.migracja (data I need to migrate) and ibl_as.t_adres_lesny (dictionary I need to join with migracja).

I need to join them on replace(drzewa_mateczne.migracja.adresy_lesne, ' ', '') = replace(ibl_as.t_adres_lesny.adres, ' ', ''). However my data is not very regular, so I want to join it on first good match with the dictionary.

I've created the following query:

select
count(*)

from
drzewa_mateczne.migracja a

where
length(a.adresy_lesne) > 0
and replace(a.adresy_lesne, ' ', '') = (select substr(replace(al.adres, ' ', ''), 1, length(replace(a.adresy_lesne, ' ', ''))) from ibl_as.t_adres_lesny al limit 1)

The query doesn't return any rows. It does successfully join empty rows if ran without

length(a.adresy_lesne) > 0

The two following queries return rows (as expected):

select replace(adres, ' ', '') 
from ibl_as.t_adres_lesny
where substr(replace(adres, ' ', ''), 1, 16) = '16-15-1-13-180-c'
limit 1


select replace(adresy_lesne, ' ', ''), length(replace(adresy_lesne, ' ', '')) 
from drzewa_mateczne.migracja
where replace(adresy_lesne, ' ', '') = '16-15-1-13-180-c'

I'm suspecting that there might be a problem in sub-query inside the 'where' clause in my query. If you guys could help me resolve this issue, or at least point me in the right direction, I'd be very greatful.

Thanks in advance, Jan

标签： postgresql join subquery data-migration

2条回答

可以哭但决不认输i

2楼-- · 2019-08-27 21:06

What you're basically telling the database to do is to get you the count of rows from drzewa_mateczne.migracja that have a non-empty adresy_lesne field that is a prefix of the adres field of a semi-random ibl_as.t_adres_lesny row...

Lose the "limit 1" in the subquery and substitute the "=" with "in" and see if that is what you wanted...

0人赞添加讨论(0) 举报

Juvenile、少年°

3楼-- · 2019-08-27 21:15

You can largely simplify to:

SELECT count(*)
FROM   drzewa_mateczne.migracja a
WHERE  a.adresy_lesne <> ''
AND    EXISTS (
   SELECT 1 FROM ibl_as.t_adres_lesny al 
   WHERE  replace(al.adres, ' ', '')
    LIKE (replace(a.adresy_lesne, ' ', '') || '%')
   )

a.adresy_lesne <> '' does the same as length(a.adresy_lesne) > 0, just faster.
Replace the correlated subquery with an EXISTS semi-join (to get only one match per row).
Replace the complex string construction with a simple LIKE expression.

More information on pattern matching and index support in these related answers:
PostgreSQL LIKE query performance variations
Difference between LIKE and ~ in Postgres
speeding up wildcard text lookups

0人赞添加讨论(0) 举报

PGSQL - Joining two tables on complicated conditio

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间