可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I have the below select statement. I am using a LEFT JOIN to push two tables together. There are two conditions where the LEFT JOIN should work:
Condition 1: ATTOM_ID. ATTOM ID is the unique Identifier
Condition 2: ZIP, LAST NAME, and FULL ADDRESS. These fields are string fields and all three must match to be a JOIN.
Any other condition should result in a NULL, hence the LEFT JOIN. If either condition passes, the JOIN should occur, which is why I wanted an OR statement here.
for some reason Google Big Query does not like the query because I have an OR in it. The error I get is:
LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join.
Here is the SQL Statement. All other issues on this statement work. Is there a GBQ limitation on using LEFT JOINS and the "OR" operator? Thanks.
SELECT
Source,
FirstName,
LastName,
MiddleName,
Gender,
Age,
DOB,
Address,
Address2,
City,
State,
Zip,
Zip4,
TimeZone,
Income,
HomeValue,
Networth,
MaritalStatus,
IsRenter,
HasChildren,
CreditRating,
Investor,
LinesOfCredit,
InvestorRealEstate,
Traveler,
Pets,
MailResponder,
Charitable,
PolicalDonations,
PoliticalParty,
coalesce(P.ATTOM_ID, T.ATTOM_ID) as ATTOM_ID,
coalesce(P.GEOID, T.GEOID) as GEOID,
Score,
Score1,
Score2,
Score3,
Score4,
Score5,
PropertyLatitude AS Latitude,
PropertyLongitude AS Longitude
FROM `db.ds.table1` P
LEFT JOIN `db.ds.table2` T
ON 1 =
CASE
WHEN (P.ATTOM_ID = T.ATTOM_ID)
THEN 1
WHEN P.Zip = T. PropertyAddressZIP
AND (
LOWER(P.LastName) = LOWER(T.DeedOwner1NameLast)
OR LOWER(P.LastName) = LOWER(T.PartyOwner1NameLast)
)
AND (
STRPOS(LOWER(P.Address), LOWER(T.PropertyAddressFull) ) > 0
OR STRPOS(LOWER(T.PropertyAddressFull), LOWER(P.Address) ) > 0
)
AND IFNULL(T.PropertyAddressFull,'') != ''
THEN 1
ELSE 0 END
回答1:
Maybe a different approach? Can you split the OR
in your JOIN
condition, create two INNER JOIN
queries and UNION
them together?
SELECT
Source,
FirstName,
LastName,
MiddleName,
Gender,
Age,
DOB,
Address,
Address2,
City,
State,
Zip,
Zip4,
TimeZone,
Income,
HomeValue,
Networth,
MaritalStatus,
IsRenter,
HasChildren,
CreditRating,
Investor,
LinesOfCredit,
InvestorRealEstate,
Traveler,
Pets,
MailResponder,
Charitable,
PolicalDonations,
PoliticalParty,
coalesce(P.ATTOM_ID, T.ATTOM_ID) as ATTOM_ID,
coalesce(P.GEOID, T.GEOID) as GEOID,
Score,
Score1,
Score2,
Score3,
Score4,
Score5,
PropertyLatitude AS Latitude,
PropertyLongitude AS Longitude
FROM `db.ds.Table1` P
INNER JOIN `db.ds.Table2` T ON (P.ATTOM_ID = T.ATTOM_ID)
UNION
SELECT
Source,
FirstName,
LastName,
MiddleName,
Gender,
Age,
DOB,
Address,
Address2,
City,
State,
Zip,
Zip4,
TimeZone,
Income,
HomeValue,
Networth,
MaritalStatus,
IsRenter,
HasChildren,
CreditRating,
Investor,
LinesOfCredit,
InvestorRealEstate,
Traveler,
Pets,
MailResponder,
Charitable,
PolicalDonations,
PoliticalParty,
coalesce(P.ATTOM_ID, T.ATTOM_ID) as ATTOM_ID,
coalesce(P.GEOID, T.GEOID) as GEOID,
Score,
Score1,
Score2,
Score3,
Score4,
Score5,
PropertyLatitude AS Latitude,
PropertyLongitude AS Longitude
FROM `db.ds.Table1` P
INNER JOIN `db.ds.Table2` T ON P.Zip = T. PropertyAddressZIP
AND (
LOWER(P.LastName) = LOWER(T.DeedOwner1NameLast)
OR LOWER(P.LastName) = LOWER(T.PartyOwner1NameLast)
)
AND (
STRPOS(LOWER(P.Address), LOWER(T.PropertyAddressFull) ) > 0
OR STRPOS(LOWER(T.PropertyAddressFull), LOWER(P.Address) ) > 0
)
AND IFNULL(T.PropertyAddressFull,'') != ''
回答2:
Below is simplified example of your issue
#standardSQL
WITH `db.ds.Table1` AS (
SELECT NULL id, '12345' zip, 'abc' name UNION ALL
SELECT 2, '23456', 'vwu' UNION ALL
SELECT 4 id, '12347' zip, 'abd' name
), `db.ds.Table2` AS (
SELECT 2 id, '12346' zip, 'xyz' name UNION ALL
SELECT 3, '12345' zip, 'abc' name
)
SELECT p, t FROM `db.ds.Table1` p
LEFT JOIN `db.ds.Table2` t
ON p.id = t.id OR p.zip = t.zip
it produces Error: LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join.
You can rewrite above as
#standardSQL
WITH `db.ds.Table1` AS (
SELECT NULL id, '12345' zip, 'abc' name UNION ALL
SELECT 2, '23456', 'vwu' UNION ALL
SELECT 4 id, '12347' zip, 'abd' name
), `db.ds.Table2` AS (
SELECT 2 id, '12346' zip, 'xyz' name UNION ALL
SELECT 3, '12345' zip, 'abc' name
)
SELECT
COALESCE(p.id, t.id) AS id,
p.zip,
p.name
FROM (
SELECT ANY_VALUE(p) p , ANY_VALUE(IF(p.id = t.id OR p.zip = t.zip, t, NULL)) t
FROM `db.ds.Table1` p
CROSS JOIN `db.ds.Table2` t
GROUP BY TO_JSON_STRING(p)
)
here you are simply moving all your conditions inside IF() function and replace LEFT JOIN with CROSS JOIN
result is
Row id zip name
1 3 12345 abc
2 2 23456 vwu
3 4 12347 abd
as you can see - you are not missing id=4 from Table1
Hope, you can adopt this to your specific query (should be straight forward copy-pasting)
回答3:
Try moving the last part of the join condition to a WHERE clause:
SELECT
Source,
<lots_of_columns>
FROM `db.ds.Table1` P
LEFT JOIN `db.ds.Table2` T
ON (P.ATTOM_ID = T.ATTOM_ID)
OR (
P.Zip = T. PropertyAddressZIP
AND ( LOWER(P.LastName) = LOWER(T.DeedOwner1NameLast)
OR LOWER(P.LastName) = LOWER(T.PartyOwner1NameLast))
AND ( STRPOS(LOWER(P.Address), LOWER(T.PropertyAddressFull) ) > 0
OR STRPOS(LOWER(T.PropertyAddressFull), LOWER(P.Address) ) > 0 )
)
WHERE IFNULL(T.PropertyAddressFull,'') != '';
That may be where the issue is coming from, since it doesn't reference both tables in the join.
回答4:
Maybe try using a CASE
as the join clause?
SELECT
Source,
FirstName,
LastName,
MiddleName,
Gender,
Age,
DOB,
Address,
Address2,
City,
State,
Zip,
Zip4,
TimeZone,
Income,
HomeValue,
Networth,
MaritalStatus,
IsRenter,
HasChildren,
CreditRating,
Investor,
LinesOfCredit,
InvestorRealEstate,
Traveler,
Pets,
MailResponder,
Charitable,
PolicalDonations,
PoliticalParty,
coalesce(P.ATTOM_ID, T.ATTOM_ID) as ATTOM_ID,
coalesce(P.GEOID, T.GEOID) as GEOID,
Score,
Score1,
Score2,
Score3,
Score4,
Score5,
PropertyLatitude AS Latitude,
PropertyLongitude AS Longitude
FROM `db.ds.Table1` P
LEFT JOIN `db.ds.Table2` T ON 1 =
CASE
WHEN (P.ATTOM_ID = T.ATTOM_ID)
THEN 1
WHEN P.Zip = T. PropertyAddressZIP
AND (
LOWER(P.LastName) = LOWER(T.DeedOwner1NameLast)
OR LOWER(P.LastName) = LOWER(T.PartyOwner1NameLast)
)
AND (
STRPOS(LOWER(P.Address), LOWER(T.PropertyAddressFull) ) > 0
OR STRPOS(LOWER(T.PropertyAddressFull), LOWER(P.Address) ) > 0
)
AND IFNULL(T.PropertyAddressFull,'') != ''
THEN 1
ELSE 0 END