Join SQL Server tables on a like statement

2020-07-03 06:15发布

问题:

I am hoping this isn't a repeat. I've checked the searches and I can't seem to find a clear answer to this.

I have a table that has it's primary key set to be a UniqueIdentifier. I also have another table that has a varchar column that basically contains a url with a query string that contains guids from my first table.

So my 2 tables are like:

StateTable

StateID                                  StateName
EB06F84C-15B9-4397-98AD-4A63DA2A238E     Active

URLTable

URL
page.aspx?id=EB06F84C-15B9-4397-98AD-4A63DA2A238E

What I'm trying to do is join together URLTable and StateTable ON the value of StateID is contained in URL of URL table. I haven't really figured out the join. I've even tried just selecting the one table and tried to filter by the values in StateTable. I've tried doing something like this:

SELECT  *
FROM URLTable
WHERE     EXISTS
    (SELECT  *
     FROM  StateTable
     WHERE URL LIKE '%' + StateID + '%')

Even that doesn't work because it says I'm comparing uniqueidentifier and varchar.

Is there any way to join 2 tables using a like command and where the like command isn't comparing 2 incompatible variables?

Thank you!!

UPDATE: Let me add some additional things I should have mentioned. The query is for the purposes of building analytics reports. The tables are part of a CMS analytics package... so updating or changing the table structure is not an option.

Secondly, these tables see a very high amount of traffic since they're capturing site analytics... so performance is very much an issue. The 3rd thing is that in my example, I said id= but there may be multiple values such as id=guid&user=guid&date=date.

UPDATE 2: One more thing I just realized to my horror is that sometimes the query string has the dashes removed from the GUID.. and sometimes not.. so unless I"m mistaken, I can't cast the substring to Uniqueidentifier. Can anyone confirm? sigh. I did get it to work using

REPLACE('-','',CONVERT(varchar(50), a.AutomationStateId))

but now I'm very much worried about performance issues with this since the URL's table is very large. This might be the nature of the beast, though, unless there's anything I can do.

回答1:

Cast StateID to a compatible type, e.g.

WHERE URL LIKE '%' + CONVERT(varchar(50), StateID) + '%'

or

WHERE URL LIKE N'%' + CONVERT(nvarchar(50), StateID) + N'%'

if URL is nvarchar(...)

EDIT

As pointed out in another answer, this could result in poor performance on large tables. The LIKE combined with a CONVERT will result in a table scan. This may not be a problem for small tables, but you should consider splitting the URL into two columns if performance becomes a problem. One column would contain 'page.aspx?id=' and the other the UNIQUEIDENTIFIER. Your query could then be optimized much more easily.



回答2:

Do you know that the = is always there and always is a UNIQUEIDENTIFIER. Then you can do this:

WHERE CAST(SUBSTRING(URL, CHARINDEX('=',URL)+1,LEN(URL)) AS UNIQUEIDENTIFIER)=StateID

EDIT

As part of the comment you can also so it with a JOIN. Like this:

select 
   u.* 
from 
   urltable
join statetable s 
   on CAST(SUBSTRING(URL, CHARINDEX('=',URL)+1,LEN(URL)) AS UNIQUEIDENTIFIER)=StateID


回答3:

select u.* from urltable
join statetable s on url like N'%' + (convert(varchar(50),s.stateid) + N'%'

performance is likely to be awful



回答4:

You may get a performance improvement if you build a temp table first, with the option to index the temp table. You could then also modify the schema (of your temp table) which could give you options on your join. Often when joining to BIG tables it helps to extract a subset of data to a temp table first, then join to it. Other times the overhead of the temp table is bigger than using an 'ugly' join