SQL Server Join In Order

2020-03-18 04:16发布

问题:

I have 2 string in input for example '1,5,6' and '2,89,9' with same number of element (3 or plus). Those 2 string i want made a "ordinate join" as

1   2
5   89
6   9

i have think to assign a rownumber and made a join between 2 result set as

SELECT a.item, b.item  FROM 
  (
  SELECT  
  ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS rownumber,
  *  FROM dbo.Split('1,5,6',',')
  ) AS a
  INNER JOIN   
  (
  SELECT  
  ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS rownumber,
  *  FROM dbo.Split('2,89,9',',')
 ) AS b ON a.rownumber = b.rownumber 

is that a best practice ever?

回答1:

When dbo.Split() returns the data-set, nothing you do can assign the row_number you want (based on their order in the string) with absolute certainty. SQL never guarantees an ordering without an ORDER BY that actually relates to the data.

With you trick of using (SELECT 0) to order by you may often get the right values. Probably very often. But this is never guaranteed. Once in a while you will get the wrong order.

Your best option is to recode dbo.Split() to assign a row_number as the string is parsed. Only then can you know with 100% certainty that the row_number really does correspond to the item's position in the list.

Then you join them as you suggest, and get the results you want.


Other than that, the idea does seem fine to me. Though you may wish to consider a FULL OUTER JOIN if one list can be longer than the other.



回答2:

You can do it like this as well

Consider your split function like this:

CREATE FUNCTION Split
(
  @delimited nvarchar(max),
  @delimiter nvarchar(100)
) RETURNS @t TABLE
(
  id int identity(1,1),
  val nvarchar(max)
)
AS
BEGIN
  declare @xml xml
  set @xml = N'<root><r>' + replace(@delimited,@delimiter,'</r><r>') + '</r></root>'

  insert into @t(val)
  select 
    r.value('.','varchar(5)') as item
  from @xml.nodes('//root/r') as records(r)

  RETURN
END
GO

The it will be a simple task to JOIN them together. Like this:

SELECT
    *
FROM
    dbo.Split('1,5,6',',') AS a
    JOIN dbo.Split('2,89,9',',') AS b
        ON a.id=b.id

The upside of this is that you do not need any ROW_NUMBER() OVER(ORDER BY SELECT 0)

Edit

As in the comment the performance is better with a recursive split function. So maybe something like this:

CREATE FUNCTION dbo.Split (@s varchar(512),@sep char(1))
RETURNS table
AS
RETURN (
    WITH Pieces(pn, start, stop) AS (
      SELECT 1, 1, CHARINDEX(@sep, @s)
      UNION ALL
      SELECT pn + 1, stop + 1, CHARINDEX(@sep, @s, stop + 1)
      FROM Pieces
      WHERE stop > 0
    )
    SELECT pn,
      SUBSTRING(@s, start, CASE WHEN stop > 0 THEN stop-start ELSE 512 END) AS s
    FROM Pieces
  )
GO

And then the select is like this:

SELECT
    *
FROM
    dbo.Split('1,5,6',',') AS a
    JOIN dbo.Split('2,89,9',',') AS b
        ON a.pn=b.pn


回答3:

Thanks to Arion's suggestion. It's very useful for me. I modified the function a little bit to support varchar(max) type of input string, and max length of 1000 for the delimiter string. Also, added a parameter to indicate if you need the empty string in the final return.

For MatBailie's question, because this is an inline function, you can include the pn column in you outer query which is calling this function.

CREATE FUNCTION dbo.Split (@s nvarchar(max),@sep nvarchar(1000),  @IncludeEmpty bit)
RETURNS table
AS
RETURN (
    WITH Pieces(pn, start, stop) AS (
      SELECT convert(bigint, 1) , convert(bigint, 1), convert(bigint,CHARINDEX(@sep, @s))
      UNION ALL
      SELECT pn + 1, stop + LEN(@sep), CHARINDEX(@sep, @s, stop + LEN(@sep))
      FROM Pieces
      WHERE stop > 0
    )
    SELECT pn,
      SUBSTRING(@s, start, CASE WHEN stop > 0 THEN stop-start ELSE LEN(@s) END) AS s
    FROM Pieces
    where start< CASE WHEN stop > 0 THEN stop ELSE LEN(@s) END + @IncludeEmpty
  )

But I ran into a bit issue with this function when the list intended to return had more than 100 records. So, I created another function purely using string parsing functions:

Create function [dbo].[udf_split] (
    @ListString nvarchar(max),
    @Delimiter  nvarchar(1000),
    @IncludeEmpty bit) 
Returns @ListTable TABLE (ID int, ListValue varchar(max))
AS
BEGIN
    Declare @CurrentPosition int, @NextPosition int, @Item nvarchar(max), @ID int
    Select  @ID = 1,
            @ListString = @Delimiter+ @ListString + @Delimiter,
            @CurrentPosition = 1+LEN(@Delimiter)

    Select  @NextPosition = Charindex(@Delimiter, @ListString, @CurrentPosition)
    While   @NextPosition > 0 Begin

        Select  @Item = Substring(@ListString, @CurrentPosition, @NextPosition-@CurrentPosition)
        If      @IncludeEmpty=1 or Len(LTrim(RTrim(@Item)))>0 Begin 
                Insert Into @ListTable (ID, ListValue) Values (@ID, LTrim(RTrim(@Item)))
                Set @ID = @ID+1
        End
        Select  @CurrentPosition = @NextPosition+LEN(@Delimiter), 
                @NextPosition = Charindex(@Delimiter, @ListString, @CurrentPosition)
    End
    RETURN
END

Hope this could help.