Sort string as number in SQL Server

Posted 2020-02-13 13:43

Question:

I have a column that contains data like this. Dashes indicate multiple copies of the same invoice, and these have to be sorted in ascending order.

790711
790109-1
790109-11
790109-2

I have to sort it in increasing order by this number, but since this is a varchar field it sorts in alphabetical order, like this:

790109-1
790109-11
790109-2
790711

To fix this, I tried replacing the dash (-) with an empty string, casting the result to a number, and then sorting on that:

select cast(replace(invoiceid,'-','') as decimal) as invoiceSort...............order by invoiceSort asc

While this is better, it sorts like this:

            invoiceSort
790711      (790711)   <-----this is wrong now as it should come later than 790109
790109-1    (7901091)
790109-2    (7901092)
790109-11   (79010911)

Someone suggested that I split the invoice ID on the dash (-) and order by the two split parts,

like this: order by split1 asc, split2 asc (790109, 1)

which I think would work, but how would I split the column?

The various split functions on the internet return a table, while in this case I would need a scalar function.

Are there any other approaches that can be used? The data is shown in a GridView, and GridView doesn't support sorting on two columns by default (I can implement it, though :) ), so a simpler approach would be very nice.

EDIT: Thanks for all the answers. While every answer is correct, I have chosen the answer that allowed me to incorporate these columns into the GridView sorting with minimal refactoring of the SQL queries.

Answer 1:

Judicious use of REVERSE, CHARINDEX, and SUBSTRING can get us what we want. I have used hopefully explanatory column names in the code below to illustrate what's going on.

Set up sample data:

DECLARE @Invoice TABLE (
    InvoiceNumber nvarchar(10)
);

INSERT @Invoice VALUES
('790711')
,('790709-1')
,('790709-11')
,('790709-21')
,('790709-212')
,('790709-2')

SELECT * FROM @Invoice

Sample data:

InvoiceNumber
-------------
790711
790709-1
790709-11
790709-21
790709-212
790709-2

And here's the code. I have a nagging feeling the final expressions could be simplified.

SELECT 
    InvoiceNumber
    ,REVERSE(InvoiceNumber) 
        AS Reversed
    ,CHARINDEX('-',REVERSE(InvoiceNumber)) 
        AS HyphenIndexWithinReversed
    ,SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber)) 
        AS ReversedWithoutAffix
    ,SUBSTRING(InvoiceNumber,1+LEN(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))),LEN(InvoiceNumber)) 
        AS AffixIncludingHyphen
    ,SUBSTRING(InvoiceNumber,2+LEN(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))),LEN(InvoiceNumber)) 
        AS AffixExcludingHyphen
    ,CAST(
        SUBSTRING(InvoiceNumber,2+LEN(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))),LEN(InvoiceNumber))
        AS int)  
        AS AffixAsInt
    ,REVERSE(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))) 
        AS WithoutAffix
FROM @Invoice
ORDER BY
    -- WithoutAffix
    REVERSE(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))) 
    -- AffixAsInt
    ,CAST(
        SUBSTRING(InvoiceNumber,2+LEN(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))),LEN(InvoiceNumber))
        AS int)

Output:

InvoiceNumber Reversed   HyphenIndexWithinReversed ReversedWithoutAffix AffixIncludingHyphen AffixExcludingHyphen AffixAsInt  WithoutAffix
------------- ---------- ------------------------- -------------------- -------------------- -------------------- ----------- ------------
790709-1      1-907097   2                         907097               -1                   1                    1           790709
790709-2      2-907097   2                         907097               -2                   2                    2           790709
790709-11     11-907097  3                         907097               -11                  11                   11          790709
790709-21     12-907097  3                         907097               -21                  21                   21          790709
790709-212    212-907097 4                         907097               -212                 212                  212         790709
790711        117097     0                         117097                                                         0           790711

Note that all you actually need is the ORDER BY clause; the rest is just to show my working, which goes like this:

  • Reverse the string, find the hyphen, get the substring after the hyphen, and reverse that part. This is the number without any affix.
  • The length of (the number without any affix) tells us how many characters to drop from the start in order to get the affix including the hyphen. Drop an additional character to get just the numeric part, and convert this to int. Fortunately we get a break from SQL Server in that this conversion gives zero for an empty string.
  • Finally, having got these two pieces, we simply ORDER BY (the number without any affix) and then by (the numeric value of the affix). This is the final order we seek.

The code would be more concise if SQL Server allowed us to say SUBSTRING(value, start) to get the string starting at that point, but it doesn't, so we have to say SUBSTRING(value, start, LEN(value)) a lot.
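One possible simplification, offered as a sketch rather than a drop-in replacement: append a hyphen to the value before searching, so that CHARINDEX always finds one and no REVERSE is needed. This assumes each value contains at most one hyphen (it splits on the first hyphen, whereas the code above splits on the last), and, like the ORDER BY above, it compares the leading part as a string, so it relies on that part having a fixed width.

```sql
-- Sketch only: the table variable mirrors the sample setup above.
DECLARE @Invoice TABLE (InvoiceNumber nvarchar(10));

INSERT @Invoice VALUES
('790711'), ('790709-1'), ('790709-11'), ('790709-21'), ('790709-212'), ('790709-2');

SELECT InvoiceNumber
FROM @Invoice
ORDER BY
    -- Part before the hyphen (the whole value when there is no hyphen).
    LEFT(InvoiceNumber, CHARINDEX('-', InvoiceNumber + '-') - 1),
    -- Part after the hyphen as int; an empty string converts to 0.
    CAST(SUBSTRING(InvoiceNumber,
                   CHARINDEX('-', InvoiceNumber + '-') + 1,
                   LEN(InvoiceNumber)) AS int);
```

Because the appended hyphen is only used for the search, the stored values are never modified.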



Answer 2:

Try this one:

Query:

DECLARE @Invoice TABLE (InvoiceNumber VARCHAR(10))
INSERT @Invoice 
VALUES
      ('790711')
    , ('790709-1')
    , ('790709-21')
    , ('790709-11')
    , ('790709-211')
    , ('790709-2')

;WITH cte AS 
(
    SELECT 
          InvoiceNumber
        , lenght = LEN(InvoiceNumber)
        , delimeter = CHARINDEX('-', InvoiceNumber)
    FROM @Invoice
)
SELECT InvoiceNumber
FROM cte
CROSS JOIN (
    SELECT repl = MAX(lenght - delimeter)
    FROM cte
    WHERE delimeter != 0
) mx
ORDER BY 
      SUBSTRING(InvoiceNumber, 1, ISNULL(NULLIF(delimeter - 1, -1), lenght))
    , RIGHT(REPLICATE('0', repl) + SUBSTRING(InvoiceNumber, delimeter + 1, lenght), repl)

Output:

InvoiceNumber
-------------
790709-1
790709-2
790709-11
790709-21
790709-211
790711
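The idea behind the second sort key is to left-pad every suffix with zeros to a common width, so that comparing the suffixes as strings gives the same order as comparing them as numbers. A minimal sketch that exposes the padded key as a column, assuming a hard-coded width (the @Width variable is illustrative; the query above derives the width from the data instead) and treating a missing suffix as all zeros:

```sql
DECLARE @Invoice TABLE (InvoiceNumber VARCHAR(10));
INSERT @Invoice VALUES ('790711'), ('790709-1'), ('790709-11'), ('790709-2');

-- Assumption: the widest suffix in this sample is 2 characters ('11').
DECLARE @Width int = 2;

SELECT
    InvoiceNumber,
    LEFT(InvoiceNumber, CHARINDEX('-', InvoiceNumber + '-') - 1) AS Prefix,
    -- '1' becomes '01', '2' becomes '02', '11' stays '11', no suffix becomes '00'.
    RIGHT(REPLICATE('0', @Width)
          + SUBSTRING(InvoiceNumber,
                      CHARINDEX('-', InvoiceNumber + '-') + 1,
                      LEN(InvoiceNumber)),
          @Width) AS PaddedSuffix
FROM @Invoice
ORDER BY Prefix, PaddedSuffix;
```

Since both keys are strings, this variant never casts to a number.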


Answer 3:

Try this:

SELECT invoiceid FROM Invoice
ORDER BY 
CASE WHEN PatIndex('%[-]%',invoiceid) > 0
      THEN LEFT(invoiceid,PatIndex('%[-]%',invoiceid)-1)
      ELSE invoiceid END * 1
,CASE WHEN PatIndex('%[-]%',REVERSE(invoiceid)) > 0
      THEN RIGHT(invoiceid,PatIndex('%[-]%',REVERSE(invoiceid))-1)
      ELSE NULL END * 1

SQLFiddle Demo

The above query uses two CASE expressions:

  1. The first sorts on the first part of invoiceid (e.g. 790109 for 790109-1).
  2. The second sorts on the second part of invoiceid, after the '-' (e.g. 1 for 790109-1).

For a detailed understanding, check the SQL Fiddle below.

SQLFiddle Detailed Demo

Or use CHARINDEX:

SELECT invoiceid FROM Invoice
ORDER BY 
CASE WHEN CHARINDEX('-', invoiceid) > 0
      THEN LEFT(invoiceid, CHARINDEX('-', invoiceid)-1)
      ELSE invoiceid END * 1
,CASE WHEN CHARINDEX('-', REVERSE(invoiceid)) > 0
      THEN RIGHT(invoiceid, CHARINDEX('-', REVERSE(invoiceid))-1)
      ELSE NULL END * 1


Answer 4:

Ordering by each part separately is the simplest and most reliable way to go; why look for other approaches? Take a look at this simple query.

select *
from Invoice
order by Convert(int, SUBSTRING(invoiceid, 0, CHARINDEX('-',invoiceid+'-'))) asc,
         -- search invoiceid+'-' here too, so an id without a dash yields '' (converted to 0) rather than the whole id
         Convert(int, SUBSTRING(invoiceid, CHARINDEX('-',invoiceid+'-')+1, LEN(invoiceid))) asc


Answer 5:

Plenty of good answers here, but I think this one might be the most compact ORDER BY clause that is effective:

SELECT *
FROM Invoice
ORDER BY LEFT(InvoiceId,CHARINDEX('-',InvoiceId+'-'))
         ,CAST(RIGHT(InvoiceId,CHARINDEX('-',REVERSE(InvoiceId)+'-'))AS INT)DESC

Demo: - SQL Fiddle

Note, I added the '790709' version to my test, since some of the methods listed here aren't treating the no-suffix version as lesser than the with-suffix versions.
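The DESC on the second key can look surprising, so here is a small sketch that exposes both keys as columns (the table variable and the PrefixKey and SuffixKey aliases are illustrative only). RIGHT keeps the hyphen, so the suffix casts to a negative number: '-1' becomes -1, '-2' becomes -2, and '-11' becomes -11. Sorting those values in descending order therefore yields the suffixes 1, 2, 11.

```sql
DECLARE @Invoice TABLE (InvoiceId varchar(10));
INSERT @Invoice VALUES ('790709'), ('790709-1'), ('790709-11'), ('790709-2');

SELECT
    InvoiceId,
    -- Includes the trailing hyphen when one is present.
    LEFT(InvoiceId, CHARINDEX('-', InvoiceId + '-')) AS PrefixKey,
    -- For '790709-11' this is CAST('-11' AS int), i.e. -11.
    -- For an id without a hyphen it is the whole id as a positive number.
    CAST(RIGHT(InvoiceId, CHARINDEX('-', REVERSE(InvoiceId) + '-')) AS int) AS SuffixKey
FROM @Invoice
ORDER BY PrefixKey, SuffixKey DESC;
```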

If the part of your invoice ID before the '-' varies in length, then you'd need:

SELECT *
FROM Invoice
ORDER BY CAST(LEFT(InvoiceId,CHARINDEX('-',InvoiceId+'-')-1)AS INT)
         ,CAST(RIGHT(InvoiceId,CHARINDEX('-',REVERSE(InvoiceId)+'-'))AS INT)DESC

Demo with varying lengths before the dash: SQL Fiddle



Answer 6:

My version:

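declare @Len int
-- number of digits in the longest suffix (0 if no id contains a dash)
select @Len = isnull(max(len(invoiceid) - charindex('-', invoiceid)), 0)
from MyTable
where charindex('-', invoiceid) > 0

select 
invoiceid ,
-- searching invoiceid + '-' also handles ids without a dash; their empty suffix casts to 0
-- bigint avoids overflow when the prefix is scaled by a power of 10
cast (left(invoiceid, charindex('-', invoiceid + '-') - 1) as bigint) * POWER (cast(10 as bigint), @Len) + 
cast (substring(invoiceid, charindex('-', invoiceid + '-') + 1, len(invoiceid)) as bigint)
from MyTable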

You can implement this as a new column in your table:

ALTER TABLE MyTable ADD invoice_numeric_id bigint null
GO

declare @Len int
-- number of digits in the longest suffix (0 if no id contains a dash)
select @Len = isnull(max(len(invoiceid) - charindex('-', invoiceid)), 0)
from MyTable
where charindex('-', invoiceid) > 0

UPDATE MyTable
SET  invoice_numeric_id = cast (left(invoiceid, charindex('-', invoiceid + '-') - 1) as bigint) * POWER (cast(10 as bigint), @Len) + 
    cast (substring(invoiceid, charindex('-', invoiceid + '-') + 1, len(invoiceid)) as bigint)


Answer 7:

One way is to split InvoiceId into its parts, and then sort on the parts. Here I use a derived table, but it could be done with a CTE or a temporary table as well.

select InvoiceId, InvoiceId1, InvoiceId2
from
(
    select
    InvoiceId,
    substring(InvoiceId, 0, charindex('-', InvoiceId, 0)) as InvoiceId1,
    substring(InvoiceId, charindex('-', InvoiceId, 0)+1, len(InvoiceId)) as InvoiceId2
    FROM Invoice
) tmp
order by
cast((case when len(InvoiceId1) > 0 then InvoiceId1 else InvoiceId2 end) as int),
cast((case when len(InvoiceId1) > 0 then InvoiceId2 else '0' end) as int)

In the above, InvoiceId1 and InvoiceId2 are the component parts of InvoiceId. The outer select includes the parts, but only for demonstration purposes - you do not need to do this in your select.

The derived table (the inner select) grabs the InvoiceId as well as the component parts. The way it works is this:

  • When there is a dash in InvoiceId, InvoiceId1 will contain the first part of the number and InvoiceId2 will contain the second.
  • When there is not a dash, InvoiceId1 will be empty and InvoiceId2 will contain the entire number.

The second case above (no dash) is not optimal because ideally InvoiceId1 would contain the number and InvoiceId2 would be empty. To make the inner select work optimally would decrease the readability of the select. I chose the non-optimal, more readable, approach since it is good enough to allow for sorting.

This is why the ORDER BY clause tests for the length - it needs to handle the two cases above.
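For comparison, here is a sketch of what the "optimal" inner select described above might look like, where InvoiceId1 always holds the leading number and InvoiceId2 is empty when there is no dash. It searches InvoiceId + '-' so that CHARINDEX always finds a dash; the table variable stands in for the Invoice table, and the values are assumed to contain at most one dash.

```sql
DECLARE @Invoice TABLE (InvoiceId varchar(20));
INSERT @Invoice VALUES ('790711'), ('790109-1'), ('790109-11'), ('790109-2');

SELECT InvoiceId
FROM
(
    SELECT
        InvoiceId,
        -- Everything before the dash, or the whole value when there is no dash.
        LEFT(InvoiceId, CHARINDEX('-', InvoiceId + '-') - 1) AS InvoiceId1,
        -- Everything after the dash, or '' when there is no dash.
        SUBSTRING(InvoiceId, CHARINDEX('-', InvoiceId + '-') + 1, LEN(InvoiceId)) AS InvoiceId2
    FROM @Invoice
) tmp
ORDER BY
    CAST(InvoiceId1 AS int),
    CAST(InvoiceId2 AS int);  -- an empty string casts to 0 in SQL Server
```

With the parts split this way, the ORDER BY no longer needs the length tests.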

Demo at SQL Fiddle



Answer 8:

Break the sort into two sections:

SQL Fiddle

MS SQL Server 2008 Schema Setup:

CREATE TABLE TestData
(
  data varchar(20)
)

INSERT TestData
SELECT '790711' as data
UNION
    SELECT '790109-1'
UNION
    SELECT '790109-11'
UNION 
    SELECT '790109-2'

Query 1:

SELECT *
FROM TestData
ORDER BY 
    FLOOR(CAST(REPLACE(data, '-', '.') AS FLOAT)),
    CASE WHEN CHARINDEX('-', data) > 0 
        THEN CAST(RIGHT(data, len(data) - CHARINDEX('-', data)) AS INT)
        ELSE 0 
    END

Results:

|      DATA |
-------------
|  790109-1 |
|  790109-2 |
| 790109-11 |
|    790711 |


Answer 9:

Try:

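select invoiceid  ... order by Convert(decimal(18, 2), REPLACE(invoiceid, '-', '.'))

Note that this treats the suffix as a decimal fraction, so 790109-11 (790109.11) still sorts before 790109-2 (790109.20). It only gives the required order when the suffixes are zero-padded to the same width and, with decimal(18, 2), are at most two digits long.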