Is there a more efficient / elegant way to write t

2019-09-19 01:56发布

问题:

I'm wondering if anybody can help me out with any or all of this code below. I've made it work, but it seems inefficient to me and is probably quite a bit slower than optimal.

Some basic background on the necessity of this code in the first place:

I have a table of shipping records that does not include the corresponding invoice number. I've looked all through the tables and I continue to do so. In fact, only this morning I discovered that if a packing slip has been generated that I can link the shipping table to the packing slip table via that packing slip ID and grab the invoice number from there. Absent that link, however, I'm forced to guess. In most instances, that's not terribly difficult, because the invoice table has number, line and release that can match up. But when there are multiple shipments for number, line and release (for instance, when a line is partially shipped) then there can be multiple answers, only one of which is correct. I am partially helped by the presence of a a column in the shipping table that states what the date sequence is for that number, line and release, but there are still circumstances where the process I use for "guessing" can be somewhat ambiguous.

What my procedure does is this. First, it creates a table of data that includes the invoice number if there was a pack slip to link it through.

Next, it dumps all of that data into a second table, this time using--only if the invoice was NULL in the first table--a "guess" about the invoice number based on partitioning all the shipping records by number, line, release, date sequence and date, and then comparing that to the same type of thing for the invoice table, and trying to line everything up by date.

Finally, it parses through that table and finds any last nulls and essentially matches them up with the first record of any invoice for that number, line and release.

Both guesses have added characters to show that they are, in fact, guesses.

IF OBJECT_ID('tempdb..#cosTAble') IS NOT NULL
DROP TABLE #cosTable

DECLARE @cosTable2  TABLE (
     ID             INT IDENTITY
    ,co_num         CoNumType
    ,co_line        CoLineType
    ,co_release     CoReleaseType
    ,date_seq       DateSeqType
    ,ship_date      DateType
    ,inv_num        NVARCHAR(14)
    )

DECLARE
     @co_num_ck     CoNumType
    ,@co_line_ck    CoLineType
    ,@co_release_ck CoReleaseType

DECLARE @Counter1   INT = 0

SELECT cos.co_num, cos.co_line, cos.co_release, cos.date_seq, cos.ship_date, cos.qty_invoiced, pck.inv_num
    INTO #cosTable
    FROM co_ship cos
        LEFT JOIN pckitem pck
            ON cos.pack_num = pck.pack_num
            AND cos.co_num = pck.co_num
            AND cos.co_line = pck.co_line
            AND cos.co_release = pck.co_release

;WITH cos_Order
    AS(
        SELECT co_num, co_line, co_release, qty_invoiced, date_seq, ship_date, ROW_NUMBER () OVER (PARTITION BY co_num, co_line, co_release ORDER BY ship_date) AS cosrow
        FROM co_ship
        WHERE qty_invoiced > 0
    ),

invi_Order
    AS(
        SELECT inv_num, co_num, co_line, co_release, ROW_NUMBER () OVER (PARTITION BY co_num, co_line, co_release ORDER BY RecordDate) AS invirow
        FROM inv_item
        WHERE qty_invoiced > 0
    ),

cos_invi
    AS(
        SELECT cosO.*, inviO.inv_num
        FROM cos_Order cosO
            LEFT JOIN invi_Order inviO
                ON cosO.co_num = inviO.co_num AND cosO.co_line = inviO.co_line AND cosO.cosrow = inviO.invirow)

INSERT INTO @cosTable2
SELECT cosT.co_num, cosT.co_line, cosT.co_release, cosT.date_seq, cosT.ship_date, COALESCE(cosT.inv_num,'*'+cosi.inv_num) AS inv_num
    FROM #cosTable cosT
    LEFT JOIN cos_invi cosi
        ON  cosT.co_num = cosi.co_num
        AND cosT.co_line = cosi.co_line
        AND cosT.co_release = cosi.co_release
        AND cosT.date_seq = cosi.date_seq
        AND cosT.ship_date = cosi.ship_date



WHILE @Counter1 < (SELECT MAX(ID) FROM @cosTable2) BEGIN
    SET @Counter1 += 1
    SET @co_num_ck = (SELECT co_num FROM @cosTable2 WHERE ID = @Counter1)
    SET @co_line_ck = (SELECT co_line FROM @cosTable2 WHERE ID = @Counter1)
    SET @co_release_ck = (SELECT co_release FROM @cosTable2 WHERE ID = @Counter1)
    IF EXISTS (SELECT * FROM @cosTable2 WHERE ID = @Counter1 AND inv_num IS NULL)
        UPDATE @cosTable2
            SET inv_num = '^' + (SELECT TOP 1 inv_num FROM @cosTable2 WHERE 
                            @co_num_ck = co_num AND
                            @co_line_ck = co_line AND
                            @co_release_ck = co_release)
            WHERE ID = @Counter1 AND inv_num IS NULL
    END

SELECT * FROM @cosTable2
ORDER BY co_num, co_line, co_release, date_seq, ship_date

回答1:

You're in a bad spot - as @craig.white and @HLGEM suggest, you've inherited something without sufficient constraints to make the data correct or safe...and now you have to "synthesize" it. I get that guesses are the best you can do, and you can, at least make your guesses reasonable performance-wise.

After that, you should squeal loudly to get some time to fix the db - to apply the constraints needed to prevent further crapification of the data.

Performance-wise, the while loop is a disaster. You'd be better off replacing that whole mess with a single update statement...something like:

update c0
set inv_nbr = '^' + c1.inv_nbr
from
  @cosTable2 c0
  left outer join
  (
    select
      co_num,
      co_line,
      co_release,
      inv_nbr
    from
      @cosTable2
    where
      inv_nbr is not null
    group by
      co_num,
      co_line,
      co_release,
      inv_nbr        
  ) as c1
  on
    c0.co_num = c1.co_num and
    c0.co_line = c1.co_line and
    c0.co_release = c1.co_release
where
  c0.inv_num is null

...which does the same thing the loop does, only in a single statement.



回答2:

It seems to me that you are trying very hard to solve a problem that should not exist. What you describe is an unfortunately common situation where a process has grown organically without intent and specific direction as a business has grown which has made data extraction near impossible to automate. You very much need a set of policies and procedures- For (very crude and simple) example: 1: An Order must exist before a packing slip can be generated. 2: a packing slip must exist before an invoice can be generated.
3: an invoice is created using data from the packing slip and order (what was requested, what was picked, what do we bill) -Again, this is a crude example just to illustrate the idea. All of the data MUST be entered at the proper time or someone has not done their job. It is not in the IT departments typical skillset to accurately and consistently provide management good data when such data does not exist.



标签: tsql