Merge “pending data changes” into a view

2019-08-26 17:05发布

问题:

I have a table of pending changes, and I am trying to apply the changes to a view so I can see up-to-date data.

First some background: I have a website that allows simple changes to data in Project Web Access (PWA). An issue with PWA is that you can't change the data directly in its database, so I use SharePoint's CSOM to checkout, modify, check-in, publish and save changes to projects and tasks. This takes 10-20 seconds which is too slow for the user experience on the website.

I've worked around this by writing modified fields to a pending changes table, i.e. the identifier for the row, the field changed and the new value. I currently merge these changes in c# .Net (RBAR) for the website to display.

I would like to merge this data in SQL server, and preferably not row by row.

I'm testing in here

WITH
SourceDataView (id, name, age, joined) AS 
(
    SELECT 1, 'John' , 45, Cast('2018-01-15' as DateTime) UNION 
    SELECT 2, 'Paul', 33, Cast('2018-02-08' as DateTime) UNION 
    SELECT 3, 'George', 39, Cast('2018-03-29' as DateTime)
), 
PendingChanges (id, foreignId, fieldName, fieldValue) AS 
(
    SELECT 1, 2, 'name', 'Peter' UNION 
    SELECT 2, 1, 'age', '34' UNION 
    SELECT 3, 3, 'joined', '2018-02-22' UNION 
    SELECT 4, 1, 'joined', '2018-01-10' UNION 
    SELECT 5, 3, 'joined', '2017-12-30'
)

-- Just show the joined data
SELECT * FROM 
    SourceDataView s
        LEFT JOIN
    PendingChanges p
        on s.id = p.foreignId
ORDER BY p.id;

-- This is what I am expecting
WITH
DesiredResultsView (id, name, age, joined) AS (
    SELECT 1, 'John' , 34, Cast('2018-01-10' as DateTime) UNION 
    SELECT 2, 'Peter', 33, Cast('2018-02-08' as DateTime) UNION 
    SELECT 3, 'George', 39, Cast('2017-12-30' as DateTime)
) SELECT * from DesiredResultsView

A couple of caveats, if the data is changed twice the last edit wins. Also, the data types need to properly converted.

回答1:

Rather than trying to work with the data in the format of the SourceDataView Work with the format of PendingChanges:

To do this first UNPIVOT the source, then you will be able to merge the changes with a simple UNION. Then number the changes for each row and only keep the latest version of each field. Finally PIVOT the data back into the required shape and reapply the original data types.

WITH
SourceDataView (id, name, age, joined) AS 
(
    SELECT 1, 'John' , 45, Cast('2018-01-15' as DateTime) UNION 
    SELECT 2, 'Paul', 33, Cast('2018-02-08' as DateTime) UNION 
    SELECT 3, 'George', 39, Cast('2018-03-29' as DateTime)
), 
PendingChanges (id, foreignId, fieldName, fieldValue) AS 
(
    SELECT 1, 2, 'name', 'Peter' UNION 
    SELECT 2, 1, 'age', '34' UNION 
    SELECT 3, 3, 'joined', '2018-02-22' UNION 
    SELECT 4, 1, 'joined', '2018-01-10' UNION 
    SELECT 5, 3, 'joined', '2017-12-30'
),

SourceUnpivot (id, foreignId, fieldName, fieldValue) AS 
(
    SELECT 0 as id, id as foreignId, fieldName, fieldValue
    FROM 
       (SELECT id
            , cast(name as sql_variant) name
            , cast(age as sql_variant) age
            , cast(joined as sql_variant) joined
       FROM SourceDataView) p
    UNPIVOT
       (fieldValue FOR fieldName IN 
          (name, age, joined)
    ) AS _SourceUnpivot
),
MergeChanges as
(
    SELECT * FROM SourceUnpivot
    UNION SELECT * from PendingChanges
),
NumberedChanges as 
(
    SELECT *, 
        ROW_NUMBER() OVER(PARTITION BY foreignID, fieldName ORDER BY id DESC) as ChangeNumber
    from MergeChanges
),
MergePivot as 
(
    SELECT id,
        Cast(name as nvarchar(max)) as name, 
        Cast(age as int) as age, 
        Cast(joined as DateTime) as joined
    FROM (
        SELECT foreignId as id, fieldName, fieldValue
        from NumberedChanges
        where ChangeNumber = 1
    ) as LastChangeOnly
    PIVOT (
        MAX(fieldValue)
        FOR fieldName in (name, age, joined)
    )
    as _MergePivot
)

SELECT * from MergePivot ORDER BY id;

WITH
DesiredResultsView (id, name, age, joined) AS (
    SELECT 1, 'John' , 34, Cast('2018-01-10' as DateTime) UNION 
    SELECT 2, 'Peter', 33, Cast('2018-02-08' as DateTime) UNION 
    SELECT 3, 'George', 39, Cast('2017-12-30' as DateTime)
) SELECT * from DesiredResultsView


回答2:

I think you might find this solution to be a bit simpler and hopefully a bit better for performance as well.

CREATE TABLE SourceDataView
(
    ID INT
    ,[Name] VARCHAR(50)
    ,Age TINYINT
    ,Joined DateTime
)
;

CREATE TABLE PendingChanges
(
    ID INT
    ,ForeignID INT
    ,FieldName VARCHAR(10)
    ,FieldValue VARCHAR(100)
)
;

INSERT INTO SourceDataView
 VALUES
 (1, 'John' , 45, '2018-01-15')  
 ,(2, 'Paul', 33, '2018-02-08')
 ,(3, 'George', 39, '2018-03-29')
 ;

 INSERT INTO PendingChanges
 VALUES
(1, 2, 'name', 'Peter') 
,(2, 1, 'age', '34' )
,(3, 3, 'joined', '2018-02-22') 
,(4, 1, 'joined', '2018-01-10')
,(5, 3, 'joined', '2017-12-30')
;

WITH CTE AS
(
    SELECT *
    FROM 
    (
        SELECT
            ROW_NUMBER() OVER (PARTITION BY ForeignID,FieldName ORDER BY ID DESC) AS ChangeNumber
            ,*
        FROM PendingChanges
    ) AS RowNum
    PIVOT
    (
        MAX(FieldValue)
        FOR FieldName IN ([name],[age],[joined])
    ) AS P 
)
SELECT
    SDV.ID
    ,COALESCE(CA.[name],SDV.[name]) AS [name]
    ,COALESCE(CA.age,SDV.age) AS age
    ,COALESCE(CA.joined,SDV.joined) AS joined
FROM SourceDataView AS SDV
CROSS APPLY
(
    SELECT
        (SELECT [name] FROM CTE WHERE CTE.ForeignID = SDV.ID AND CTE.ChangeNumber = 1 AND CTE.[name] IS NOT NULL) AS [name]
        ,(SELECT age FROM CTE WHERE CTE.ForeignID = SDV.ID AND CTE.ChangeNumber = 1 AND CTE.age IS NOT NULL) AS age
        ,(SELECT joined FROM CTE WHERE CTE.ForeignID = SDV.ID AND CTE.ChangeNumber = 1 AND CTE.joined IS NOT NULL) AS joined
) AS CA


标签: tsql