How to include custom data migrations and static/reference data in an SSDT project?

Published 2019-04-24 01:30

Question:

We have a moderately sized SSDT project (~100 tables) that's deployed to dozens of different database instances. As part of our build process we generate a .dacpac file, and when we're ready to upgrade a database we generate a publish script and run it against that database. Database instances are upgraded at different times, so it's important that we have a structured process for these upgrades and for versioning.

Most of the generated migration script consists of dropping and (re)creating procs, functions and indexes and performing any structural changes, plus some data scripts included in a Post-Deployment script. It's these two data-related items I'd like to know how best to structure within the project:

  1. Custom data migrations needed between versions

  2. Static or reference data

Custom data migrations needed between versions

Sometimes we want to perform a one-off data migration as part of an upgrade, and I'm not sure of the best way to incorporate this into our SSDT project. For example, I recently added a new bit column, dbo.Charge.HasComments, to hold (redundant) derived data based on another table; it will be kept in sync via triggers. An annoying but necessary performance improvement (only added after careful consideration and measurement). As part of the upgrade, the SSDT-generated publish script will contain the necessary ALTER TABLE and CREATE TRIGGER statements, but I also want to populate the new column based on data in another table:

update dbo.Charge 
set HasComments = 1 
where exists ( select * 
               from dbo.ChargeComment 
               where ChargeComment.ChargeId = Charge.ChargeId ) 
and HasComments = 0 
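For completeness, the sync trigger itself is an ordinary object in the SSDT project; a rough sketch, with the trigger name and the exact ChargeComment schema assumed, might be:

-- sketch only: recompute the flag for charges touched by comment changes
-- (an UPDATE trigger may also be needed if ChargeComment.ChargeId can change)
create trigger dbo.tr_ChargeComment_SyncHasComments
on dbo.ChargeComment
after insert, delete
as
begin
    set nocount on;

    update c
    set HasComments = case when exists ( select * from dbo.ChargeComment cc
                                         where cc.ChargeId = c.ChargeId )
                           then 1 else 0 end
    from dbo.Charge c
    where c.ChargeId in ( select ChargeId from inserted
                          union
                          select ChargeId from deleted );
end

It's the one-off backfill above that needs a home in the deployment process.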

What's the best way to include this data migration script in my SSDT project?

Currently I have each of these types of migrations in a separate file that's included in the Post-Deployment script, so my Post-Deployment script ends up looking like this:

-- data migrations
:r "data migration\Update dbo.Charge.HasComments if never populated.sql"
go
:r "data migration\Update some other new table or column.sql"
go

Is this the right way to do it, or is there some better way to tie in with SSDT and its version tracking, so that those scripts aren't even run when the SSDT publish is run against a database that's already at a more recent version? I could have my own table for tracking which migrations have been run, but would prefer not to roll my own if there's a standard way of doing this.
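For illustration, the roll-your-own tracking I have in mind would look something like this (the dbo.DataMigration table and the script-name key are made up for this sketch):

-- illustrative tracking table, deployed like any other table in the project
create table dbo.DataMigration
(
    ScriptName nvarchar(256) not null primary key,
    AppliedUtc datetime2 not null default sysutcdatetime()
)

-- each script in the Post-Deployment include then guards itself:
if not exists ( select * from dbo.DataMigration
                where ScriptName = N'Update dbo.Charge.HasComments if never populated' )
begin
    update dbo.Charge
    set HasComments = 1
    where exists ( select * from dbo.ChargeComment
                   where ChargeComment.ChargeId = Charge.ChargeId )
      and HasComments = 0;

    insert into dbo.DataMigration ( ScriptName )
    values ( N'Update dbo.Charge.HasComments if never populated' );
end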

Static or reference data

Some of the database tables contain what we call static or reference data, e.g. a list of possible timezones, setting types, currencies, various 'type' tables, etc. Currently we populate these with a separate script per table, run as part of the Post-Deployment script. Each static data script inserts all the 'correct' static data into a table variable and then inserts/updates/deletes rows in the live table as needed. Depending on the table, it might be appropriate only to insert, or to insert and delete but not to update existing records. So each script looks something like this:

-- table listing all the correct static data
declare @working_data table (...)

-- add all the static data that should exist into the working table
insert into @working_data (...) select null, null, null where 1=0
union all select 'row1 col1 value', 'col2 value', etc...
union all select 'row2 col1 value', 'col2 value', etc...
...

-- insert any missing records in the live table
insert into staticDataTableX (...)
select * from @working_data
where not exists ( select * from staticDataTableX
                   where [... primary key join on @working_data...] )

-- update any columns that should be updated
update staticDataTableX
set ...
from staticDataTableX
inner join @working_data on [... primary key join on @working_data...]

-- delete any records, if appropriate with this sort of static data
delete from staticDataTableX
where not exists ( select * from @working_data
                   where [... primary key join on staticDataTableX...] )

and then my Post-Deployment script has a section like this:

-- static data. each script adds any missing static/reference data:
:r "static_data\settings.sql"
go
:r "static_data\other_static_data.sql"
go
:r "static_data\more_static_data.sql"
go

Is there a better or more conventional way to structure such static data scripts as part of an SSDT project?

Answer 1:

To track whether or not the field has already been initialized, try adding an Extended Property when the initialization is performed; checking for the property later tells you whether the initialization is still needed.

To add the extended property:

EXEC sys.sp_addextendedproperty 
    @name = N'EP_Charge_HasComments', 
    @value = N'Initialized', 
    @level0type = N'SCHEMA', @level0name = N'dbo', 
    @level1type = N'TABLE',  @level1name = N'Charge',
    @level2type = N'COLUMN', @level2name = N'HasComments';

To check for the extended property:

SELECT objtype, objname, name, value
FROM sys.fn_listextendedproperty (N'EP_Charge_HasComments', 
    N'SCHEMA', N'dbo', 
    N'TABLE', N'Charge', 
    N'COLUMN', N'HasComments');
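Putting the two together, the one-off migration script in the post-deployment include can guard itself on the property. A sketch combining the check above with the update from the question:

IF NOT EXISTS ( SELECT * FROM sys.fn_listextendedproperty (N'EP_Charge_HasComments', 
                    N'SCHEMA', N'dbo', N'TABLE', N'Charge', N'COLUMN', N'HasComments') )
BEGIN
    -- one-off initialization of the derived column
    UPDATE dbo.Charge
    SET HasComments = 1
    WHERE EXISTS ( SELECT * FROM dbo.ChargeComment
                   WHERE ChargeComment.ChargeId = Charge.ChargeId )
      AND HasComments = 0;

    -- mark the initialization as done so re-publishing skips it
    EXEC sys.sp_addextendedproperty 
        @name = N'EP_Charge_HasComments', @value = N'Initialized', 
        @level0type = N'SCHEMA', @level0name = N'dbo', 
        @level1type = N'TABLE',  @level1name = N'Charge',
        @level2type = N'COLUMN', @level2name = N'HasComments';
END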

For reference data, try using MERGE. It's MUCH cleaner than the triple set of queries you're using.

MERGE INTO staticDataTableX AS Target
USING (
    VALUES
        ('row1_UniqueID', 'row1_col1_value', 'col2_value'),
        ('row2_UniqueID', 'row2_col1_value', 'col2_value'),
        ('row3_UniqueID', 'row3_col1_value', 'col2_value'),
        ('row4_UniqueID', 'row4_col1_value', 'col2_value')
    ) AS Source (TableXID, col1, col2)
        ON Target.TableXID = Source.TableXID
WHEN MATCHED THEN
    UPDATE SET
        Target.col1 = Source.col1,
        Target.col2 = Source.col2
WHEN NOT MATCHED BY TARGET THEN
    INSERT (TableXID, col1, col2)
    VALUES (Source.TableXID, Source.col1, Source.col2)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;
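For tables where it isn't appropriate to update or delete existing rows (as the question mentions), simply omit the WHEN MATCHED or WHEN NOT MATCHED BY SOURCE clause accordingly.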