For a long time, we've wanted to create a case management system where no history is ever lost. When a change is made, we want to record that change, but have the ability to go back to any point in time and see what the record looked like. I wanted to pose this question to the Stack Overflow community to see what are some ways of doing this, is there technology already in place to achieve this?
问题:
回答1:
Yes, that technology definitely exists - it's a bit of an effort to implement it and do so correctly.
What you're looking for is called temporal databases - see some resources:
- Temporal Database on Wikipedia
- Database Design: A Point in Time Architecture
- Temporal Database as a Ph.D. thesis
- StackOverflow post: why do we need temporal databases?
回答2:
I'm not sure how a temporal database like marc_s mentioned works, but if you're using SQL Server 2008 or later, you can take advantage of its built-in Change Data Capture (CDC) functionality:
- Change Data Capture (MSDN)
- Basics of Change Data Capture (MSDN)
- Using Change Data Capture
- Sql Server 2008: Change Data Capture (CDC)
Enabling CDC uses the replication transaction log to store the inserts, updates, and deletes for a table and creates table-valued functions which allow you to retrieve the rows as of a given date/time, or to retrieve just the changes.
You can't rely on CDC alone, though, because your transaction log will become unmanageably big and slow. So what you do is:
- enable CDC,
- create a history table using the same schema as the original table, but adding a couple more columns for storing row version information (much like a slowly-changing dimension in a relational OLAP database), and
- create a job which will periodically polls the CDC functions for changes since its last load and pushes them to the history table
Then you can then use the history table in your queries, joining to it as you normally would, but with an additional predicate(s) to get the record "as-of" whatever date you want.