What is the best way to know when an Elastic Search index needs updating, assuming the upstream data source is transactional SQL Server tables with inserts, updates and deletes?
Examples: Tables Parent, Child, Grandchild.
Parent | Child | Grandchild
ID Name | ID ParentID Name | ID ChildID Amount
1 Foo | 10 1 Bike | 100 10 5
2 Bar | 20 1 Car | 200 20 2
3 Baz | 30 3 Tran | 300 30 1
Grandchild is updated, and Elastic Search index on Parent needs to be updated for the associated record.
So, on Grandchild updated, I need to find the Parent.ID for that Grandchild. Which means joining to Child and getting the ParentID value.
At the same time, we are starting a incremental, iterative loaded data warehouse initiative, so ideally I would like to use the same SQL Server API/technique for both.
Based on comments in How to notify a windows service(c#) of a DB Table Change(sql 2005)? by Remus Rusanu, Query Notification API should not be used, as its only intended use is cache invalidation, not change tracking...
Which seems to leave two options - SQL Server Change Data Capture, and SQL Server Change Tracking API.
We thought about doing all change tracking in the application level, but our main concerns are out-of-band updates due to the fact some data needs to be updated over night in unforeseen ways due to new government regulations, and so we really need a way to capture changes at the table-level and bubble it up into a queue to feed Elastic Search.
Thank you!