SOLR delta-import timestamp issue

Posted 2019-04-14 04:11

Question:

I'm new to SOLR and was doing some research on this technology. I now have a question regarding the delta-import function so I looked on SO and found this: Solr DataImportHandler delta import. In the answer there is a field [date_update] mentioned which seems to be a timestamp of the record.

My question is: Is [date_update] a timestamp stored in the table on record creation? If so, couldn't this cause issues if the clock on the database server is not exactly in sync with the clock on the server where Solr is installed? This could possibly leave out some records if the Solr server time is ahead of the SQL Server time.

Answer 1:

This solution might leave some records behind (if the servers are not configured properly).

I'm using a similar solution, with some modifications. Items in the DB have a timestamp field that is updated whenever the item changes in any way.

Before updating the index, I get the latest timestamp from Solr (this field is stored), then I pass that timestamp to Solr in the import request (/?command=full-import&clean=false&timestamp=...).

Using query attribute for both full and delta import
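The two steps described above can be sketched in Python. This is only an illustration under assumptions, not the answerer's actual code: the core URL, the `/dataimport` handler path, and the field name `timestamp` are hypothetical placeholders, and the functions only build URLs and parse a response that has already been fetched. On the DataImportHandler side, a request parameter such as `timestamp` is normally referenced in the entity query as `${dataimporter.request.timestamp}`.

```python
from urllib.parse import urlencode

# Hypothetical values: adjust to your own core, handler path, and field name.
SOLR_CORE_URL = "http://localhost:8983/solr/mycore"
HANDLER_PATH = "/dataimport"
TS_FIELD = "timestamp"


def last_timestamp_query_url(base_url=SOLR_CORE_URL, ts_field=TS_FIELD):
    """Build a select URL that returns only the newest stored timestamp."""
    params = {
        "q": "*:*",
        "sort": f"{ts_field} desc",  # newest document first
        "rows": 1,                   # only the top document is needed
        "fl": ts_field,              # return just the timestamp field
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


def extract_last_timestamp(response, ts_field=TS_FIELD):
    """Pull the timestamp out of a parsed JSON select response.

    Returns None when the index is empty, in which case a plain
    full import (without the timestamp filter) is the sensible fallback.
    """
    docs = response["response"]["docs"]
    return docs[0][ts_field] if docs else None


def import_url(timestamp, base_url=SOLR_CORE_URL, handler_path=HANDLER_PATH):
    """Build the import request: full-import, clean=false, plus the timestamp."""
    params = {
        "command": "full-import",
        "clean": "false",        # keep existing documents in the index
        "timestamp": timestamp,  # value taken from Solr, not from any server clock
    }
    return f"{base_url}{handler_path}?{urlencode(params)}"
```

Because the cutoff comes from data already in the index rather than from either machine's clock, the comparison in the SQL query is always between two values that originated in the database.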

With this approach, the time on the Solr machine has nothing to do with the time on the DB machine. However, in my case, after indexing completes I run a quick verification against the DB (checking whether anything is missing for some reason, or whether something has to be deleted).

You can also use that kind of verification when you use dataimporter.last_index_time.
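The verification step could look something like the following minimal sketch. It assumes you can obtain the full list of record IDs from both the database and the Solr index (how you fetch them is left out), and the function name `reconcile` is made up for illustration.

```python
def reconcile(db_ids, solr_ids):
    """Compare record IDs from the database with IDs found in the index.

    Returns a pair (missing, stale):
      missing: IDs present in the DB but absent from Solr (need indexing)
      stale:   IDs present in Solr but absent from the DB (need deleting)
    """
    db_set, solr_set = set(db_ids), set(solr_ids)
    missing = sorted(db_set - solr_set)
    stale = sorted(solr_set - db_set)
    return missing, stale


# Example: records 1 and 4 were never indexed, record 5 was deleted from the DB.
missing, stale = reconcile([1, 2, 3, 4], [2, 3, 5])
```

For large tables, comparing full ID lists may be too expensive; comparing counts first and only diffing IDs when the counts disagree is one possible shortcut.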



Answer 2:

You could use FlexCDC, which monitors the MySQL binary log for table data changes:

http://www.mysqlperformanceblog.com/2011/03/25/using-flexviews-part-two-change-data-capture/