How to maintain Sitecore Lucene indexes in huge content delivery environments

Posted 2020-02-13 08:14

Question:

I know that Lucene indexes cannot be shared:

Sitecore keeps a local copy of the Lucene index in the file system of each instance and does not support sharing the indexes between instances.

Is it possible to xcopy Lucene indexes between CM and CD?

Is there some other approach or recommendation for maintaining indexes on 30+ content delivery servers?

Update: I'm fully aware that each CD must kick off its own index update. With over 30 CD servers, I expect there will be periods when not all CD servers have the same set of indexes. I'm afraid that for some reason the indexes will fail on some of the CD servers, and tracking why/where will be hell. That's why I'm trying to discover whether there is an alternative approach where the indexes are maintained in one place (some sort of shared index) and essentially instantly replicated to all CDs.

Answer 1:

You need to enable the History Engine for the web database on both the CM and CD servers.

See this extract from the Sitecore Scaling Guide.

To enable History Engine for a Sitecore database: In the web.config file, add the following section to the /configuration/sitecore/databases/database element, where id equals the name of the database:

<Engines.HistoryEngine.Storage>
  <obj type="Sitecore.Data.$(database).$(database)HistoryStorage, Sitecore.Kernel">
    <param connectionStringName="$(id)" />
    <EntryLifeTime>30.00:00:00</EntryLifeTime>
  </obj>
</Engines.HistoryEngine.Storage>
<Engines.HistoryEngine.SaveDotNetCallStack>false</Engines.HistoryEngine.SaveDotNetCallStack>
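
If you prefer not to edit web.config directly, the same change can be delivered as an App_Config/Include patch file, which is easier to roll out to 30+ servers. A minimal sketch, assuming the target database is named web:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <databases>
      <database id="web">
        <!-- Adds History Engine storage to the web database so that remote
             instances can pick up index updates from the history table. -->
        <Engines.HistoryEngine.Storage>
          <obj type="Sitecore.Data.$(database).$(database)HistoryStorage, Sitecore.Kernel">
            <param connectionStringName="$(id)" />
            <!-- Keep history entries for 30 days. -->
            <EntryLifeTime>30.00:00:00</EntryLifeTime>
          </obj>
        </Engines.HistoryEngine.Storage>
      </database>
    </databases>
  </sitecore>
</configuration>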

When a Sitecore item is changed, the Lucene indexes are updated immediately on the Sitecore instance where the change was made. On remote servers in a multi-server environment, the Lucene indexes are not updated immediately after an item is changed. Instead, they are updated automatically after the interval defined by the Indexing.UpdateInterval setting in the web.config file, with the minimum wait time between two consecutive updates defined by the Indexing.UpdateJobThrottle setting.




Answer 2:

You can also consider using the open source Sitecore Lucene Refresher, which runs the index crawl operation in memory and commits the index back to the file system, so you don't lose any index content during the rebuild process. This can at least help. Then you could set up some sort of agent to run this crawl/rebuild operation at a specific time of day, so that all CD servers rebuild at the same time and stay in sync.
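
As a rough illustration, such an agent could be registered under Sitecore's scheduling section. The agent type below is hypothetical; substitute whatever class wraps the refresher's crawl/rebuild call:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <scheduling>
      <!-- Hypothetical agent type; replace with your own wrapper around the
           Lucene Refresher crawl/rebuild. Runs once a day on each server. -->
      <agent type="MySite.Tasks.LuceneRefreshAgent, MySite" method="Run" interval="1.00:00:00" />
    </scheduling>
  </sitecore>
</configuration>

Note that Sitecore agents fire on an interval relative to application start rather than at a fixed wall-clock time, so a Windows scheduled task or a database-driven schedule may be a better fit if the rebuilds really must run at the same time of day on every server.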



Answer 3:

Wesley Lomax's answer is correct. However, I want to point out that I was in the same situation, with items in my Data Folder numbering in the thousands. I updated my web.config settings like this:

  <!--  INDEX FOLDER
        The path to the folder where the Lucene.Net search indexes are stored.
        Default value: $(dataFolder)/indexes
  -->
  <setting name="IndexFolder" value="$(dataFolder)/indexes" />
  <!--  INDEX UPDATE INTERVAL
        Gets the interval between the IndexingManager checking its queue for pending actions.
        Default value: "00:01:00" (1 minute)
  -->
  <setting name="Indexing.UpdateInterval" value="00:00:30" />
  <!--  INDEX UPDATE JOB THROTTLE
        Gets the minimum time to wait between individual index update jobs.
        Default value: "00:00:01" (1 second)
  -->
  <setting name="Indexing.UpdateJobThrottle" value="00:00:01" />


Answer 4:

It should probably be pointed out that Sitecore now recommends that you use Solr in this scenario rather than trying to synchronise multiple Lucene indexes:

The general reasons for using Solr instead of Lucene are...

If you use multiple content delivery servers (or plan to do so later), use Solr. Solr works automatically in such an environment. You could use Lucene, but you have to make sure that indexes are synchronized across servers yourself.

You should therefore use Solr if you plan to scale your site (have a distributed setup with multiple servers).

From Using Solr or Lucene
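
For reference, switching Sitecore's content search to Solr largely comes down to enabling the Sitecore.ContentSearch.Solr.* include files and pointing them at your Solr instance. A minimal sketch, assuming a default local Solr install (the URL is an assumption; adjust it for your environment):

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Base address of the Solr service shared by all Sitecore indexes;
           the localhost URL is an assumption for a default local install. -->
      <setting name="ContentSearch.Solr.ServiceBaseAddress" value="http://localhost:8983/solr" />
    </settings>
  </sitecore>
</configuration>

Because every CM and CD instance queries the same Solr service, the index is maintained in one place and the 30-server synchronisation problem disappears.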