Components not indexed in sitecore lucene search i

Posted 2019-05-18 18:49

Question:

I have configured a Lucene search index in the configuration and tested it with the lukeall tool. It finds all fields of the defined templates, but the content on the pages comes from another, external component, and that content is not searchable; only the data in the page's own fields is. Is there any way to do something like an HTML search, so that all the data on a page is indexed?

Thanks guys.

Answer 1:

It's a common requirement.

This screencast (at about 38 minutes in) outlines an approach where the crawler loops through each of the page's components.

http://www.techphoria414.com/Blog/2012/May/Sitecore_Page_Editor_Unleashed

The above example uses the old Advanced Database Crawler, but the principle is sound.

Another common approach is to create a computed field in your index that makes the application request the page, so its HTML can be scraped.

https://github.com/hermanussen/sitecore-html-crawler

My preference is the second option because it's more accurate.
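To make the computed-field idea concrete, here is a minimal sketch of the "request the page, then scrape its HTML" step. It is written in Python with only the standard library, purely for illustration; it is not Sitecore's C# API and not the code from the repository linked above. All names (`VisibleTextExtractor`, `html_to_index_text`, `fetch_html`, `rendered_page_text`) are hypothetical.

```python
from html.parser import HTMLParser
import urllib.request


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, ignoring the contents of <script> and <style>."""

    IGNORED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._ignored_depth = 0  # > 0 while inside an ignored element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.IGNORED:
            self._ignored_depth += 1

    def handle_endtag(self, tag):
        if tag in self.IGNORED and self._ignored_depth > 0:
            self._ignored_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside ignored elements.
        if self._ignored_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def html_to_index_text(html):
    """Reduce rendered HTML to plain text suitable for one index field."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def fetch_html(url, timeout=10):
    """Request the rendered page (assumes the site is reachable from the indexer)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def rendered_page_text(url, fetch=fetch_html):
    """Hypothetical computed-field value: the text of the fully rendered page."""
    return html_to_index_text(fetch(url))
```

In a real Sitecore setup the equivalent logic would live in a computed index field, and the returned string would be stored in a single full-text field next to the item's own template fields. The `fetch` parameter is only there so the scraping step can be exercised without a live site.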



Answer 2:

Or, if you want your crawled content completely separated, you could go for https://github.com/efocus-nl/sitecorewebsearch

It also offers you some extra options, like skipping parts of the page (e.g. the menu, footer, header).
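As a rough illustration of what skipping parts of the page means for the indexed text, the sketch below drops everything inside a configurable set of elements before collecting text. It is plain Python, not the configuration or code of the project linked above, and it assumes the menu, header and footer are marked up as `<nav>`, `<header>` and `<footer>`; the names `RegionSkippingExtractor` and `extract_main_text` are hypothetical.

```python
from html.parser import HTMLParser


class RegionSkippingExtractor(HTMLParser):
    """Collects page text while ignoring whole regions such as menu and footer."""

    def __init__(self, skip_tags=("header", "footer", "nav")):
        super().__init__()
        self._skip_tags = set(skip_tags)
        self._skip_depth = 0  # > 0 while inside any skipped region
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self._skip_tags:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._skip_tags and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Text is kept only when no skipped region is currently open.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_main_text(html, skip_tags=("header", "footer", "nav")):
    """Return the page text with the skipped regions left out."""
    parser = RegionSkippingExtractor(skip_tags)
    parser.feed(html)
    parser.close()
    return parser.text()
```

Leaving out navigation and footer text keeps every page from matching the same menu labels, which is usually why such an option is wanted.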