Solr DataImportHandler not indexing all records

Posted 2019-07-07 05:24

Question:

When I run a full-import, only 1 document is indexed. In the logs I can see it processing most of the records (~300), and I don't see any errors. Why doesn't the import index all of the results from the query?

Here is my data-config.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
    <dataSource type="JdbcDataSource"
        driver="oracle.jdbc.driver.OracleDriver"
        url="URL"
        user="USER"
        password="PASSWORD"
        name="ds1" />
    <dataSource type="JdbcDataSource"
        driver="oracle.jdbc.driver.OracleDriver"
        url="URL"
        user="USER"
        password="PASSWORD"
        name="ds2" />
    <document name="content">
        <entity name="schema" dataSource="ds2" query="select VALUE from app_system_parameters where key = 'atg.current.catalog.schema' and expiration_date is null"> 
            <entity name="apps" dataSource="ds1" query="select CS_APPS_ID, package_name, market_url, price, min_os, supported_form_factor from ${schema.VALUE}.cs_apps">
                <entity name="nonSupportedProducts" dataSource="ds1" query="select product_id from cs_product_not_supported where cs_apps_id = '${apps.CS_APPS_ID}'"/>
                <entity name="rating" dataSource="ds1" query="select avg_overall_rating from cs_rating_summary where product_id = '${apps.CS_APPS_ID}'"/>
                <entity name="product" dataSource="ds1" query="select PARENT_CAT_ID, display_name, description, long_description from ${schema.VALUE}.dcs_product where product_id = '${apps.CS_APPS_ID}'">
                    <entity name="category" dataSource="ds1" query="select display_name as category_name from ${schema.VALUE}.dcs_category where category_id = '${product.PARENT_CAT_ID}'"/>
                </entity>
            </entity>
        </entity>
    </document>
</dataConfig>

Schema snippet:

<field name="VALUE" type="string" indexed="true" stored="true"/>
<field name="CS_APPS_ID" type="string" indexed="true" stored="true" required="true"/>
<field name="package_name" type="text" indexed="true" stored="true"/> 
<field name="display_name" type="text" indexed="true" stored="true"/>
<field name="market_url" type="text" indexed="true" stored="true"/>
<field name="category_name" type="text" indexed="true" stored="true"/>
<field name="avg_overall_rating" type="tdouble" indexed="true" stored="true"/>
<field name="description" type="text" indexed="true" stored="true"/>
<field name="long_description" type="text" indexed="true" stored="true"/>
<field name="price" type="text" indexed="true" stored="true"/>
<field name="min_os" type="text" indexed="true" stored="true"/>
<field name="supported_form_factor" type="text" indexed="true" stored="true"/>
<field name="product_id" type="text" indexed="true" stored="true"/>

<uniqueKey>CS_APPS_ID</uniqueKey>

<defaultSearchField>display_name</defaultSearchField>

Here is the result from the full-import:

<response>

<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
</lst>

<lst name="initArgs">

<lst name="defaults">
<str name="config">C:\solr/conf/data-config.xml</str>
</lst>
</lst>
<str name="command">status</str>
<str name="status">idle</str>
<str name="importResponse"/>

<lst name="statusMessages">
<str name="Total Requests made to DataSource">2634</str>
<str name="Total Rows Fetched">1335</str>
<str name="Total Documents Skipped">0</str>
<str name="Full Dump Started">2011-08-02 19:35:21</str>

<str name="">
Indexing completed. Added/Updated: 1 documents. Deleted 0 documents.
</str>
<str name="Committed">2011-08-02 19:42:36</str>
<str name="Optimized">2011-08-02 19:42:36</str>
<str name="Total Documents Processed">1</str>
<str name="Time taken ">0:7:14.131</str>
</lst>

<str name="WARNING">
This response format is experimental.  It is likely to change in the future.
</str>
</response>

Here is the end of the logs, after all of the query output:

Aug 2, 2011 7:42:36 PM org.apache.solr.handler.dataimport.DocBuilder finish
INFO: Import completed successfully
Aug 2, 2011 7:42:36 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true,expungeDeletes=false)
Aug 2, 2011 7:42:36 PM org.apache.solr.update.SolrIndexWriter close
FINE: Closing Writer DirectUpdateHandler2
Aug 2, 2011 7:42:36 PM org.apache.solr.core.SolrDeletionPolicy onCommit
INFO: SolrDeletionPolicy.onCommit: commits:num=2
        commit{dir=C:\solr\data\index,segFN=segments_2,version=1312332478694,gen
eration=2,filenames=[_0.tis, _0.nrm, _0.fnm, _0.tii, _0.frq, segments_2, _0.fdx,
 _0.prx, _0.fdt]
        commit{dir=C:\solr\data\index,segFN=segments_3,version=1312332478697,gen
eration=3,filenames=[_1.prx, _1.fdx, _1.tis, _1.frq, _1.fdt, _1.tii, _1.fnm, _1.
nrm, segments_3]
Aug 2, 2011 7:42:36 PM org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: newest commit = 1312332478697
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher <init>
INFO: Opening Searcher@48164feb main
Aug 2, 2011 7:42:36 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@48164feb main from Searcher@1f9fd541 main
        fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,siz
e=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00
,cumulative_inserts=0,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for Searcher@48164feb main
        fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,siz
e=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00
,cumulative_inserts=0,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@48164feb main from Searcher@1f9fd541 main
        filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,
warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cum
ulative_inserts=0,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for Searcher@48164feb main
        filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,
warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cum
ulative_inserts=0,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@48164feb main from Searcher@1f9fd541 main
        queryResultCache{lookups=1,hits=0,hitratio=0.00,inserts=1,evictions=0,si
ze=1,warmupTime=0,cumulative_lookups=1,cumulative_hits=0,cumulative_hitratio=0.0
0,cumulative_inserts=1,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for Searcher@48164feb main
        queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,si
ze=0,warmupTime=0,cumulative_lookups=1,cumulative_hits=0,cumulative_hitratio=0.0
0,cumulative_inserts=1,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@48164feb main from Searcher@1f9fd541 main
        documentCache{lookups=2,hits=1,hitratio=0.50,inserts=1,evictions=0,size=
1,warmupTime=0,cumulative_lookups=2,cumulative_hits=1,cumulative_hitratio=0.50,c
umulative_inserts=1,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for Searcher@48164feb main
        documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=
0,warmupTime=0,cumulative_lookups=2,cumulative_hits=1,cumulative_hitratio=0.50,c
umulative_inserts=1,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener sending requests to Searcher@48164feb main
Aug 2, 2011 7:42:36 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener done.
Aug 2, 2011 7:42:36 PM org.apache.solr.core.SolrCore registerSearcher
INFO: [] Registered new searcher Searcher@48164feb main
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher close
INFO: Closing Searcher@1f9fd541 main
        fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,siz
e=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00
,cumulative_inserts=0,cumulative_evictions=0}
        filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,
warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cum
ulative_inserts=0,cumulative_evictions=0}
        queryResultCache{lookups=1,hits=0,hitratio=0.00,inserts=1,evictions=0,si
ze=1,warmupTime=0,cumulative_lookups=1,cumulative_hits=0,cumulative_hitratio=0.0
0,cumulative_inserts=1,cumulative_evictions=0}
        documentCache{lookups=2,hits=1,hitratio=0.50,inserts=1,evictions=0,size=
1,warmupTime=0,cumulative_lookups=2,cumulative_hits=1,cumulative_hitratio=0.50,c
umulative_inserts=1,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.handler.dataimport.SolrWriter readIndexerProperties
INFO: Read dataimport.properties
Aug 2, 2011 7:42:36 PM org.apache.solr.handler.dataimport.SolrWriter persist
INFO: Wrote last indexed time to dataimport.properties
Aug 2, 2011 7:42:36 PM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {deleteByQuery=*:*,add=[prod27350148],optimize=} 0 1
Aug 2, 2011 7:42:36 PM org.apache.solr.handler.dataimport.DocBuilder execute
INFO: Time taken = 0:7:14.131

Answer 1:

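To answer my own question: I needed to add rootEntity="false" to the outermost entity element (the one named schema). By default, DataImportHandler treats the entity directly under <document> as the root entity and creates one Solr document for each row that entity returns. Here that query returns a single row, a property that is injected into the nested entities, and it isn't tied to the results of the nested entities, so only one document was indexed. With rootEntity="false" set on it, the entity directly beneath it (apps) is treated as the root entity instead, and each of its rows becomes its own document.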
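
A minimal sketch of that change, applied to the data-config.xml from the question. Only the rootEntity attribute is new; the nested entities are abbreviated to a comment here and would stay exactly as they were in the original config.

```xml
<document name="content">
    <!-- rootEntity="false": this entity only supplies ${schema.VALUE}
         and should not produce a document of its own -->
    <entity name="schema" dataSource="ds2" rootEntity="false"
            query="select VALUE from app_system_parameters where key = 'atg.current.catalog.schema' and expiration_date is null">
        <!-- apps is now treated as the root entity: one Solr document per row -->
        <entity name="apps" dataSource="ds1"
                query="select CS_APPS_ID, package_name, market_url, price, min_os, supported_form_factor from ${schema.VALUE}.cs_apps">
            <!-- nested entities (nonSupportedProducts, rating, product, category) unchanged -->
        </entity>
    </entity>
</document>
```

After re-running the full-import, the "Total Documents Processed" value in the status response should reflect the number of rows returned by the apps query rather than 1.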



Tags: solr