I want to use multiple datasources in DataImportha

2020-06-05 01:18发布

I want to use multiple datasources in DataImporthandler in Solr and pass URL value in child entity after querying database in parent entity. Here is my rss-data-config file:

<dataConfig>
    <dataSource type="JdbcDataSource" name="ds-db" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/HCDACoreDB" 
                            user="root" password="CDA@318"/>
    <dataSource type="URLDataSource" name="ds-url"/>
    <document>
        <entity name="feeds" query="select f.feedurl, f.feedsource, c.categoryname from feeds f, category c where f.feedcategory = c.categoryid">

        <field column="feedurl" name="url" dataSource="ds-db"/>
        <field column="categoryname" name="category" dataSource="ds-db"/>

        <field column="feedsource" name="source" dataSource="ds-db"/>

        <entity name="rss"
                transformer="HTMLStripTransformer" 
                forEach="/RDF/channel | /RDF/item" 
                processor="XPathEntityProcessor" 
                url="${dataimporter.functions.encodeUrl(feeds.feedurl)}" > 

            <field column="source-link" dataSource="ds-url" xpath="/rss/channel/link" commonField="true" />
            <field column="Source-desc" dataSource="ds-url" xpath="/rss/channel/description" commonField="true" />
            <field column="title" dataSource="ds-url" xpath="/rss/channel/item/title" />
            <field column="link" dataSource="ds-url" xpath="/rss/channel/item/link" />
            <field column="description" dataSource="ds-url" xpath="/rss/channel/item/description" stripHTML="true"/>
            <field column="pubDate" dataSource="ds-url" xpath="/rss/channel/item/pubDate" />
            <field column="guid" dataSource="ds-url" xpath="/rss/channel/item/guid" />
            <field column="content" dataSource="ds-url" xpath="/rss/channel/item/content" />
            <field column="author" dataSource="ds-url" xpath="/rss/channel/item/creator" />
        </entity>

    </entity>
</document>

What I am doings is in first entity named feeds I am querying database and want to use the feedurl as the URL for the child entity names rss.

The error I get when I run the dataimport is: java.net.MalformedURLException: no protocol: nullselect f.feedurl, f.feedsource, c.categoryname from feeds f, category c where f .feedcategory = c.categoryid

the URL us NULL meaning its not assigning the feedurl to URL.

Any suggestion on what I am doing wrong?

标签: solr
1条回答
唯我独甜
2楼-- · 2020-06-05 01:54

Here's an example:

<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
    <dataSource name="db1" ... />
    <dataSource name="db2"... />
    <document>
        <entity name="outer" dataSource="db1" query=" ... ">
            <field column="id" />
            <entity name="inner" dataSource="db2" query=" select from ... where id = ${outer.id} ">
                <field column="innercolumn" splitBy=":::" />
            </entity>
        </entity>
    </document>

the idea is to have one definition of the entity nested that does the extra query to the other database.

you can access the parent entity fields like this ${outer.id}

查看更多
登录 后发表回答