I am trying to index data across multiple tables using Solr's Data Import Handler. The official wiki on the DIH suggests using embedded entities to link multiple tables like so:
<document>
<entity name="item" pk="id" query="SELECT * FROM item">
<entity name="member" pk="memberid" query="SELECT * FROM member WHERE memberid='${item.memberid}'>
</entity>
</entity>
</document>
Another way that works is:
<document>
<entity name="item" pk="id" query="SELECT * FROM item INNER JOIN member ON item.memberid=member.memberid">
</entity>
</document>
Are these two methods functionally different? Is there a performance difference? My guess would be that the first method is to support non-SQL tables, but I'm not sure.
Another though would be that, if using join tables in MySQL, using the SQL query method with multiple joins could cause multiple documents to be indexed instead of one.