I'm trying to use the Duke Fast Deduplication Engine to search for some duplicate records in the database at the company where I work.
I run it from the command line like this:
java -cp "C:\utils\duke-0.6\duke-0.6.jar;C:\utils\duke-0.6\lucene-core-3.6.1.jar" no.priv.garshol.duke.Duke --showmatches --verbose .\config.xml
But I get an error:
Exception in thread "main" java.lang.UnsupportedOperationException: Operation not yet supported
at sun.jdbc.odbc.JdbcOdbcResultSet.isClosed(Unknown Source)
at no.priv.garshol.duke.datasources.JDBCDataSource$JDBCIterator.close(JD
at no.priv.garshol.duke.Processor.deduplicate(Processor.java:152)
at no.priv.garshol.duke.Duke.main_(Duke.java:135)
at no.priv.garshol.duke.Duke.main(Duke.java:38)
The relevant part of my configuration file looks like this:
<property type="id">
<param name="driver-class" value="sun.jdbc.odbc.JdbcOdbcDriver" />
<param name="connection-string" value="jdbc:odbc:VT_DeDupe" />
<param name="user-name" value="aleer" />
<param name="password" value="**" />
<param name="query" value="select SocialSecurityNumber, LastName, FirstName, MiddleName, empssn from T_Employees" />
<column name="SocialSecurityNumber" property="ID" />
<column name="LastName" property="LNAME" />
<column name="FirstName" property="FNAME" />
<column name="MiddleName" property="MNAME" />
<column name="empssn" property="SSN" />
The message itself doesn't say which operation is unsupported. I'm only trying Duke out at this point, so there is nothing serious in the configuration yet.
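Reading the trace from the top, the call that throws appears to be ResultSet.isClosed() on the JDBC-ODBC bridge's result set, reached from Duke's JDBCIterator.close(). To check that I understand the failure, here is a minimal sketch of that situation in plain Java. It is not Duke's code and does not use the real bridge driver: legacyResultSet and closeQuietly are hypothetical names, and the proxy is only a stand-in for a driver whose isClosed() is unimplemented.

```java
import java.lang.reflect.Proxy;
import java.sql.ResultSet;
import java.sql.SQLException;

class IsClosedSketch {
    // Hypothetical stand-in for an older driver's ResultSet:
    // isClosed() throws, every other method does nothing and returns null.
    static ResultSet legacyResultSet() {
        return (ResultSet) Proxy.newProxyInstance(
            IsClosedSketch.class.getClassLoader(),
            new Class<?>[] { ResultSet.class },
            (proxy, method, args) -> {
                if (method.getName().equals("isClosed")) {
                    throw new UnsupportedOperationException("Operation not yet supported");
                }
                return null; // fine for void methods such as close()
            });
    }

    // Closes the result set even when the driver cannot report its state.
    // Returns true if isClosed() turned out to be unsupported.
    static boolean closeQuietly(ResultSet rs) throws SQLException {
        boolean unsupported = false;
        try {
            if (rs.isClosed()) {
                return false; // already closed, nothing to do
            }
        } catch (UnsupportedOperationException e) {
            unsupported = true; // driver lacks isClosed(); fall through to close()
        }
        rs.close();
        return unsupported;
    }

    public static void main(String[] args) throws SQLException {
        System.out.println(closeQuietly(legacyResultSet())); // prints: true
    }
}
```

If this reading is right, the unsupported operation is the isClosed() check rather than anything in my query or column mappings, but I would appreciate confirmation.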