I am trying to apply canopy clustering in Mahout. I already converted a text file into sequence file. But i cannot view the sequence file. Anyways I thought of applying canopy clustering by giving the following command,
hduser@ubuntu:/usr/local/mahout/trunk$ mahout canopy -i /user/Hadoop/mahout_seq/seqdata -o /user/Hadoop/clustered_data -t1 5 -t2 3
I got the following error, 16/05/10 17:02:03 INFO mapreduce.Job: Task Id : attempt_1462850486830_0008_m_000000_1, Status : FAILED Error: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.mahout.math.VectorWritable at org.apache.mahout.clustering.canopy.CanopyMapper.map(CanopyMapper.java:29) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) I am using VMware Ubuntu. I have used a simple text file which contains a paragraph