Tuesday, 10 September 2013

Mahout minhash org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text

I am using hadoop-1.2.1 and mahout-distribution-0.8.
When I try to run the MinHash clustering job with the following command:
$MAHOUT_HOME/bin/mahout org.apache.mahout.clustering.minhash.MinHashDriver -i tce-data/cv.vec -o tce-data/out/cv/minHashDriver/ -ow
I get this error:
tce@osy-Inspiron-N5110:~$ $MAHOUT_HOME/bin/mahout org.apache.mahout.clustering.minhash.MinHashDriver -i tce-data/cv.vec -o tce-data/out/cv/minHashDriver/ -ow
Warning: $HADOOP_HOME is deprecated.
Running on hadoop, using /home/tce/app/hadoop-1.2.1/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /home/tce/app/mahout-distribution-0.8/mahout-examples-0.8-job.jar
Warning: $HADOOP_HOME is deprecated.
13/09/10 18:17:46 WARN driver.MahoutDriver: No org.apache.mahout.clustering.minhash.MinHashDriver.props found on classpath, will use command-line arguments only
13/09/10 18:17:46 INFO common.AbstractJob: Command line arguments: {--endPhase=[2147483647], --hashType=[MURMUR], --input=[tce-data/cv.vec], --keyGroups=[2], --minClusterSize=[10], --minVectorSize=[5], --numHashFunctions=[10], --numReducers=[2], --output=[tce-data/out/cv/minHashDriver/], --overwrite=null, --startPhase=[0], --tempDir=[temp], --vectorDimensionToHash=[value]}
13/09/10 18:17:48 INFO input.FileInputFormat: Total input paths to process : 1
13/09/10 18:17:50 INFO mapred.JobClient: Running job: job_201309101645_0031
13/09/10 18:17:51 INFO mapred.JobClient: map 0% reduce 0%
13/09/10 18:18:27 INFO mapred.JobClient: Task Id : attempt_201309101645_0031_m_000000_0, Status : FAILED
java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text
    at org.apache.mahout.clustering.minhash.MinHashMapper.map(MinHashMapper.java:30)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
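The trace suggests that MinHashMapper is handed a LongWritable where it expects a Text, most likely as the record key, so one thing worth checking is which key class the input file declares. Below is a minimal sketch of such a check, not a confirmed fix. It assumes that cv.vec is a single SequenceFile that has been copied to the local disk (for example with hadoop fs -get), that its header uses the layout I believe Hadoop 1.x writes (the bytes "SEQ", one version byte, then the key and value class names as length-prefixed strings), and that both class names are shorter than 128 bytes. The class name SeqHeaderPeek and its methods are hypothetical and not part of Hadoop or Mahout.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: prints the key and value class names declared in a SequenceFile header. */
final class SeqHeaderPeek {

    /** Returns {keyClassName, valueClassName} read from the start of the stream. */
    static String[] readClassNames(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);

        // The header is assumed to start with the three magic bytes "SEQ".
        byte[] magic = new byte[3];
        in.readFully(magic);
        if (magic[0] != 'S' || magic[1] != 'E' || magic[2] != 'Q') {
            throw new IOException("Not a SequenceFile: missing SEQ magic bytes");
        }

        // Assumption: older header versions encode the class names differently,
        // so this sketch refuses anything below version 4.
        int version = in.readUnsignedByte();
        if (version < 4) {
            throw new IOException("Header version " + version + " is not handled by this sketch");
        }

        String keyClass = readShortString(in);
        String valueClass = readShortString(in);
        return new String[] { keyClass, valueClass };
    }

    /** Reads a string whose length prefix fits in one non-negative byte. */
    private static String readShortString(DataInputStream in) throws IOException {
        int length = in.readByte();
        if (length < 0) {
            // A negative first byte would mean a multi-byte length, which is not handled here.
            throw new IOException("Length prefix longer than one byte is not handled by this sketch");
        }
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SeqHeaderPeek <local-sequence-file>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            String[] names = readClassNames(in);
            System.out.println("key class:   " + names[0]);
            System.out.println("value class: " + names[1]);
        }
    }
}
```

If this reports org.apache.hadoop.io.LongWritable as the key class, that would match the exception above, and the vectors would probably need to be regenerated or rewritten with Text keys before MinHashDriver can read them.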
I would appreciate any ideas.
