Is it possible to use Apache mahout without any dependency to Hadoop.
I would like to use the mahout algorithm on a single computer by only including the mahout library inside my Java project but i dont want to use hadoop at all since i will be running on a single node anyway.
Is that possible?
Yes. Not all of Mahout depends on Hadoop, though much does. If you use a piece that depends on Hadoop, of course, you need Hadoop. But for example there is a substantial recommender engine code base that does not use Hadoop.
You can embed a local Hadoop cluster/worker in a Java program.
Definitely, yes. In the Mahout Recommender First-Timer FAQ they advise against starting out with a Hadoop-based implementation (unless you know you're going to be scaling past 100 million user preferences relatively quickly).
You can use the implementations of the Recommender interface in a pure-Java fashion relatively easily. Or place one in the servlet of your choice.
Technically, Mahout has a Maven dependency on Hadoop. But you can use recommenders without the Hadoop JARs easily. This is described in the first few chapters of Mahout in Action - you can download the sample source code and see how it's done - look at the file RecommenderIntro.java
.
However, if you're using Maven, you would need to exclude Hadoop manually - the dependency would look like this:
<dependency>
<groupId>org.apache.mahout</groupId>
<artifactId>mahout-core</artifactId>
<exclusions>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
</exclusion>
</exclusions>
</dependency>