I am developing an application in OpenCL whose basic objective is to implement a data mining algorithm on a GPU platform. I want to use the Hadoop Distributed File System (HDFS) and execute the application on multiple nodes. I am using the MapReduce framework and have divided my basic algorithm into two parts, 'Map' and 'Reduce'.
I have never worked with Hadoop before, so I have some questions:
- Do I have to write my application in Java to use Hadoop and the MapReduce framework?
- I have written kernel functions for map and reduce in OpenCL. Is it possible to use HDFS as the file system for a non-Java GPU computing application? (Note: I don't want to use JavaCL or Aparapi.)
HDFS is a file system, so you can use it from any language, not only Java.
HDFS data is distributed and replicated over multiple machines, so it remains highly available to a GPU computing application that processes it on several nodes.
For more information, see Hadoop Streaming.
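One language-neutral way to reach HDFS is its WebHDFS REST interface, which serves files over plain HTTP. The snippet below is only a sketch of how a read URL is put together; it assumes WebHDFS is enabled on the cluster, and the helper name `webhdfs_open_url`, the host name, and the port are illustrative placeholders rather than values from your setup.

```python
from urllib.parse import quote, urlencode


def webhdfs_open_url(host, port, hdfs_path, user=None):
    """Build a WebHDFS URL that asks the NameNode to open (read) a file.

    host, port and hdfs_path are placeholders for your own cluster;
    hdfs_path must be an absolute HDFS path such as "/data/input.csv".
    """
    params = {"op": "OPEN"}
    if user:
        # Optional: identify the caller on clusters without security enabled.
        params["user.name"] = user
    return f"http://{host}:{port}/webhdfs/v1{quote(hdfs_path)}?{urlencode(params)}"


# Hypothetical host and port, for illustration only.
url = webhdfs_open_url("namenode.example", 9870, "/data/input.csv")
```

Any HTTP client (in C, C++, or anything else) can then fetch that URL, following the redirect to the DataNode that holds the data, and hand the bytes to the OpenCL host code.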
You could use Hadoop Streaming. With it you can write mappers and reducers in any language you want, as long as your code can read from stdin and write to stdout. For inspiration, you can take a look at examples of how R is used with Hadoop Streaming.
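To make the stdin/stdout contract concrete, here is a minimal sketch of a streaming mapper and reducer in one script. It uses a simple item count as a stand-in for the data mining step; in your case the bodies of `map_lines` and `reduce_lines` (both hypothetical names) would instead pass the records to the host code that launches your OpenCL kernels. Records are written as tab-separated key/value lines, which is the default format Hadoop Streaming expects.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Emit one 'item<TAB>1' record per whitespace-separated token."""
    for line in lines:
        for item in line.split():
            yield f"{item}\t1"


def reduce_lines(lines):
    """Sum the counts for each key.

    Hadoop Streaming sorts mapper output by key before the reducer sees it,
    so all records for one key arrive next to each other.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if line.strip())
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Run as "script.py map" or "script.py reduce"; input comes from stdin.
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

The same script would be given to the Hadoop Streaming jar through its `-mapper` and `-reducer` options. A compiled C or C++ program that reads stdin and writes stdout in the same way can be plugged in identically, which is what lets an OpenCL host program take part without any Java code.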