This repository has been archived by the owner on Jul 18, 2022. It is now read-only.

Configuring Hadoop indexing

rcking edited this page Dec 21, 2011 · 3 revisions

It is assumed that a Hadoop system has already been configured. Information on managing it can be found here.

In order to run Map/Reduce jobs, the Hadoop daemons must first be started:

```shell
.../hadoop-0.20.205.0/bin/start-all.sh
```
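Once start-all.sh has run, it can be worth checking that the daemons are actually up. A minimal sketch, assuming the JDK's jps tool is on the PATH (the daemon names match a single-node Hadoop 0.20.x setup):

```shell
# List the running Java processes; on a single-node 0.20.x setup you would
# expect NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker.
# Falls back to a hint if jps is not installed.
DAEMON_LIST=$(command -v jps >/dev/null 2>&1 && jps || echo "jps not found; inspect the logs/ directory of the Hadoop installation instead")
echo "$DAEMON_LIST"
```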

In order to be able to submit a job to a Hadoop cluster, the following Hadoop configuration files must be copied into the play/conf directory:

- core-site.xml
- hdfs-site.xml
- mapred-site.xml
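A sketch of the copy step. Both paths below are assumptions: HADOOP_CONF should point at the conf directory of your Hadoop installation, and PLAY_CONF at the Play application's conf directory; adjust them to your layout.

```shell
# Assumed locations; override via the environment if your layout differs.
HADOOP_CONF=${HADOOP_CONF:-/usr/local/hadoop-0.20.205.0/conf}
PLAY_CONF=${PLAY_CONF:-play/conf}

mkdir -p "$PLAY_CONF"
for f in core-site.xml hdfs-site.xml mapred-site.xml; do
  if [ -f "$HADOOP_CONF/$f" ]; then
    cp "$HADOOP_CONF/$f" "$PLAY_CONF/"
  else
    echo "warning: $HADOOP_CONF/$f not found"
  fi
done
```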

It is also necessary to compile org.backmeup.index.Indexer into a jar file so that Hadoop index job submission works properly. To this end, an Ant build file is included at the top level of the project, along with a properties file. At this time, the only important property is the Java home directory, for example:

```properties
java.home.dir = /usr/lib/jvm/java-6-sun
```

To build the jar, execute:

```shell
ant clean
ant compile
ant jar
```
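The three targets can also be chained in one invocation, and afterwards it is worth confirming that the Indexer class actually ended up in the jar. A hedged sketch; the dist/indexbatch.jar output path is taken from the hdfs.properties example below, and the guards let the check degrade gracefully when ant or the build output is absent:

```shell
# Chain the build targets (only if ant and the build file are present).
if command -v ant >/dev/null 2>&1 && [ -f build.xml ]; then
  ant clean compile jar
fi

# Check whether the Indexer class made it into the jar.
if [ -f dist/indexbatch.jar ] && jar tf dist/indexbatch.jar | grep -q 'org/backmeup/index/Indexer'; then
  BUILD_STATUS="org.backmeup.index.Indexer packaged in dist/indexbatch.jar"
else
  BUILD_STATUS="dist/indexbatch.jar not built yet (or Indexer class missing)"
fi
echo "$BUILD_STATUS"
```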

Then add the path of the resulting indexbatch.jar file to the hdfs.properties file, like this:

```properties
index.jar.path = .../backmeup-prototype/dist/indexbatch.jar
```
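Before submitting a job, it can help to verify that index.jar.path actually points at an existing file. A small sketch: it demonstrates the check against a scratch copy of hdfs.properties, and the jar path written below is an assumed example, not the project's real path.

```shell
# Write a sample hdfs.properties into a scratch directory (assumed example path).
DEMO_DIR=$(mktemp -d)
cat > "$DEMO_DIR/hdfs.properties" <<'EOF'
index.jar.path = /tmp/backmeup-prototype/dist/indexbatch.jar
EOF

# Extract the property value and check that the jar exists.
JAR_PATH=$(sed -n 's/^index\.jar\.path *= *//p' "$DEMO_DIR/hdfs.properties")
if [ -f "$JAR_PATH" ]; then
  echo "ok: index.jar.path -> $JAR_PATH"
else
  echo "index.jar.path does not point at an existing jar: $JAR_PATH"
fi
```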