Tutorial: Running the MapReduce EEMD code with Hadoop on FutureGrid
by Rewati Ovalekar
Step 1:
• The code is available at: http://code.google.com/p/cyberaide/
• Download the code from: http://code.google.com/p/cyberaide/source/browse/#svn%2Ftrunk%2Fproject%2Fspring2011%2FEEMDAnalysis%2FEEMDJava
Step 2:
• Create a FutureGrid account.
• For further details, see the FutureGrid Tutorial: https://portal.futuregrid.org/tutorials
Step 3:
• Log in to FutureGrid:
  ssh username@india.futuregrid.org
• A welcome message is displayed after a successful login.
Step 4:
• Create a jar file.
Step 5:
• Transfer the jar file and the input file:
  sftp username@india.futuregrid.org
  put /../filepath
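Step 4 does not say how the jar file is built. The following is only a sketch under stated assumptions: the downloaded .java files sit directly in one directory (no package subdirectories), a copy of the Hadoop 0.20.2 core jar is available for compilation, and the code needs no other libraries. The function name build_jar and the example names EEMDJava, hadoop-0.20.2-core.jar and eemd.jar are placeholders, not taken from the project.

```shell
#!/bin/sh
# Hypothetical helper: compile the EEMD sources against the Hadoop core jar
# and package the compiled classes into a jar file.
#   $1 = directory holding the .java files (assumed flat, no packages)
#   $2 = path to the Hadoop core jar used for compilation
#   $3 = name of the jar file to create
build_jar() {
    src_dir=$1
    hadoop_jar=$2
    out_jar=$3
    # Directory that receives the compiled .class files.
    mkdir -p "$src_dir/classes" || return 1
    # Compile every .java file found in the source directory.
    javac -classpath "$hadoop_jar" -d "$src_dir/classes" "$src_dir"/*.java || return 1
    # Package everything under classes/ into the jar.
    jar cf "$out_jar" -C "$src_dir/classes" . || return 1
}

# Example call (commented out so that sourcing this file has no side effects):
# build_jar EEMDJava hadoop-0.20.2-core.jar eemd.jar
```

The resulting jar is the file that Step 5 transfers with sftp.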
Step 6:
• To run Hadoop on FutureGrid, create a Eucalyptus account.
• For further details, see: https://portal.futuregrid.org/tutorials/eucalyptus
Step 7:
• Once the account is approved, load the Eucalyptus tools:
  module load euca2ools
Step 8:
• Make sure that the jar file and the input file are in the same directory as the username.private key file.
• Run the image that has Hadoop installed:
  euca-run-instances -k rovaleka -t c1.xlarge emi-D778156D
  -k is the key name (rovaleka in this example; use your own)
  -t is the instance type
  emi-D778156D is the image ID
  -n (optional, not used above) is the number of instances to launch
Step 8 (continued):
• Check the status using:
  euca-describe-instances
• Keep checking until the instance state is running. Once it is, you can log in to the instance and run Hadoop.
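The repeated status check can be scripted. The sketch below assumes that euca-describe-instances prints one line per instance containing both the instance ID and its state, and treats the word "running" on that line as success. The function name wait_for_running, the DESCRIBE_CMD variable, and the default interval and retry count are illustrative choices, not part of euca2ools.

```shell
#!/bin/sh
# Command used to list instances; kept in a variable so that it can be
# swapped out (for example when testing without a cloud account).
DESCRIBE_CMD=${DESCRIBE_CMD:-euca-describe-instances}

# Poll until the given instance reports the state "running".
#   $1 = instance ID as printed by euca-run-instances
#   $2 = seconds to wait between checks (assumed default: 15)
#   $3 = maximum number of checks (assumed default: 40)
wait_for_running() {
    instance_id=$1
    interval=${2:-15}
    max_tries=${3:-40}
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        # Keep only the line for this instance, then look for the state word.
        if $DESCRIBE_CMD | grep "$instance_id" | grep -qw running; then
            echo "instance $instance_id is running"
            return 0
        fi
        tries=$((tries + 1))
        sleep "$interval"
    done
    echo "instance $instance_id not running after $max_tries checks" >&2
    return 1
}

# Example call (commented out; replace the ID with your own instance ID):
# wait_for_running i-XXXXXXXX
```

The function returns a non-zero status if the instance never reaches the running state, so it can be chained with the login commands of the next step.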
Step 9:
• Transfer the input file and the jar file to the VM:
  scp -i username.private filename root@149.165.146.171:/
  (Make sure the address is the one assigned to your instance; otherwise you will be asked for a password.)
• Log in using:
  ssh -i username.private root@149.165.146.171
  (Again, use the address assigned to you.)
Step 10:
• A login message is displayed after a successful login.
• Retrieve the transferred files and move them into the Hadoop folder (the files were copied to / in Step 9):
  cd /
  mv filename /opt/hadoop-0.20.2
  cd /opt/hadoop-0.20.2

SINGLE NODE
Step 11:
• To start Hadoop:
  cd /opt/hadoop-0.20.2
  bin/start-all.sh
• To check that everything has started:
  jps
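On a single-node Hadoop 0.20.2 setup, bin/start-all.sh normally starts five daemons: NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker. The sketch below checks the jps output for each of them; the function name check_hadoop_daemons is a hypothetical helper, and the script assumes that jps prints one "<pid> <name>" pair per line.

```shell
#!/bin/sh
# Report which of the expected Hadoop daemons are missing from the jps output.
check_hadoop_daemons() {
    jps_output=$(jps)
    missing=""
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # Compare against the whole second field, so that "NameNode"
        # is not satisfied by a "SecondaryNameNode" line.
        if ! echo "$jps_output" | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            missing="$missing $daemon"
        fi
    done
    if [ -n "$missing" ]; then
        echo "not running:$missing"
        return 1
    fi
    echo "all Hadoop daemons are running"
}

# Example call (commented out; run it on the VM after bin/start-all.sh):
# check_hadoop_daemons
```

If a daemon is reported as not running, its log file in the logs folder of the Hadoop directory is the first place to look.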
Step 12:
• Copy the input file to HDFS:
  bin/hadoop dfs -copyFromLocal inputfile name_in_HDFS
• Check that it is present on HDFS:
  bin/hadoop dfs -ls
NOTE: The input file needs to be transferred to HDFS whenever Hadoop is started.
Step 13:
• To run the code:
  bin/hadoop jar [jarFile] EEMDHadoop [inputfilename] [required_output_file]
Step 14:
• Retrieve the output:
  bin/hadoop dfs -copyToLocal [outputFileName] [outputfileNameToBeGiven]
  (the output will be available in the part-00000 file)
• To check the logs and debug the code, go to the logs/userlogs folder.
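Steps 12 to 14 can be chained so that a failed step stops the run. This is a sketch only: run_eemd and the HADOOP variable are hypothetical names, the input file is stored on HDFS under its base name, and the script is assumed to be run from /opt/hadoop-0.20.2 so that the relative path bin/hadoop resolves.

```shell
#!/bin/sh
# Path to the hadoop launcher, relative to the Hadoop folder used in the tutorial.
HADOOP=${HADOOP:-bin/hadoop}

# Copy the input to HDFS, run the EEMD job, and fetch the results.
#   $1 = jar file             $2 = local input file
#   $3 = HDFS output name     $4 = local name for the retrieved output
run_eemd() {
    jar_file=$1
    input_file=$2
    hdfs_output=$3
    local_output=$4
    # Name of the input on HDFS: the local file name without its directory.
    hdfs_input=$(basename "$input_file")
    # Step 12: copy the input file to HDFS.
    "$HADOOP" dfs -copyFromLocal "$input_file" "$hdfs_input" || return 1
    # Step 13: run the job; EEMDHadoop is the class name given in the tutorial.
    "$HADOOP" jar "$jar_file" EEMDHadoop "$hdfs_input" "$hdfs_output" || return 1
    # Step 14: copy the job output back to the local file system.
    "$HADOOP" dfs -copyToLocal "$hdfs_output" "$local_output" || return 1
    echo "results copied to $local_output/part-00000"
}

# Example call (commented out; all four names are placeholders):
# run_eemd eemd.jar input.txt eemd_out results
```

Because each command is followed by `|| return 1`, a failed copy to HDFS prevents the job from being started on missing input.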
Step 15:
• Stop Hadoop and log out of the VM:
  bin/stop-all.sh
  exit