Lab 2: Palabras
Palabras Architecture
[Diagram: Master1 … MasterN; Slave1 … SlaveN; Directory; Jobs1 … JobsM; Slave1 … SlaveM]
Step 1: Get Started
• Login:
  • Username: nombre\cc5212
  • Password on board
• http://aidanhogan.com/teaching/cc5212-1/mdp-lab2.zip
• C:/Program Files (x86)/eclipse/ (in Spanish)
• File > Import > …
• http://aidanhogan.com/teaching/cc5212-1/mdp-lab2-data/
Step 2: Run Locally
• ~600,000 abstracts
• ~52,340,000 non-unique words
• ~320 MB uncompressed
• org.mdp.cli.RunWordCountLocally
• Right Click > Run As > Run Configurations > Arguments
• Program arguments: -i <path>/abstracts-es.txt.gz -igz -k 500
• VM arguments: -Xmx256M
• How long will it take? Will it even run?
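
To make the local run concrete, here is a minimal sketch of an in-memory word count that keeps the k most frequent words. It is not the code of org.mdp.cli.RunWordCountLocally, which the slides do not show; the class name LocalWordCountSketch, the method topK, whitespace tokenisation, and lower-casing are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration only: not the lab's RunWordCountLocally class.
public class LocalWordCountSketch {

    // Counts whitespace-separated, lower-cased tokens and returns the k most
    // frequent ones, ordered by count (descending) and then alphabetically.
    public static List<Map.Entry<String, Integer>> topK(Iterable<String> lines, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word.toLowerCase(), 1, Integer::sum);
                }
            }
        }
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Integer>comparingByKey()))
                .limit(k)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("la casa la", "casa la mesa");
        for (Map.Entry<String, Integer> e : topK(lines, 2)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

Every distinct word stays in the map until the end, so memory grows with the vocabulary rather than with the input size. That is the pressure the -Xmx256M limit is meant to expose.
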
Step 3: Start the Directory
• I start the directory!
• vm116.dcc.uchile.cl (172.17.69.190)
• Port 1985
• Remind me to set heap-space
Step 4: Prepare Slave
• org.mdp.cli.StartWordCountSlave
• Implement openDirectoryStub()
• Add the slave's name to the directory
• Review the other code
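
The sketch below shows one way a directory stub could be opened and a slave name registered. It assumes the lab uses Java RMI, which the terms "stub" and "registry" suggest but the slides do not state. The interface SlaveDirectory, its methods register and listSlaves, the bind name "directory", and the in-memory implementation are all hypothetical; the lab's real interface and bind name are in the provided code.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: names here are not taken from the lab code.
public class DirectoryStubSketch {

    // Hypothetical remote interface for a directory of slave names.
    public interface SlaveDirectory extends Remote {
        void register(String slaveName) throws RemoteException;
        List<String> listSlaves() throws RemoteException;
    }

    // Trivial in-memory directory so the sketch can run on one machine.
    static class InMemoryDirectory implements SlaveDirectory {
        private final List<String> names = new ArrayList<>();
        public synchronized void register(String slaveName) { names.add(slaveName); }
        public synchronized List<String> listSlaves() { return new ArrayList<>(names); }
    }

    // Looks up the directory's stub in the RMI registry at host:port.
    public static SlaveDirectory openDirectoryStub(String host, int port, String bindName)
            throws Exception {
        Registry registry = LocateRegistry.getRegistry(host, port);
        return (SlaveDirectory) registry.lookup(bindName);
    }

    // Starts a local registry, binds a directory, registers one slave through
    // the stub, and returns the names the directory reports.
    public static List<String> demo(int port) throws Exception {
        // Make exported stubs point at the loopback address for this local demo.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        Registry registry = LocateRegistry.createRegistry(port);
        InMemoryDirectory impl = new InMemoryDirectory();
        SlaveDirectory exported = (SlaveDirectory) UnicastRemoteObject.exportObject(impl, 0);
        registry.rebind("directory", exported);
        try {
            SlaveDirectory stub = openDirectoryStub("127.0.0.1", port, "directory");
            stub.register("slave-a");
            return stub.listSlaves();
        } finally {
            // Unexport so the JVM can exit once the demo is done.
            UnicastRemoteObject.unexportObject(impl, true);
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo(1985));
    }
}
```

In the lab setting the host and port would be the directory machine and port given in Step 3 rather than a locally created registry.
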
Step 5: Run Slave
• Build the .jar using build.xml (dist)
• Open cmd and go to the jar's directory
• java -jar -Xmx256M mdp-2.jar StartWordCountSlave -dn vm116.dcc.uchile.cl -dp 1985 -sn <username>
Step 6: Prepare Master
• org.mdp.cli.StartWordCountMaster
• Connect to the directory
• Get the list of slaves from the directory
• Clear words from the slave for you
• Choose a slave for each word
• Send the add-words job to each slave
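
The slides do not say how the master chooses a slave for each word. A common option is hash partitioning, sketched below under that assumption: the same word always maps to the same slave, so each slave ends up holding the complete count for its words. The class WordPartitionSketch and its methods are hypothetical, and the actual sending of add-words jobs is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: hash partitioning is an assumed strategy.
public class WordPartitionSketch {

    // Maps a word to a slave index in [0, slaveCount).
    // floorMod keeps the result non-negative even for negative hash codes.
    public static int slaveIndexFor(String word, int slaveCount) {
        return Math.floorMod(word.hashCode(), slaveCount);
    }

    // Groups words into one batch per slave name; every occurrence of a word
    // lands in the same batch, so one slave sees all of that word's occurrences.
    public static Map<String, List<String>> assign(List<String> words, List<String> slaveNames) {
        Map<String, List<String>> batches = new LinkedHashMap<>();
        for (String name : slaveNames) {
            batches.put(name, new ArrayList<>());
        }
        for (String word : words) {
            String target = slaveNames.get(slaveIndexFor(word, slaveNames.size()));
            batches.get(target).add(word);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> slaves = Arrays.asList("s1", "s2", "s3");
        List<String> words = Arrays.asList("la", "casa", "la", "mesa", "la");
        System.out.println(assign(words, slaves));
    }
}
```

Batching words per slave before sending keeps the number of remote calls proportional to the number of slaves rather than the number of words.
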
Step 7: Run Master
• For small dataset!
• org.mdp.cli.StartWordCountMaster
• Right Click > Run As > Run Configurations > Arguments
• -i <path>\es-abstracts-10k.txt.gz -igz -dp 1985 -dn vm116.dcc.uchile.cl -mn <username> -k 500
Step 8: Run Big Master
• For big dataset!
• org.mdp.cli.StartWordCountMaster
• Right Click > Run As > Run Configurations > Arguments
• -i <path>\es-abstracts.txt.gz -igz -dp 1985 -dn vm116.dcc.uchile.cl -mn <username> -k 500
Step 9: Run Distribution Locally
• Start a directory server
  • Build and use the jar
  • java -jar mdp-2.jar StartRegistryAndServer -n localhost -p 1985 -r -s 1 -sp
• Start 4 slaves (give different names) in four different CMD windows
  • Use the jar
  • java -jar mdp-2.jar StartSlave -dn localhost -dp 1985 -wn <usernameN>
• Start a master
  • Can use Eclipse or jar (as preferred)
  • Point it to local directory
  • Use small file (large file if successful)
• Heap setting: -Xmx256M
Final Step: Teach Me Spanish
• Ask me words in the top 500!