Apache Bigtop Working Group: Cluster stuff
Bigtop Administration • Make sure you are signed up on the bigtop-dev mailing list. It carries a lot of information that will not be repeated if you miss it • Lists: bigtop-user, bigtop-dev
Bigtop Administration • Sign up for JIRA
Bigtop Administration • Registration: join BioCurious. The fee pays for the space; nobody takes a cut of this • Free drinks • Registration = AWS credits. Cancelling IntelliJ. Expires end of April. • rvs@apache.org
Newbie Slide • Structure: do the labs • Lab 1: modified to take 1-2 weeks; update the wiki with your findings • Lab 2: build Bigtop 0.3.0; you can start projects here and do JIRA tickets • Lab 3: MapReduce program • Lab 4: run the unit tests under the component downloads • Lab 5: run the integration tests • Lab 6: Puppet, deploy and run • Lab 7: port a module • Labs are changing; this is not a class and it requires a time commitment • Demo: doesn’t need to be working; it is for your benefit, not ours
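For anyone who has not written a MapReduce program before Lab 3, the following is a minimal sketch of the map, shuffle, and reduce phases in plain Python. It is not Hadoop code and not the lab's required solution; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and everything runs in one process rather than on a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["hive pig hive", "Pig mahout"]))
```

On a real cluster the map and reduce functions run on different nodes and the shuffle is handled by Hadoop; only the two user-written functions carry over.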
Lab 1 • Install Bigtop. Web search for "Apache Bigtop" and follow the wiki link from http://incubator.apache.org/bigtop/ • https://cwiki.apache.org/confluence/display/BIGTOP/Index • https://cwiki.apache.org/confluence/display/BIGTOP/How+to+install+Hadoop+distribution+from+Bigtop
Lab 1 • Install Bigtop and run all the components: Hive/HBase/Pig/Hadoop/Mahout/Oozie • There are bugs; document them • Add the sample programs in the quickstart to the wiki. Not all are included yet
Lab 1 • Update the wiki • Sqoop: open (user group meeting next week) • Flume/Flume NG (open/nothing) • ZooKeeper (open/nothing)
Hadoop Components • Old: don’t stop at running Pi as a test of HDFS • Still missing: run TeraSort in Hadoop; needs a cluster • https://cwiki.apache.org/confluence/display/BIGTOP/How+to+install+Hadoop+distribution+from+Bigtop • Whirr may need a patch depending on where you run it from
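As a starting point for the missing TeraSort run, here is a small sketch that builds the three command lines usually run in sequence (teragen, terasort, teravalidate) from the Hadoop examples jar. The jar path and the HDFS directory names are assumptions for illustration and will differ between installs; the helper name `terasort_commands` is hypothetical. The script only prints the commands; it does not run them.

```python
import shlex

# TeraGen writes fixed-size rows of 100 bytes each.
ROW_BYTES = 100


def terasort_commands(total_bytes, examples_jar, base_dir):
    """Return argv lists for teragen, terasort and teravalidate.

    examples_jar and base_dir are supplied by the caller; nothing here
    checks that they exist on the cluster.
    """
    rows = total_bytes // ROW_BYTES
    gen_dir = f"{base_dir}/tera-in"
    out_dir = f"{base_dir}/tera-out"
    report_dir = f"{base_dir}/tera-report"
    prefix = ["hadoop", "jar", examples_jar]
    return [
        prefix + ["teragen", str(rows), gen_dir],
        prefix + ["terasort", gen_dir, out_dir],
        prefix + ["teravalidate", out_dir, report_dir],
    ]


if __name__ == "__main__":
    # Assumed jar location; check where your Bigtop packages put it.
    jar = "/usr/lib/hadoop/hadoop-examples.jar"
    for cmd in terasort_commands(10**9, jar, "/tmp/terasort"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Start with a small `total_bytes` on a single node to confirm the jobs run at all, then scale up once a cluster is available.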
Mahout • Don’t run the jar the way you do in Hadoop • Scripts under /examples/bin handle downloading and clustering, the email demo, etc. • Bigtop puts examples/bin under /usr/share/doc/mahout. Is this correct? It is not documentation • Add documentation to the wiki • Ticket filed
Oozie • Oozie runs; ignore the error message and set it to the highest version
Flume/Flume NG • New patch check-in for Flume NG • Testing
Whirr • sudo apt-get install whirr • Run as: whirr launch-cluster --config /usr/lib/whirr/recipes/mahout-ec2.properties • If successful, you will see a directory under ~/.whirr • Check whirr.log • To build from source: mvn clean install
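A quick way to check the "directory under ~/.whirr" step after a launch is sketched below. It assumes Whirr creates one subdirectory per cluster under ~/.whirr; the helper name `whirr_clusters` is made up for this example, and the script only lists directories without contacting EC2.

```python
from pathlib import Path


def whirr_clusters(base=None):
    """Return the sorted names of subdirectories under the Whirr state dir.

    base defaults to ~/.whirr; an empty list means the directory is
    missing or holds no cluster subdirectories.
    """
    base = Path(base) if base is not None else Path.home() / ".whirr"
    if not base.is_dir():
        return []
    return sorted(p.name for p in base.iterdir() if p.is_dir())


if __name__ == "__main__":
    found = whirr_clusters()
    if found:
        print("Cluster directories:", ", ".join(found))
    else:
        print("No cluster directories found; check whirr.log for errors")
```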
Puppet • sudo apt-get install puppet facter fails
Ticket Questions/Demo • Should the Bigtop install include stable for Ubuntu? What is the difference between stable and bigtop-0.3.0-incubating? There used to be a difference. • Monitoring: metrics.properties -> metrics2 • Ganglia or JMX? All components with a daemon • Bruno has Ganglia recipes to monitor the status of a cluster. Hadoop monitoring: performance and functionality. Hooked up to Kerberos; the commercial version is Cloudera Manager. Networking, I/O, block sizes, swap space, disk space. • Stable vs. incubating? • Anwar: log mining (M/R, clickstream and FE log data, exceptions on a day-to-day basis)