Apache Mesos http://incubator.apache.org/mesos @ApacheMesos Benjamin Hindman – @benh
history • Berkeley research project including Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy Katz, Scott Shenker, Ion Stoica • http://incubator.apache.org/mesos/research.html
Mesos aims to make it easier to build distributed applications/frameworks and share cluster resources
applications/frameworks [diagram: the two kinds of frameworks, long-running services and analytics]
how? [diagram: frameworks such as Hadoop and services run on top of Mesos, which spans every node in the cluster]
level of abstraction • share resources more easily via multi-tenancy and elasticity (improving utilization) • run on bare metal or virtual machines: develop against the Mesos API, then run in a private datacenter (as Twitter does), in the cloud, or both!
[diagram: static partitioning (separate clusters for Hadoop, Spark, and a service) vs. sharing with Mesos (one shared cluster)]
features • APIs in C++, Java, Python • high availability via ZooKeeper • isolation via Linux control groups (LXC)
in progress • official Apache release • more Linux cgroup support (OOM and I/O; in particular, networking) • resource usage monitoring and reporting • new allocators (priority-based, usage-based) • new frameworks (Storm) • scheduler management (launching, watching, re-launching, etc.)
400+ nodes running production services • genomics researchers using Hadoop and Spark • Spark in use by Yahoo! Research • Spark for analytics • Hadoop and Spark used by machine learning researchers • Your Name Here
linux environment • $ yum install -y gcc-c++ • $ yum install -y java-1.6.0-openjdk-devel.x86_64 • $ yum install -y make.x86_64 • $ yum install -y patch.x86_64 • $ yum install -y python26-devel.x86_64 • $ yum install -y ant.noarch
get mesos • $ wget http://people.apache.org/~benh/mesos-0.9.0-incubating-RC3/mesos-0.9.0-incubating.tar.gz • $ tar zxvf mesos-0.9.0-incubating.tar.gz • $ cd mesos-0.9.0
build mesos • $ mkdir build • $ cd build • $ ../configure.amazon-linux-64 • $ make • $ make install
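A couple of optional tweaks, sketched here as generic make usage (not from the deck): build in parallel, run the test suite (if the tree includes one) to sanity-check the build, and use sudo if you are not root:
$ make -j4
$ make check
$ sudo make install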
deploy mesos (1) • /usr/local/var/mesos/deploy/masters: • ec2-50-17-28-135.compute-1.amazonaws.com • /usr/local/var/mesos/deploy/slaves: • ec2-184-73-142-43.compute-1.amazonaws.com • ec2-107-22-145-31.compute-1.amazonaws.com
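One way to create these files, sketched as plain shell run on the master (assumes you are logged in there):
$ echo "ec2-50-17-28-135.compute-1.amazonaws.com" > /usr/local/var/mesos/deploy/masters
$ cat > /usr/local/var/mesos/deploy/slaves <<EOF
ec2-184-73-142-43.compute-1.amazonaws.com
ec2-107-22-145-31.compute-1.amazonaws.com
EOF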
deploy mesos (2) • on slaves (i.e., ec2-184-73-142-43.compute-1.amazonaws.com, ec2-107-22-145-31.compute-1.amazonaws.com) • /usr/local/var/mesos/conf/mesos.conf: • master=ec2-50-17-28-135.compute-1.amazonaws.com
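A sketch for pushing that file to each slave over ssh from the master (assumes passwordless ssh is set up):
$ ssh ec2-184-73-142-43.compute-1.amazonaws.com "echo master=ec2-50-17-28-135.compute-1.amazonaws.com > /usr/local/var/mesos/conf/mesos.conf"
$ ssh ec2-107-22-145-31.compute-1.amazonaws.com "echo master=ec2-50-17-28-135.compute-1.amazonaws.com > /usr/local/var/mesos/conf/mesos.conf"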
deploy mesos (3) • $ /usr/local/sbin/mesos-start-cluster.sh
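A quick, generic way (not from the deck) to confirm the daemons actually came up on the master and slaves:
$ ssh ec2-50-17-28-135.compute-1.amazonaws.com "ps aux | grep mesos-master"
$ ssh ec2-184-73-142-43.compute-1.amazonaws.com "ps aux | grep mesos-slave"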
build hadoop • $ make hadoop • $ mv hadoop/hadoop-0.20.205.0 /etc/hadoop • $ cp protobuf-2.4.1.jar /etc/hadoop • $ cp src/mesos-0.9.0.jar /etc/hadoop
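The Hadoop configuration on the following slides expects those jars, plus the Mesos contrib classes, under /etc/hadoop; a quick sanity check:
$ ls /etc/hadoop/protobuf-2.4.1.jar /etc/hadoop/mesos-0.9.0.jar
$ ls /etc/hadoop/build/contrib/mesos/classes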
configure hadoop (1) • conf/mapred-site.xml: • <configuration> • <property> • <name>mapred.job.tracker</name> • <value>ip-10-108-207-105.ec2.internal:9001</value> • </property> • <property> • <name>mapred.jobtracker.taskScheduler</name> • <value>org.apache.hadoop.mapred.MesosScheduler</value> • </property> • <property> • <name>mapred.mesos.master</name> • <value>ip-10-108-207-105.ec2.internal:5050</value> • </property> • </configuration>
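mapred.jobtracker.taskScheduler swaps Hadoop's default task scheduler for the MesosScheduler, which registers as a framework with the Mesos master on port 5050. If xmllint (from libxml2) happens to be installed, it is a cheap way to catch XML typos before starting the JobTracker:
$ xmllint --noout conf/mapred-site.xml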
configure hadoop (2) • conf/hadoop-env.sh: • #!/bin/sh • export JAVA_HOME=/usr/lib/jvm/jre • # Google protobuf (necessary for running the MesosScheduler). • export PROTOBUF_JAR=${HADOOP_HOME}/protobuf-2.4.1.jar • # Mesos. • export MESOS_JAR=${HADOOP_HOME}/mesos-0.9.0.jar • # Native Mesos library. • export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so • export HADOOP_CLASSPATH=${HADOOP_HOME}/build/contrib/mesos/classes:${MESOS_JAR}:${PROTOBUF_JAR} • ...
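MESOS_NATIVE_LIBRARY has to resolve at runtime on every node that runs Hadoop code; a trivial check (generic, not from the deck):
$ ls -l /usr/local/lib/libmesos.so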
configure hadoop (3) • conf/core-site.xml: • <configuration> • <property> • <name>fs.default.name</name> • <value>hdfs://ip-10-108-207-105.ec2.internal:9000</value> • </property> • </configuration>
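This assumes an HDFS namenode is already running at ip-10-108-207-105.ec2.internal:9000. If not, the standard Hadoop 0.20 steps (not covered in the deck) are, from /etc/hadoop:
$ ./bin/hadoop namenode -format
$ ./bin/start-dfs.sh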
configure hadoop (4) • conf/masters: • ec2-50-17-28-135.compute-1.amazonaws.com • conf/slaves: • ec2-184-73-142-43.compute-1.amazonaws.com • ec2-107-22-145-31.compute-1.amazonaws.com
starting hadoop • $ pwd • /etc/hadoop • $ ./bin/hadoop jobtracker
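Note there is no start-mapred.sh step: with the MesosScheduler in place, TaskTrackers are launched by Mesos on demand as jobs arrive. To keep the shell free, the JobTracker can be backgrounded with plain nohup (generic, not from the deck):
$ nohup ./bin/hadoop jobtracker > /tmp/jobtracker.log 2>&1 &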
running wordcount • $ ./bin/hadoop jar hadoop-examples-0.20.205.0.jar wordcount macbeth.txt output
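The input file has to be in HDFS before the job runs, and the result can be inspected afterwards with standard Hadoop fs commands:
$ ./bin/hadoop fs -put macbeth.txt macbeth.txt
$ ./bin/hadoop fs -cat output/part-* | head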
starting another hadoop • conf/mapred-site.xml (for the second instance): • <configuration> • <property> • <name>mapred.job.tracker</name> • <value>ip-10-108-207-105.ec2.internal:9002</value> • </property> • <property> • <name>mapred.job.tracker.http.address</name> • <value>0.0.0.0:50032</value> • </property> • <property> • <name>mapred.task.tracker.http.address</name> • <value>0.0.0.0:50062</value> • </property> • </configuration>
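This second mapred-site.xml uses a different JobTracker port (9002) and different HTTP ports so both instances can coexist. One way to launch against it, assuming the second configuration lives in its own conf directory such as /etc/hadoop2/conf (a hypothetical path):
$ ./bin/hadoop --config /etc/hadoop2/conf jobtracker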
get and build spark • $ git clone git://github.com/mesos/spark.git • $ cd spark • $ git checkout --track origin/mesos-0.9 • $ sbt/sbt compile
configure spark • $ cp conf/spark-env.sh.template conf/spark-env.sh • conf/spark-env.sh: • #!/bin/sh • export SCALA_HOME=/root/scala-2.9.1-1 • export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so • export SPARK_MEM=1g
run spark shell • $ pwd • /root/spark • $ MASTER=$HOSTNAME:5050 ./spark-shell
setting log_dir • on slaves (i.e., ec2-184-73-142-43.compute-1.amazonaws.com, ec2-107-22-145-31.compute-1.amazonaws.com) • /usr/local/var/mesos/conf/mesos.conf: • master=ec2-50-17-28-135.compute-1.amazonaws.com • log_dir=/tmp/mesos
re-deploy mesos • $ /usr/local/sbin/mesos-stop-slaves.sh • $ /usr/local/sbin/mesos-start-slaves.sh
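After the slaves come back up, logs should appear under the new log_dir; a generic check over ssh:
$ ssh ec2-184-73-142-43.compute-1.amazonaws.com "ls /tmp/mesos"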