CSC 536 Lecture 2
Outline • Concurrency on the JVM (and between JVMs) • Working problem • Java concurrency tools (review) • Solution using traditional Java concurrency tools • Solution using Akka concurrency tools • Overview of Akka
Working problem • Compute the total size of all regular files stored, directly or indirectly, in a directory • Example directory tree (figure): Users, sammy, ellie, etc, docs, xyz.txt, abc.txt, foo.txt, bar.txt
> java Sequential C:\Windows
Total size: 34405975972
Time taken: 47.777222426
A recursive solution • Basis step: if input is a regular file, return its size • Recursive step: if input is a directory, call function recursively on every item in the directory, add up the returned values and return the sum • (Depth-First Traversal) • Sequential.java
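The basis and recursive steps above can be sketched in a few lines of Java. This is a minimal illustration, not the course's Sequential.java: the class name `SequentialSize` and the method `totalSize` are hypothetical, and unreadable directories are assumed to count as size 0.

```java
import java.io.File;

// Hypothetical sketch of the sequential, depth-first traversal.
class SequentialSize {
    // Basis step: a regular file contributes its own size.
    // Recursive step: a directory contributes the sum over its entries.
    static long totalSize(File file) {
        if (file.isFile()) {
            return file.length();
        }
        long total = 0;
        File[] children = file.listFiles(); // null if not a directory or unreadable
        if (children != null) {
            for (File child : children) {
                total += totalSize(child);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long total = totalSize(new File(args.length > 0 ? args[0] : "."));
        long end = System.nanoTime();
        System.out.println("Total size: " + total);
        System.out.println("Time taken: " + (end - start) / 1.0e9);
    }
}
```

The traversal is single-threaded, so the running time is dominated by one directory listing after another.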
Threads • A thread is a “lightweight process” • A thread really lives inside a process • A thread has its own: • program counter • stack • register set • A thread shares with other threads in the process • code • global variables
Interface Runnable • Must be implemented by any class that will be executed by a thread • Implement method run() with code the thread will run • Anonymous class example:
new Runnable() {
    public void run() {
        // code to be run by thread
    }
}
Class Thread • Encapsulates a thread of execution in a program • To execute a thread: • An instance of a Runnable class is passed as an argument when creating the thread • The thread is started with method start() • Example:
Runnable r = new Runnable() {
    public void run() {
        // code executed by thread
    }
};
new Thread(r).start();
Producer-Consumer example • Setup • A shared memory buffer • Producer puts objects into the buffer • Consumer reads objects from the buffer • ProducerConsumerTest.java, UnsyncBuffer.java • Problem: • producer can over-produce, consumer can over-consume (example of a race condition) • Need to synchronize (coordinate) the processes
Synchronization • Mechanisms that ensure that concurrent threads/processes do not render shared data inconsistent • Three most widely used synchronization mechanisms in centralized systems are • Semaphores • Locks • Monitors
Monitors • Monitor = Set of operations + set of variables + lock • Set of variables is the monitor’s state • Variables can be accessed only by the monitor’s operations • At most one thread can be active within the monitor at a time • To execute a monitor’s operation, thread A must obtain the monitor’s lock • If thread B holds the monitor’s lock, thread A must wait on the monitor’s queue (wait) • Once thread A is done with the monitor’s lock, it must release it so that other threads can obtain it (notify)
Synchronization in Java • Each Java class becomes a monitor when at least one of its methods uses the synchronized modifier • The synchronized modifier is used to write code blocks and methods that require a thread to obtain a lock • Synchronization is always done with respect to an object ProducerConsumerTest.java, SyncBuffer.java
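A one-slot buffer written as a monitor might look like the sketch below. It is not the course's SyncBuffer.java: the class `SyncBufferSketch`, its `full` flag, and the helper `transfer` are assumptions made for illustration. Both methods are synchronized on the buffer object, and wait()/notifyAll() keep the producer from over-producing and the consumer from over-consuming.

```java
// Hypothetical one-slot buffer; the buffer object itself is the monitor.
class SyncBufferSketch {
    private int value;
    private boolean full = false;

    // Producer side: wait while the slot is occupied.
    public synchronized void put(int v) throws InterruptedException {
        while (full) {
            wait(); // releases the lock while sleeping
        }
        value = v;
        full = true;
        notifyAll(); // wake a waiting consumer
    }

    // Consumer side: wait while the slot is empty.
    public synchronized int get() throws InterruptedException {
        while (!full) {
            wait();
        }
        full = false;
        notifyAll(); // wake a waiting producer
        return value;
    }

    // Sends 1..n through the buffer and returns the sum seen by the consumer.
    static int transfer(final int n) {
        final SyncBufferSketch buffer = new SyncBufferSketch();
        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 1; i <= n; i++) buffer.put(i);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        producer.start();
        int sum = 0;
        try {
            for (int i = 0; i < n; i++) sum += buffer.get();
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum;
    }
}
```

The `while` loops (rather than `if`) re-check the condition after every wake-up, which guards against spurious wake-ups and against another thread changing the state first.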
Java memory model • Before Java 5: ill defined • a thread might not see values written by other threads • a thread might observe impossible behaviors by other threads • Java 5 and later: • Monitor lock rule: a release of a lock happens before the subsequent acquire of the same lock • Volatile variable rule: a write of a volatile variable happens before every subsequent read of the same volatile variable
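The volatile variable rule can be illustrated with a small sketch. The class `VolatileFlag` and its fields are hypothetical. The writer sets an ordinary field and then a volatile flag; once the reader sees the flag, the happens-before rule guarantees it also sees the ordinary field.

```java
// Hypothetical illustration of the volatile variable rule (Java 5 and later).
class VolatileFlag {
    private int data = 0;                   // ordinary field
    private volatile boolean ready = false; // volatile flag

    static int publishAndRead() {
        final VolatileFlag shared = new VolatileFlag();
        Thread writer = new Thread(new Runnable() {
            public void run() {
                shared.data = 42;    // ordinary write
                shared.ready = true; // volatile write: publishes the write above
            }
        });
        writer.start();
        while (!shared.ready) {      // volatile read
            Thread.yield();
        }
        return shared.data;          // sees the value written before the flag was set
    }
}
```

Without `volatile` on `ready`, the reader would have no such guarantee and could loop forever or read a stale `data`.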
Disadvantages of synchronization • Disadvantages: • Synchronization is error-prone • Synchronization blocks threads and takes time • Improper synchronization results in deadlocks • Creating a thread is not a low-overhead operation • Too many threads slow down the system
Thread pooling • Thread pooling is a solution to the thread creation and management problem • The main idea is to create a bunch of threads in advance and have them wait for something to do • The same thread can be recycled for different operations • Thread pool components: • A blocking queue • A pool of threads
Blocking queue • Queue is a sequence of objects • Two basic operations: • enqueue • dequeue • Blocking Queue: • A dequeue thread must block if the queue is empty • An enqueue thread must add an object to the queue and notify blocked threads • Blocking queue must be thread safe
Blocking Queue operations • To dequeue an object from the queue: • Wait until the lock on the queue is obtained • If the queue is empty, release the lock and sleep • If the queue is not empty, pop the first element and return it • To enqueue an object to the queue: • Wait until the lock on the queue is obtained • Add the object to the end of the queue • Notify any sleeping thread BlockingQueue.java
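The dequeue and enqueue steps can be sketched with Java's built-in monitor support. This is not the course's BlockingQueue.java: the class name `SimpleBlockingQueue` and its use of `LinkedList` are assumptions.

```java
import java.util.LinkedList;

// Hypothetical unbounded blocking queue built on synchronized/wait/notifyAll.
class SimpleBlockingQueue<T> {
    private final LinkedList<T> items = new LinkedList<T>();

    // Enqueue: take the lock, append, wake any sleeping dequeuers.
    public synchronized void enqueue(T item) {
        items.addLast(item);
        notifyAll();
    }

    // Dequeue: take the lock; while empty, release it and sleep.
    public synchronized T dequeue() throws InterruptedException {
        while (items.isEmpty()) {
            wait();
        }
        return items.removeFirst();
    }

    public synchronized int size() {
        return items.size();
    }
}
```

Because every method is synchronized on the queue object, at most one thread manipulates the list at a time, which is what makes the queue thread safe.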
Thread Pool = threads + tasks • Thread pool = group of threads + queue of Runnable tasks • Thread pool starts by creating the group of threads • Each thread loops indefinitely • In every iteration, each thread attempts to dequeue a task from the task queue • If the task queue is empty, block on the queue • If a task is dequeued, run the task • Thread pool method execute(task) • simply adds the task to the task queue ThreadPool.java, ThreadPoolTest.java
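A pool along these lines could be sketched as follows. It is not the course's ThreadPool.java: the class `SimpleThreadPool` and its `shutdown` method are hypothetical, and the standard `LinkedBlockingQueue` is swapped in for a hand-written blocking queue to keep the example short.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical minimal thread pool: worker threads + a queue of Runnable tasks.
class SimpleThreadPool {
    private final BlockingQueue<Runnable> tasks = new LinkedBlockingQueue<Runnable>();
    private final Thread[] workers;

    SimpleThreadPool(int nThreads) {
        workers = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    try {
                        while (true) {
                            // take() blocks while the task queue is empty
                            tasks.take().run();
                        }
                    } catch (InterruptedException e) {
                        // interrupted by shutdown(): leave the loop
                    }
                }
            });
            workers[i].setDaemon(true); // do not keep the JVM alive
            workers[i].start();
        }
    }

    // execute simply adds the task to the task queue.
    void execute(Runnable task) {
        tasks.add(task);
    }

    // Assumed extension: stop workers by interrupting them.
    void shutdown() {
        for (Thread w : workers) {
            w.interrupt();
        }
    }
}
```

In this sketch a task that throws an unchecked exception ends its worker thread; a production pool would replace the worker or catch the exception.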
Java thread pool API • Interface ExecutorService defines objects that run Runnable tasks • Using method execute() • Class Executors defines factory methods for obtaining a thread pool (i.e. an ExecutorService object) • newFixedThreadPool(n) creates a pool of n threads
ExecutorService service = Executors.newFixedThreadPool(10);
service.execute(new Runnable() {
    public void run() {
        // task code
    }
});
Back to working problem • Compute the total size of all regular files stored, directly or indirectly, in a directory • Example directory tree (figure): Users, sammy, ellie, etc, docs, xyz.txt, abc.txt, foo.txt, bar.txt
Modern Java Concurrent solution • Use Runnable objects • Create Runnable object for every (sub)directory • Use thread pool • Keeps the number of threads manageable • Keeps overhead of thread creation low • Reuses threads • Avoid sharing state • Variable totalSize only • Access must be synchronized • Concurrent1.java (does not work)
AtomicLong • Accumulator variable totalSize is incremented by all threads • Must ensure that the incrementing operation (the critical section) is not interrupted by a context switch • Solution 1: Use a Java lock to synchronize access to the critical section • Solution 2: Use class AtomicLong • method addAndGet() executes as a single atomic operation
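Solution 2 can be sketched as below. The class `AtomicTotal` and the method `addConcurrently` are hypothetical names. Several threads add to one shared `AtomicLong`, and no explicit lock is needed because `addAndGet` is atomic.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo: many threads incrementing one shared accumulator.
class AtomicTotal {
    static long addConcurrently(int nThreads, final int addsPerThread) {
        final AtomicLong totalSize = new AtomicLong(0);
        Thread[] threads = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            threads[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < addsPerThread; j++) {
                        totalSize.addAndGet(1); // atomic read-modify-write
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // wait for every adder to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return totalSize.get();
    }
}
```

With a plain `long` and `totalSize += 1` in place of `addAndGet`, updates from different threads could be lost.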
Concurrent1 problem • The main thread must wait until all (sub)directories have been processed • No way to know when that happens • Need to: • keep track of pending tasks, i.e. (directory processing) task creation and termination • Block the main thread until the number of pending tasks is 0
Modern Java Concurrent solution • Use Runnable objects • Create Runnable object for every (sub)directory • Use thread pool • Keeps the number of threads manageable • Keeps overhead of thread creation low • Reuses threads • Avoid sharing state • Variable totalSize only • Access must be synchronized • Require synchronization variables • To terminate the application • Concurrent2.java
CountDownLatch • Synchronization tool that allows one or more threads to wait until a set of operations being performed in other threads completes • initialized with a given count • method await() blocks until count reaches 0 • method countDown() decrements count by 1 • After count reaches 0, any subsequent invocations of await() return immediately • A CountDownLatch initialized with a count of 1 serves as a simple on/off gate: all threads invoking await() wait at the gate until it is opened by a thread invoking countDown()
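Putting the pieces together, one possible shape for the concurrent solution is sketched below. It is not the course's Concurrent2.java: the class `ConcurrentSize`, the `pending` counter, and the method `compute` are assumptions. A pending-task counter tracks task creation and termination, and a latch with count 1 acts as the gate that releases the main thread when the counter reaches 0.

```java
import java.io.File;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: one pool task per (sub)directory.
class ConcurrentSize {
    private final ExecutorService pool = Executors.newFixedThreadPool(10);
    private final AtomicLong totalSize = new AtomicLong(0);
    private final AtomicLong pending = new AtomicLong(0);
    private final CountDownLatch done = new CountDownLatch(1);

    private void submit(final File dir) {
        pending.incrementAndGet(); // count the task before it is queued
        pool.execute(new Runnable() {
            public void run() {
                explore(dir);
            }
        });
    }

    private void explore(File dir) {
        try {
            long size = 0;
            File[] children = dir.listFiles(); // null if unreadable
            if (children != null) {
                for (File child : children) {
                    if (child.isFile()) size += child.length();
                    else if (child.isDirectory()) submit(child);
                }
            }
            totalSize.addAndGet(size);
        } finally {
            // Subtasks were counted above, so 0 means everything is finished.
            if (pending.decrementAndGet() == 0) done.countDown();
        }
    }

    static long compute(File root) {
        if (root.isFile()) return root.length();
        ConcurrentSize job = new ConcurrentSize();
        try {
            job.submit(root);
            job.done.await(); // main thread blocks until the gate opens
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            job.pool.shutdown();
        }
        return job.totalSize.get();
    }
}
```

Each task adds its subtasks to `pending` before decrementing its own entry, so the counter cannot reach 0 while work remains. The sketch does not guard against cycles created by symbolic links.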
An Akka/Scala concurrent solution • Use Akka Actors • Task of processing a directory is given to a worker actor by a master actor • Worker actor processes directory • computes the total size of all the regular files and sends it to master • sends to master the (path)name of every sub-directory • Master actor • Initiates the process • sends tasks to worker actors • collects the total size • keeps track of pending tasks ConcurrentAkka.java
Akka • Actor-based concurrency framework • Provides solutions for non-blocking concurrency • Written in Scala, but also has Java API • Each actor has a state that is invisible to other actors • Each actor has a message queue • Actors receive and handle messages • sequentially, therefore no synchronization issues • Actors should rarely block • Actors are lightweight and asynchronous • 650 bytes • can have millions of actors running on a few threads on a single machine
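The mailbox idea can be illustrated without Akka. The sketch below is plain Java and is not Akka's API: the class `CounterActorSketch` is hypothetical, and the method names `tell` and `ask` are borrowed only for familiarity. A single-threaded executor plays the role of the mailbox, so messages are handled one at a time and the private state needs no locks.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical toy "actor": private state + a sequentially processed mailbox.
class CounterActorSketch {
    // One thread drains the queue, so messages are handled in sequence.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    private long count = 0; // touched only by the mailbox thread

    // Fire-and-forget: enqueue the message and return immediately.
    void tell(final long amount) {
        mailbox.execute(new Runnable() {
            public void run() {
                count += amount;
            }
        });
    }

    // Request/reply: returns a Future that completes with the current count.
    Future<Long> ask() {
        return mailbox.submit(new Callable<Long>() {
            public Long call() {
                return count;
            }
        });
    }

    void stop() {
        mailbox.shutdown();
    }
}
```

Any number of threads may call `tell` concurrently; only the mailbox thread ever reads or writes `count`.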
Why use Akka in DSII? • Distributed computing • Actors do not share state and interact through messages • Actor locations (local vs remote) are transparent • Distributed transactions • Makes a sequence of messages atomic • Uses transactors, a combination of actors and Software Transactional Memory • Fault tolerance • Implements “let-it-crash” semantics • Uses supervisor hierarchies that self-heal
Actors • State • Supposed to be invisible to other actors • Behavior • The actions to be taken in reaction to a message • Mailbox • actors process messages from mailbox sequentially • Children • Actors can create other actors • A hierarchy of actors • Supervisor strategy • An actor is supervised by its parent
Actors (First.scala)
import akka.actor.{Actor, ActorSystem, Props}

class First extends Actor {
  def receive = {
    case "hello" => println("Hello world!")
    case msg: String => println("Got " + msg + " from " + sender)
    case _ => println("Unknown message")
  }
}

object Server extends App {
  val system = ActorSystem("FirstExample")
  val first = system.actorOf(Props[First], name = "first")
  println("The path associated with first is " + first.path)
  first ! "hello"
  first ! "Goodbye"
  first ! 4
}
Using sbt • Simple Build Tool (http://www.scala-sbt.org/) • Easy to set up • Sample build.sbt configuration file:
name := "First Example"
version := "1.0"
scalaVersion := "2.10.4"
resolvers += "Typesafe Repository" at "http://repo.typesafe.com/typesafe/releases/"
libraryDependencies += "com.typesafe.akka" %% "akka-actor" % "2.3.1"
Abstract Class Actor • Extend Actor class and implement method receive • Method receive should define case statements that • define the messages the actor handles • implement the logic of how messages are handled • use Scala pattern matching
class First extends Actor {
  def receive = {
    case "hello" => println("Hello world!")
    case msg: String => println("Got " + msg)
    case _ => println("Unknown message")
  }
}
Class ActorSystem • Actors form hierarchies, i.e. a system • Class ActorSystem encapsulates a hierarchy of actors • Class ActorSystem provides methods for • creating actors • looking up actors • At least the first actor in the system is created using it
Class ActorContext • Class ActorContext also provides methods for • creating actors • looking up actors • Each actor has its own instance of ActorContext that allows it to create (child) actors and look up actors
Obtaining actor references • Creating actors • ActorSystem.actorOf() • ActorContext.actorOf() • Both methods return an ActorRef reference to the new actor • Looking up an existing actor by concrete path • ActorSystem.actorSelection() • ActorContext.actorSelection() • Both methods return an ActorSelection reference to the existing actor • An ActorRef or ActorSelection reference can be used to send a message to the actor
Class ActorRef • Immutable and serializable handle to an actor • actor could be in the same ActorSystem, a different one, or even another, remote JVM • obtained from ActorSystem (or indirectly from ActorContext) • ActorRefs can be shared among actors by message passing • you can serialize it, send it over the wire and use it on a remote host, and it will still represent the same actor on the original node, across the network • In fact, every message carries the ActorRef of the sender • Conversely, message passing is the only purpose of an ActorRef
Class Props • Props is an ActorRef configuration object • Used when creating new actors through • ActorSystem.actorOf • ActorContext.actorOf
Sending messages • Messages are sent to an Actor through one of • method tell or simply ! • means “fire-and-forget”, i.e. send a message asynchronously and return immediately • method ask or simply ? • sends a message asynchronously and returns a Future representing a possible reply • Message ordering is guaranteed on a per-sender basis • Tell is the preferred way of sending messages • No blocking waiting for a message • Best concurrency and scalability characteristics
Message ordering • For a given pair of actors, messages sent from the first to the second will be received in the order they were sent • Causality between messages is not guaranteed! • Actor A sends message M1 to actor C • Actor A then sends message M2 to actor B • Actor B forwards message M2 to actor C • Actor C may receive M1 and M2 in any order • Also, message delivery is “at-most-once delivery” • i.e. no guaranteed delivery
Message ordering • Akka also guarantees • The actor send rule • The send of the message to an actor happens before the receive of that message by the same actor. • The actor subsequent processing rule • processing of one message happens before processing of the next message by the same actor. • Both rules only apply for the same actor instance and are not valid if different actors are used
Messages and immutability • Messages can be any kind of object but have to be immutable • Scala can’t enforce immutability (yet), so this has to be by convention • Primitives like String, Int, Boolean are always immutable • Apart from these, the recommended approach is to use Scala case classes, which are immutable (if you don’t explicitly expose the state) and work well with pattern matching at the receiver side • Other good message types are scala.Tuple2, scala.List, scala.Map, which are all immutable and well suited to pattern matching
Actor API • Scala trait (think partially implemented Java interface) that defines one abstract method: receive() • Offers useful references: • self: reference to the ActorRef of actor • sender: reference to sender Actor of the last received message • typically used for replying to messages • context: reference to ActorContext of actor that includes • factory methods to create child actors (actorOf) • system that the actor belongs to • parent supervisor • supervised children • etc.
Ping Pong examples Second.scala Third.scala
Scala pattern matching • Scala has a built-in general pattern matching mechanism • It allows matching on any sort of data with a first-match policy
object MatchTest1 extends App {
  def matchTest(x: Int): String = x match {
    case 1 => "one"
    case 2 => "two"
    case _ => "many"
  }
  println(matchTest(3))
  println(matchTest(2))
  println(matchTest(1))
}
Scala pattern matching
object MatchTest2 extends App {
  def matchTest(x: Any): Any = x match {
    case 1 => "one"
    case "two" => 2
    case y: Int => "scala.Int: " + y
  }
  println(matchTest(1))
  println(matchTest("two"))
  println(matchTest(3))
  println(matchTest("four")) // no case matches: throws scala.MatchError
}
Scala case classes • Case classes are regular classes with special conveniences • automatically have factory methods with the name of the class • all constructor parameters become immutable public fields of the class • have natural implementations of toString, hashCode, and equals • are serializable by default • provide a decomposition mechanism via pattern matching
case class Start(secondPath: String)
case object PING
case object PONG
Scala pattern matching
case class Start(secondPath: String)
case object PING
case object PONG
object MatchTest3 extends App {
  def matchTest(x: Any): Any = x match {
    case Start(secondPath) => "got " + secondPath
    case PING => "got ping"
    case PONG => "got pong"
  }
  println(matchTest(Start("path")))
  println(matchTest(PING))
}
Scala pattern matching
object MatchTest4 extends App {
  def length[X](xs: List[X]): Int = xs match {
    case Nil => 0
    case y :: ys => 1 + length(ys)
  }
  println(length(List()))
  println(length(List(1, 2)))
  println(length(List("one", "two", "three")))
}