Scalab: a Build Tool for Scala Master Thesis July 4, 2008 Author: Vincent Pazeller Supervisor: Gilles Dubochet Professor: Martin Odersky Programming Methods Laboratory / LAMP
Outline • Interest of Build Tools • Interest of Scalab • Definition of a Build Tool • Model • Caches • Internal Operation (update) • Sabbus • Further Work
Build Tools Interest • Build Process: sequence of tasks that transform the sources of a project into its executable equivalent. • Not all tasks need to be executed on every build. • Not all sources have necessarily been modified. A build tool automates the choice of tasks to be executed and optimizes the build process. This increases developers' productivity.
Interest of Scalab • Existing tools make non-conservative approximations. • They are also too difficult to use: • Sabbus written with Ant: 1200 lines of XML. • Sabbus written with Scalab: fewer than 100 (reasonable) lines of Scala code. • Scalab makes it possible to describe situations that cannot be described with any other build system.
Task • A task employs sources to produce products. • Sources and products are resources (i.e. files). • The universe is the set of all resources.
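The definitions above can be sketched in a few lines of Scala. This is only an illustration, not Scalab's actual API: the names `TaskModel`, `Resource`, `Task` and `universe` are hypothetical, resources are assumed to be identified by their path, and the universe is restricted to the resources touched by at least one task.

```scala
object TaskModel {
  // A resource is a file, identified here by its path (an assumption of this sketch).
  final case class Resource(path: String)

  // A task employs sources to produce products.
  final case class Task(name: String, sources: Set[Resource], products: Set[Resource])

  // The universe, restricted here to resources touched by at least one task.
  def universe(tasks: Seq[Task]): Set[Resource] =
    tasks.flatMap(t => t.sources ++ t.products).toSet

  // Two hypothetical tasks: the product of the first is the source of the second.
  val compile = Task(
    "compile",
    sources = Set(Resource("Main.scala")),
    products = Set(Resource("Main.class"))
  )
  val archive = Task(
    "archive",
    sources = Set(Resource("Main.class")),
    products = Set(Resource("app.jar"))
  )
}
```

Because `Main.class` is both a product and a source, it appears only once in the universe computed from these two tasks.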
Up-to-date • Given a set of tasks $T$, a task $t \in T$ is up-to-date with respect to $T$ when the products of $t$ cannot be modified by the execution of any sequence of tasks $t_1, \dots, t_n$, where $t_i \in T$ for all $i \in \{1, \dots, n\}$. The purpose of a build tool is to make $t$ up-to-date $\forall t \in T$. The build tool needs to know (at run-time) the sources and the products of each task.
Model: universe • Static representation of a common Java project: [figure: static build graph] • Note: all resources depend directly on the universe. • Idea: make them depend on the task that created them instead (more dynamic): [figure: task-based build graph]
Model: filters • Idea: indicate in the graph how resources can be extracted from the universe. • The build tool can detect changes dynamically. • Filters are subdivided into three categories: • Selectors • Scanners • Mappers • Filters can also be used to filter the products of components (tasks and filters, so far).
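One way to picture filters is as composable functions over sets of resources. The sketch below is an assumption about their behavior rather than Scalab's implementation: it reads a selector as a filter that keeps a subset of its input and a mapper as one that transforms each resource. Scanners are left out because they would need file-system access. All names (`FilterSketch`, `endsWith`, `startsWith`, `complement`, `replaceSuffix`) are hypothetical.

```scala
object FilterSketch {
  type Resources = Set[String]            // resources as paths (assumption)
  type Filter = Resources => Resources

  // Selectors: keep a subset of the input.
  def endsWith(suffix: String): Filter = _.filter(_.endsWith(suffix))
  def startsWith(prefix: String): Filter = _.filter(_.startsWith(prefix))

  // Complement of a selector: keeps exactly what the selector rejects.
  def complement(selector: Filter): Filter = rs => rs -- selector(rs)

  // Mapper: transforms each resource into another one.
  def replaceSuffix(from: String, to: String): Filter =
    _.map(p => if (p.endsWith(from)) p.stripSuffix(from) + to else p)

  // A tiny hypothetical universe.
  val universe: Resources = Set(
    "src/scala/Foo.scala",
    "src/scala/tools/ant/Task.scala",
    "src/README"
  )

  // Filters chain like pipes: all Scala sources except the Ant tasks.
  val compilerSources: Filter =
    endsWith(".scala") andThen complement(startsWith("src/scala/tools/ant/"))
}
```

Applying `FilterSketch.compilerSources` to `FilterSketch.universe` keeps only `src/scala/Foo.scala`.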
Model: Pipes • We can now simplify the graph: [figure: graph with explicit resources] becomes [figure: simplified graph]. • Arrows are called pipes and carry resources from component to component.
Model: Gates • Inputs • Interest: distinguish subsets of sources. • Mandatory (■) or optional (□). • Output: each component has a single output which is implicit.
Model: Black Boxes • Interest: • Hide a sub-graph and/or make it re-usable/distributable. • Reduce the risk of errors. • The inside looks like: [figure: sub-graph inside a black box] • This time, the output must be explicitly provisioned. Black boxes behave exactly as if they were tasks.
Model: Build Schemes • Generalization of black boxes. • This scheme can then be used with any compiler and any archiver. • The class-path input has been omitted because it cannot be generalized to all compilers. • Build schemes are black-box generators.
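The idea of a build scheme as a black-box generator can be illustrated with a higher-order function. This is a deliberately simplified sketch: tasks are modelled as plain functions over sets of paths, and the names `BuildSchemeSketch`, `compileAndArchive`, `fakeScalac` and `fakeJar` are hypothetical stand-ins, not Scalab components.

```scala
object BuildSchemeSketch {
  type Resources = Set[String]
  type Task = Resources => Resources

  // A build scheme generates a black box: given any compiler and any
  // archiver, it returns a single task that chains them.
  def compileAndArchive(compiler: Task, archiver: Task): Task =
    compiler andThen archiver

  // Stand-in tasks that only rename paths; real tasks would run tools.
  val fakeScalac: Task = _.map(_.stripSuffix(".scala") + ".class")
  val fakeJar: Task = _ => Set("out.jar")
}
```

The same scheme can be instantiated with a different compiler or archiver without touching the code that wires them together.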
Model: Targets • Purpose: indicate clearly which components are relevant to build. • Interests: • Build process is more explicit. • Avoid any misuse.
Model: Dependencies • Hard dependencies: • If a source hard dependency of a task does not exist, the task cannot be executed. • A product hard dependency indicates that the resource has certainly been created. • Soft dependencies: • The absence of a source soft dependency does not prevent the task from executing. • A product soft dependency indicates that the creation of the resource is uncertain. • Pipes can form cycles in the build graph! • Filters are not affected by (hard) dependencies.
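The rule for source dependencies can be written down directly. The sketch below covers only the source side and uses hypothetical names (`DependencySketch`, `SourceDependency`, `canExecute`); the existence check is passed in as a function so that the example does not touch the file system.

```scala
object DependencySketch {
  sealed trait Strength
  case object Hard extends Strength
  case object Soft extends Strength

  final case class SourceDependency(path: String, strength: Strength)

  // A task can execute when every hard source dependency exists;
  // missing soft dependencies are tolerated.
  def canExecute(deps: Seq[SourceDependency], exists: String => Boolean): Boolean =
    deps.forall(d => d.strength == Soft || exists(d.path))
}
```

With only `A.scala` present, a task with a hard dependency on `A.scala` and a soft dependency on `B.scala` can execute, whereas a task with a hard dependency on `B.scala` cannot.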
Caches • Interest: prevent tasks from repeating work they have already done. • Principle: load/store resources from/to a repository. • Tasks write directly into the cache repository (no copy).
Caches: Behavior • The behavior of a cache is defined by three policies: • Change Detection Policy: used to detect when a source has changed. • Eviction Policy: used to select and delete the least pertinent information in a cache. • Core Policy: defines how the cache loads and stores information, and coordinates the other two policies.
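The split into three policies maps naturally onto Scala trait composition. The following is a minimal sketch under stated assumptions, not Scalab's cache: the trait and method names are hypothetical, change detection compares timestamps, eviction is least-recently-used over a logical clock, and only metadata is stored (no actual resources).

```scala
object CachePolicySketch {
  trait ChangeDetectionPolicy {
    // True when the source differs from what was recorded at the last store.
    def hasChanged(recorded: Long, current: Long): Boolean
  }
  trait TimestampDetection extends ChangeDetectionPolicy {
    def hasChanged(recorded: Long, current: Long): Boolean = current > recorded
  }

  trait EvictionPolicy {
    // Picks the entry to delete, given each entry's last access time.
    def victim(lastAccess: Map[String, Long]): Option[String]
  }
  trait LeastRecentlyUsed extends EvictionPolicy {
    def victim(lastAccess: Map[String, Long]): Option[String] =
      if (lastAccess.isEmpty) None else Some(lastAccess.minBy(_._2)._1)
  }

  // Core policy: stores entries and coordinates the two other policies.
  abstract class CorePolicy(capacity: Int) { this: ChangeDetectionPolicy with EvictionPolicy =>
    private var stamps = Map.empty[String, Long]   // source timestamp at store time
    private var accesses = Map.empty[String, Long] // logical last-access time
    private var clock = 0L

    def store(key: String, timestamp: Long): Unit = {
      // Make room first when a new key would exceed the capacity.
      if (!stamps.contains(key) && stamps.size >= capacity)
        victim(accesses).foreach { k => stamps -= k; accesses -= k }
      clock += 1
      stamps += key -> timestamp
      accesses += key -> clock
    }

    // A hit requires a stored entry whose source has not changed since.
    def isFresh(key: String, currentTimestamp: Long): Boolean =
      stamps.get(key) match {
        case Some(recorded) if !hasChanged(recorded, currentTimestamp) =>
          clock += 1
          accesses += key -> clock
          true
        case _ => false
      }
  }

  def newCache(capacity: Int) =
    new CorePolicy(capacity) with TimestampDetection with LeastRecentlyUsed
}
```

Swapping a policy then amounts to mixing in a different trait, leaving the core untouched.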
Caches: Conservativeness • Caches can be conservative or not. • Conservative caches ensure that the result of cached tasks is always sound. • Conservative caches need to know inter-dependencies among resources.
[Figure: Scalac with sources Caller.scala, Callee.scala and class files Caller.class, Caller$.class, Callee.class, Callee$.class]
Caller.scala:
    object Caller {
      def main(args: Array[String]): Unit = {
        Callee.invoke
      }
    }
Callee.scala:
    object Callee {
      def invoke: Unit = {}
    }
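The Caller/Callee example shows why inter-dependencies matter: when only Callee.scala changes, Caller.scala must still be recompiled. A conservative invalidation step could be sketched as follows, assuming the cache already knows which source refers to which. The dependency map is written by hand here, and the names `ConservativeSketch`, `dependsOn` and `invalidated` are hypothetical.

```scala
object ConservativeSketch {
  // dependsOn(a) = sources that a refers to (assumed known to the cache).
  val dependsOn: Map[String, Set[String]] = Map(
    "Caller.scala" -> Set("Callee.scala"),
    "Callee.scala" -> Set.empty[String]
  )

  // Sources to recompile: the changed ones plus everything that
  // depends on them, directly or transitively.
  def invalidated(changed: Set[String]): Set[String] = {
    val dependents = dependsOn.collect {
      case (src, deps) if deps.exists(changed) => src
    }.toSet
    val next = changed ++ dependents
    if (next == changed) changed else invalidated(next)
  }
}
```

A change to Callee.scala invalidates both files, while a change to Caller.scala invalidates only Caller.scala.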
Update • First try:
    trait ExecutableComponent {
      …
      // update all providers of all inputs, then execute this component
      protected def update: Boolean =
        this.inputs.forall { i =>
          i.providers.forall { p => p.update }
        } && this.exec
      …
    }
• Wrong! If the graph contains a cycle, the algorithm will never terminate!
Update: Cycle Detection
    protected def update(visited: Set[Component],
                         cycles: Set[(Output, Input)]): (Boolean, Set[Component], Set[(Output, Input)]) = {
      var newVisited = visited + this // add this node to the visited set
      var newCycles = cycles
      val inputsUpdated = this.inputs forall { i =>
        i.providers forall { p =>
          if (newCycles contains ((p.output, i))) // ensures termination
            true
          else {
            if (visited contains p) // cycle detection
              newCycles = newCycles + ((p.output, i))
            val (updated, moreVisited, moreCycles) = p.update(newVisited, newCycles) // update providers
            newVisited = newVisited ++ moreVisited
            newCycles = newCycles ++ moreCycles
            updated
          }
        } // input providers are up-to-date
      } // inputs are up-to-date
      (inputsUpdated && this.exec, newVisited, newCycles) // update this component
    }
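Update: Redundancy • The update algorithm with cycle detection is not optimal: a component that is reachable through several inputs is updated once per path. • We need to remember which components were already updated. [Figure: component with two inputs, in0 and in1]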
Update: Efficient Version
    protected def update(visited: Set[Component],
                         cycles: Set[(Output, Input)],
                         updated: Set[Component]): (Boolean, Set[Component], Set[(Output, Input)], Set[Component]) = {
      if (updated contains this) // avoid redundant updates
        (true, visited, cycles, updated)
      else {
        var newVisited = visited + this // add this node to the visited set
        var newCycles = cycles
        var newUpdated = updated
        val inputsUpdated = this.inputs forall { i =>
          i.providers forall { p =>
            if (newCycles contains ((p.output, i))) // ensures termination
              true
            else {
              if (visited contains p) // cycle detection
                newCycles = newCycles + ((p.output, i))
              val (updated, moreVisited, moreCycles, moreUpdated) =
                p.update(newVisited, newCycles, newUpdated)
              newVisited = newVisited ++ moreVisited
              newCycles = newCycles ++ moreCycles
              newUpdated = newUpdated ++ moreUpdated
              updated
            }
          } // input providers are up-to-date
        } // inputs are up-to-date
        (inputsUpdated && this.exec, newVisited, newCycles, newUpdated + this) // update this component
      }
    }
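To experiment with this algorithm outside Scalab, it can be reduced to a self-contained toy. The sketch below is a simplification and not the thesis code: components are connected directly to their providers (no separate `Input`/`Output` gates), the three sets are bundled in a hypothetical `State` case class, and `exec` merely counts how often it runs.

```scala
object UpdateSketch {
  // Simplified component: providers feed it directly (no gates).
  final class Component(val name: String) {
    var providers: List[Component] = Nil
    var executions = 0
    def exec(): Boolean = { executions += 1; true }
  }

  type Edge = (Component, Component) // (provider, consumer)
  final case class State(visited: Set[Component], cycles: Set[Edge], updated: Set[Component])

  def update(c: Component, s: State): (Boolean, State) =
    if (s.updated contains c) (true, s) // avoid redundant updates
    else {
      val visitedOnEntry = s.visited
      var st = s.copy(visited = s.visited + c) // add this node to the visited set
      val providersOk = c.providers.forall { p =>
        val edge = (p, c)
        if (st.cycles contains edge) true // ensures termination
        else {
          if (visitedOnEntry contains p) // cycle detection
            st = st.copy(cycles = st.cycles + edge)
          val (ok, next) = update(p, st) // update the provider
          st = next
          ok
        }
      }
      // execute this component, then record it as updated
      (providersOk && c.exec(), st.copy(updated = st.updated + c))
    }

  val emptyState: State = State(Set.empty, Set.empty, Set.empty)
}
```

On a diamond-shaped graph (d fed by b and c, both fed by a), the shared provider a is executed once rather than once per path. On a two-component cycle the call still terminates.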
Sabbus: Preamble
    val scalaHome = "/home/pazeller/pdm/scala/"
    val scalaSrcs = scalaHome + "srcs/"
    val scalacSrcs = scalaSrcs + "compiler/"
    val scalaLib = scalaHome + "lib/"
    // sources
    val compilerSrcs = Universe -> Files(scalacSrcs) -> ListDirs() -> EndsWith(".scala") ->
      StartsWith(scalacSrcs + "scala/tools/ant/").complement
    val libSrcs = Universe -> Files(scalaSrcs + "library/") -> ListDirs() -> EndsWith(".scala")
    // old classes
    val oldLib = Files(scalaLib + "scala-library.jar")
    val bytecodeGen = Files(scalaLib + "fjbg.jar") // bytecode generator
    val oldScalac = Files(scalaLib + "scala-compiler.jar")
    Universe >> (oldLib, oldScalac, bytecodeGen)
Sabbus: Instantiating Compilers
    // building compilers
    val starr = DynamicScalac("starr")
    val starrLib = DynamicScalac("starrLib")
    val locker = DynamicScalac("locker")
    val lockerLib = DynamicScalac("lockerLib")
    val quick = DynamicScalac("quick")
    val quickLib = DynamicScalac("quickLib")
    // grouping
    val compilers = List(starr, locker, quick)
    val libCompilers = List(starrLib, lockerLib, quickLib)
    val allCompilers = compilers ::: libCompilers
Sabbus: Connecting Pipes
    // sources
    libSrcs >> (libCompilers map { c => c.src })
    compilerSrcs >> (compilers map { c => c.src })
    // loading classes
    bytecodeGen >> (allCompilers map { c => c.load })
    oldScalac >> (starrLib.load, starr.load)
    starr.runDirectory >> (lockerLib.load, locker.load)
    locker.runDirectory >> (quickLib.load, quick.load)
    // libraries
    oldLib >> (starrLib.load, starr.load)
    starrLib.runDirectory >> (starr.boot, lockerLib.load, locker.load)
    lockerLib.runDirectory >> (locker.boot, quickLib.load, quick.load)
    quickLib.runDirectory -> quick.boot
Sabbus: Concluding
    Stopwatch(allCompilers) // timing compiler executions
    val stability = SameContent("stability", locker, quick) // stability check
    val distr = Jar("jar", "scala-compiler.jar", stability)
    // setting cache
    val cache = new ConservativeCCP with TimestampCDP with LRUCEP
    cache.setCacheDirectory("/home/pazeller/shared/cache/")
    allCompilers foreach { c => c.setCache(cache) }
    // targets
    val buildDir = "./build/" // trailing slash so that the paths below are well-formed
    val newDistribution = Target(buildDir + "distr/", distr)
    val newLibrary = Target(buildDir + "library/", lockerLib)
    val newCompiler = Target(buildDir + "compiler/", locker)
    val default = newCompiler // default target
Further Work • Parallel Task Execution • Graphical Interface • Interactive Debug Mode • Filters Caching • Automatic Graph Dismantling • Extending Library
Thank you for your attention. • The project can be consulted at http://scalab.googlecode.com • Feel free to ask questions.