1 / 19

Data Processing Using Madagascar and SCons

Explore the seamless integration of SCons and Madagascar systems for data processing. Learn Python basics, build configurations, and execute data processing tasks efficiently. Discover parallel processing capabilities and improve your workflow.

beatriceh
Download Presentation

Data Processing Using Madagascar and SCons

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. http://www.scons.org Data Processing Using Madagascar and SCons Sergey Fomel July 21, 2011 Beijing, China

  2. Madagascar Pyramid Test Present Implement

  3. Madagascar Pyramid Examples Papers Programs

  4. Outline • Introduction to SCons • Introduction to Python • Using SCons for compilation • Madagascar processing with SCons • data processing with rsf.proj • Paper processing with rsf.tex • Book processing with rsf.book

  5. Evolution of Build Systems • Make (1977) • “Sendmail and make are two well known programs that are pretty widely regarded as originally being debugged into existence. That's why their command languages are so poorly thought out and difficult to learn. It's not just you - everyone finds them troublesome.” P. van der Linden • Cake (1987) • GNU Make (1988) • SCons (2000) • … Zoltan Somogyi Stuart Feldman

  6. What is SCons? • Build system (Software Construction) • Written in Python • Configuration (SConstruct files) are Python scripts • Built-in support for different languages • Dependency analysis • Parallel builds • Cross-platform • … Steven Knight

  7. What is Python? • Dynamic programming language • Clear, readable syntax • “friendly and easy to learn” • Full modularity • “batteries included” • Integrates with other languages • “plays well with others” • Free and open-source

  8. Who uses Python? • "Python has been an important part of Google since the beginning, and remains so as the system grows and evolves. Today dozens of Google engineers use Python, and we're looking for more people with skills in this language." Peter Norvig, Google • “Python is fun, free, runs on a broad range of platforms and has a large library of sophisticated modules, including numerical. It meets all our criteria for a first language.” John Scales and Hans Ecke, CSM

  9. Python in 5 Easy Steps • Variables and strings • Lists and dictionaries • For loop • If/else, indentation • Functions and modules

  10. 1. Variables and Strings >>> a=‘Beijing, China’ >>> a[0] ‘B’ >>>a[9:] ‘China’ >>>b = a[:7] + “ is awesome” >>>print b ‘Beijing is awesome’ >>>a+5 Traceback (most recent call last): File “<stdin>”, line 1, module <module> TypeError: cannot concatenate ‘str’ and ‘int’ objects >>>a+str(5) ‘Beijing, China5’

  11. 2. List and Dictionaries >>> a=[‘Beijing’, ‘China’] >>> a[0] ‘Beijing’ >>>len(a) 2 >>>a.append(5) >>>a [‘Beijing’,’China’,5] >>>b = (‘Beijing’,’China’) >>>b.append(5) Traceback (most recent call last): File “<stdin>”, line 1, in <module> AttributeError: ‘tuple’ has no attribute ‘append’ >>>c = {‘city’: ‘Beijing’, ‘number’: 5}

  12. 3. For loop >>> a=(‘Beijing’, ‘China’) >>> for word in a: … print word, len(word) Beijing 7 China 5 >>>for k in range(2): … print k, a[k] 0 Beijing 1 China >>>c = {‘city’: ‘Beijing’, ‘number’: 5} for key in c.keys(): … print c[key] ‘Beijing’ 5

  13. 4. If/else, indentation >>> for k in range(4): >>> if k < 2: … print k … else: … print ‘no’ 0 1 no no >>>try: … a = ‘Beijing’ + 5 except: … print ‘error’ error

  14. 5. Functions and modules >>> def add_5(a): … ‘Add 5 to input’ … return 5+a >>> a = add_5(3) >>>a 8 >>>def add_b(a,b=5): ‘Add b to a’ … return b+a >>>add_b(a) 13 >>>import math >>>math.sqrt(add_b(a,8)) 4

  15. Outline • Introduction to SCons • Introduction to Python • Using SCons for compilation • Madagascar processing with SCons • data processing with rsf.proj • Paper processing with rsf.tex • Book processing with rsf.book

  16. SCons for Program Compilation env=Environment() env.Program(“program”, “file.c”) scons –h scons -n scons -c scons –Q scons -j

  17. Madagascar Processing: rsf.proj • Fetch(‘filename’,’dirname’) • A rule for downloading files from a server • Flow(‘target’,’source’,’command’) • A rule for making target from source • Plot(‘target’,’source’,’command’) • Like Flow but generates a figure file • Result(‘target’,’source’,’command’) • Like Plot but generates a final result

  18. Parallel Processing with pscons • Figures out the number of CPUs and runs scons –j • Use split= and reduce= to split data for simple data-parallel processing

  19. Summary • SCons is a powerful and convenient build system • Configuration files are Python scripts • Adopted by Madagascar for data processing and reproducible documents

More Related