Reproducible Computational Experiments Using MADAGASCAR Software Package

Sergey Fomel
Bureau of Economic Geology, University of Texas at Austin

Applied Inverse Problems, Vancouver BC, June 29, 2007
http://rsf.sf.net/

Principles of Scientific Software
• Encapsulation
• File Formats
• Testing
• Reproducibility
• Maintenance
http://rsf.sourceforge.net/

Encapsulation
• Information hiding (Parnas, 1972)
• Separation of concerns (Dijkstra, 1974)
• Separate physics from mathematics in the linear problem A x = b
  • A is physics
  • Going from b to x is mathematics

Example: Velocity Transform

Physics of Velocity Transform

Encapsulation in Programming
• Separation of concerns
  • Classes or templates (C++)
  • Function pointers (C)
  • Function interfaces (Fortran-90)

    /* initialize velocity transform (A) */
    veltran_init (true, x0, dx, nx, s0, ds, nv, o1, d1, nt, s02, anti, psun1, psun2);

    /* least-squares minimization of |A x - b|^2, x=vscan, b=cmp */
    sf_solver (veltran_lop, sf_cgstep, ntv, ntx, vscan, cmp, niter,
               "err", error, "nmem", 0, "nfreq", miter, "mwt", mask, "end");

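The same separation can be sketched in Python: the solver sees the operator only as a callback, so the physics (the operator) can change without touching the mathematics (the solver). This is a minimal illustration under stated assumptions, not Madagascar's sf_solver. The names make_matrix_operator and cgls, the (adjoint, vector) calling convention, and the small test matrix are all hypothetical.

```python
import numpy as np


def make_matrix_operator(matrix):
    """Wrap a dense matrix A as operator(adjoint, vector) -> A v or A' v."""
    def operator(adjoint, vector):
        return matrix.T @ vector if adjoint else matrix @ vector
    return operator


def cgls(operator, data, model_size, niter):
    """Minimize |A x - b|^2 by conjugate gradients, using A only via operator()."""
    x = np.zeros(model_size)
    residual = np.array(data, dtype=float)   # b - A x, with x = 0 initially
    gradient = operator(True, residual)      # A' (b - A x)
    direction = gradient.copy()
    gamma = gradient @ gradient
    for _ in range(niter):
        if gamma == 0.0:                     # gradient vanished: at the minimum
            break
        step_image = operator(False, direction)
        alpha = gamma / (step_image @ step_image)
        x += alpha * direction
        residual -= alpha * step_image
        gradient = operator(True, residual)
        gamma_new = gradient @ gradient
        direction = gradient + (gamma_new / gamma) * direction
        gamma = gamma_new
    return x


# Hypothetical 3x2 "physics"; replacing it leaves cgls() untouched.
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
b = A @ np.array([1.0, -2.0])
x = cgls(make_matrix_operator(A), b, model_size=2, niter=2)
print(x)
```

Passing the operator as a plain function plays the role that the function pointer veltran_lop plays in the C call above.
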
Encapsulation in UNIX
• Write programs that do one thing and do it well.
• Write programs to work together.
• Write programs to handle text streams, because that is a universal interface.

Encapsulation in UNIX Shell

    bash$ sfveltran < cmp.rsf > vtran.rsf adj=y v0=1 dv=0.025 nv=60
    bash$ sfdottest sfveltran mod=vtran.rsf dat=cmp.rsf v0=1 dv=0.025 nv=60
    sfdottest: L[m]*d=21665.9
    sfdottest: L'[d]*m=21665.9
    bash$ sfdottest sfveltran mod=vtran.rsf dat=cmp.rsf v0=1 dv=0.025 nv=60
    sfdottest: L[m]*d=21906.2
    sfdottest: L'[d]*m=21906.2
    bash$ sfconjgrad sfveltran < cmp.rsf > vtran.rsf niter=3 v0=1 dv=0.025 nv=60
    sfconjgrad: iter 1 of 3
    sfconjgrad: grad=6.36797e+09
    sfconjgrad: iter 2 of 3
    sfconjgrad: grad=1.39068e+09
    sfconjgrad: iter 3 of 3
    sfconjgrad: grad=7.50257e+08

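The pairs of matching numbers in the sfdottest output reflect the dot-product test: for an operator L and its adjoint L', the products L[m]*d and L'[d]*m should agree for any vectors m and d. A minimal Python sketch of such a check follows. It does not reproduce sfdottest itself; the operator causal_integration, the function dot_product_test, and the choice of random test vectors are assumptions made for illustration.

```python
import numpy as np


def causal_integration(adjoint, vector):
    """Running sum (forward) or reversed running sum (its adjoint)."""
    if adjoint:
        return np.cumsum(vector[::-1])[::-1]
    return np.cumsum(vector)


def dot_product_test(operator, model_size, data_size, seed=0):
    """Return (L[m]*d, L'[d]*m) for random m and d; the two should agree."""
    rng = np.random.default_rng(seed)
    model = rng.standard_normal(model_size)
    data = rng.standard_normal(data_size)
    forward = operator(False, model) @ data   # L[m] * d
    adjoint = operator(True, data) @ model    # L'[d] * m
    return forward, adjoint


forward, adjoint = dot_product_test(causal_integration, 100, 100)
print(f"L[m]*d={forward:g}")
print(f"L'[d]*m={adjoint:g}")
```

A mismatch between the two numbers would indicate that the adjoint code is not the true adjoint of the forward code, which is worth knowing before handing the operator to an iterative solver.
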
The Art of UNIX Programming (Raymond, 2004)
• To design a perfect anti-Unix, make all file formats binary and opaque, and require heavyweight tools to read and edit them.
• If you feel an urge to design a complex binary file format, or a complex binary application protocol, it is generally wise to lie down until the feeling passes.

RSF (Regularly Sampled Format)
• SEPlib (Stanford Exploration Project)
• Data separated from text headers
• Conceptually N-dimensional hypercubes
• Multiple files for complex geometries
• Not application specific

Example text header (in= names the separate data file):

    n1=1000 n2=500 n3=100
    d1=0.001 d2=0.1 o2=1
    in="/path/data.rsf@"

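Because the header is plain key=value text, it can be read with a few lines of standard-library Python. The sketch below is only an illustration of that idea and not Madagascar's own reader; the functions parse_header and cube_shape are hypothetical, and the assumption is that axis lengths are stored under n1, n2, n3, and so on, as in the example header.

```python
import shlex


def parse_header(text):
    """Split whitespace-separated key=value tokens into a dict of strings."""
    params = {}
    for token in shlex.split(text):        # shlex strips the quotes around values
        key, separator, value = token.partition("=")
        if separator:                      # ignore tokens without "="
            params[key] = value
    return params


def cube_shape(params):
    """Collect n1, n2, ... into a tuple describing the hypercube dimensions."""
    shape = []
    axis = 1
    while f"n{axis}" in params:
        shape.append(int(params[f"n{axis}"]))
        axis += 1
    return tuple(shape)


header = 'n1=1000 in="/path/data.rsf@" n2=500 n3=100 d1=0.001 d2=0.1 o2=1'
params = parse_header(header)
print(cube_shape(params), params["in"])
```
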
Testing
• Test-driven development (Beck, 2003)
• YAGNI principle
  • Always implement things when you actually need them, never when you just foresee that you need them.
• In scientific software development, tests are computational experiments

Testing with SCons
• Software Construction (http://www.scons.org)
• Replacement for “make”
  • reliable and extensible dependency analysis
  • configuration files are Python scripts
  • cross-platform
  • open-source

SConstruct File

    # Load Madagascar's SCons rules (older releases used "from rsfproj import *")
    from rsf.proj import *

    # Mobil AVO CMP gather 807 at well4 location
    Fetch('cmp807_raw.HH','rad')

    # Preprocessing
    Flow('cmp','cmp807_raw.HH',
         'dd form=native | tpow tpow=2 | mutter half=n v0=1.3 tp=0.2')
    Plot('cmp','grey title="Input CMP Gather" ')

    # Velocity Transform
    Flow('veltran','cmp','veltran s02=0.25 v0=1.250 dv=0.025 nv=60 adj=y')
    Plot('veltran','grey title="Velocity Scan" ')

    # Display Side by Side
    Result('veltran','cmp veltran','SideBySideAniso')

    End()

Experimenting with SCons

    bash$ scons
    retrieve(["cmp807_raw.HH"], [])
    < cmp807_raw.HH sfdd form=native | sftpow tpow=2 | sfmutter half=n v0=1.3 tp=0.2 > cmp.rsf
    < cmp.rsf sfgrey title="Input CMP Gather" > cmp.vpl
    < cmp.rsf sfveltran s02=0.25 v0=1.250 dv=0.025 nv=60 adj=y > veltran.rsf
    < veltran.rsf sfgrey title="Velocity Scan" > veltran.vpl
    vppen yscale=2 vpstyle=n gridnum=2,1 cmp.vpl veltran.vpl > Fig/veltran.vpl
    bash$ sed s/Velocity/Slowness/ < SConstruct > SConstruct2
    bash$ mv SConstruct2 SConstruct
    bash$ scons
    < veltran.rsf sfgrey title="Slowness Scan" > veltran.vpl
    vppen yscale=2 vpstyle=n gridnum=2,1 cmp.vpl veltran.vpl > Fig/veltran.vpl

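In the second run above, only the plot whose command changed and the figure that depends on it are rebuilt. The toy Python sketch below illustrates one way such behavior can arise: each target stores a signature of its command and of its sources' contents, and is rebuilt only when that signature changes. It is not SCons's implementation; the class TinyBuild, its methods, and the trick of using the signature as a stand-in for the command's output are all assumptions made for this example.

```python
import hashlib


class TinyBuild:
    """Toy dependency tracker: rebuild a target only if its signature changed."""

    def __init__(self, raw_files):
        self.contents = dict(raw_files)   # file name -> content (bytes)
        self.rules = {}                   # target -> (sources, command)
        self.signatures = {}              # target -> signature of last build
        self.log = []                     # targets actually rebuilt

    def flow(self, target, sources, command):
        """Register (or redefine) the rule that produces a target."""
        self.rules[target] = (list(sources), command)

    def build(self, target):
        """Bring a target up to date, building its sources first."""
        if target not in self.rules:
            return                        # a raw input: nothing to do
        sources, command = self.rules[target]
        for source in sources:
            self.build(source)
        digest = hashlib.sha256(command.encode())
        for source in sources:
            digest.update(self.contents[source])
        signature = digest.hexdigest()
        if self.signatures.get(target) != signature:
            # Stand-in for running the command: the "output" is the signature.
            self.contents[target] = signature.encode()
            self.signatures[target] = signature
            self.log.append(target)


project = TinyBuild({"raw": b"seismic data"})
project.flow("cmp", ["raw"], "preprocess")
project.flow("veltran", ["cmp"], "veltran nv=60")
project.flow("plot", ["veltran"], 'grey title="Velocity Scan"')
project.build("plot")
print(project.log)
```

Redefining the plot rule with a new title and calling build again appends only the plot target to the log, since the upstream signatures are unchanged.
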
Reproducible Research at Stanford
• (Knuth, 1992)
  • A computer program should be written with human readability as a primary goal.
• (Claerbout and Karrenbach, 1992)
  • The purpose of reproducible research is to facilitate someone going a step further by changing something.
• (Buckheit and Donoho, 1995)
  • An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship.

Reproducible Experiments
• Within the world of science, computation is now rightly seen as a third vertex of a triangle complementing experiment and theory. However, as it is now often practiced, one can make a good case that computing is the last refuge of the scientific scoundrel […] Where else in science can one get away with publishing observations that are claimed to prove a theory or illustrate the success of a technique without having to give a careful description of the methods used, in sufficient detail that others can attempt to repeat the experiment? (LeVeque, 2006)

Maintenance
• Computational experiments that are not continuously maintained lose reproducibility.
• Regression testing (Brooks, 1975)
• Contribute computational software and experiments to a community-maintained repository to enable research productivity.

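One simple form of regression testing for a computational experiment is to compare a freshly computed result against a stored reference within a tolerance. The Python sketch below shows the idea under stated assumptions; the function regression_check, the relative-norm criterion, and the default tolerance are hypothetical choices, not a Madagascar facility.

```python
import numpy as np


def regression_check(result, reference, rtol=1e-6):
    """Return True if result matches reference to within a relative L2 tolerance."""
    result = np.asarray(result, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if result.shape != reference.shape:
        return False                      # a change of shape is always a regression
    scale = np.linalg.norm(reference)
    difference = np.linalg.norm(result - reference)
    if scale == 0.0:
        return bool(difference == 0.0)    # all-zero reference: require exact match
    return bool(difference <= rtol * scale)


stored_reference = [1.0, 2.0, 3.0]        # e.g. saved from an earlier, trusted run
new_result = [1.0, 2.0, 3.0 + 1e-9]       # e.g. recomputed after a code change
print(regression_check(new_result, stored_reference))
```
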
Open Science

Conclusions
• Principles of Scientific Software
  • Encapsulation
  • File Formats
  • Testing
  • Reproducibility
  • Maintenance
• Madagascar software package
  • Open source, open community, open science
http://rsf.sf.net/