Python for Science. Shane Grigsby. What is python? Why python?. Interpreted, object oriented language Free and open source Focus is on readability Fast to write and debug code Large community Lots of documentation Lots of packages… General purpose language. The python scientific stack:.
Python for Science Shane Grigsby
What is python? Why python? • Interpreted, object oriented language • Free and open source • Focus is on readability • Fast to write and debug code • Large community • Lots of documentation • Lots of packages… • General purpose language
Python: Fast to write, slow to run? • It depends on how you use it: if something is slow, there is probably a faster way to do it! • Are you using a numeric library? • Are you slicing through arrays, or looping over lists? • Is your code vectorized? • NumPy runs array operations in compiled C code, and calls compiled BLAS/LAPACK libraries (Fortran or C) for linear algebra • Many other libraries call C for operations… • …or have functions written in both C and python, e.g., scipy.spatial.KDTree vs. scipy.spatial.cKDTree
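A minimal sketch of the "looping vs. vectorized" question above, assuming NumPy is installed. The array `values` and both function names are made up for illustration; they are not from the tutorial notebooks.

```python
import numpy as np

# Hypothetical sample data: 100,000 evenly spaced values between 0 and 1.
values = np.linspace(0.0, 1.0, 100000)


def loop_sum_of_squares(data):
    """Sum of squares with an explicit Python loop (one element at a time)."""
    total = 0.0
    for x in data:
        total += x * x
    return total


def vectorized_sum_of_squares(data):
    """Same quantity as a single array expression evaluated in compiled code."""
    return float(np.sum(data * data))


slow = loop_sum_of_squares(values)
fast = vectorized_sum_of_squares(values)

# Both approaches agree to within floating-point rounding.
print(np.isclose(slow, fast))
```

Timing the two functions (for example with `%timeit` in iPython) is a quick way to see how much the loop costs on your own machine.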
How is python different from MATLAB? • Indexing starts at 0 • Blocks are delimited by white space (indentation), not by ‘end’ • Default behavior is element-by-element when dealing with arrays • Functions and tuples use ()’s, indexes and lists use []’s, dictionaries use {}’s • You don’t have to use a ‘;’ on every command • Object oriented See also: http://mathesaurus.sourceforge.net/matlab-python-xref.pdf
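The bracket and indexing rules above can be sketched with plain python objects. All names and values here (`samples`, `point`, `site`, `double`) are hypothetical and only serve the illustration.

```python
# Lists are written with []'s and indexed with []'s.
samples = [10, 20, 30, 40]

first = samples[0]       # indexing starts at 0, so this is the first item
last = samples[-1]       # negative indexes count from the end
middle = samples[1:3]    # slices exclude the end index

# Tuples are written with ()'s.
point = (3.5, 7.25)

# Dictionaries are written with {}'s and looked up by key with []'s.
site = {"name": "plot_a", "elevation_m": 120}
elevation = site["elevation_m"]


# Functions are called with ()'s; the body is set off by indentation only.
def double(x):
    return 2 * x


doubled = double(first)
```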
Today’s tutorial • Intro to the scientific stack • Importing modules • Intro to the development environments: • Spyder • iPython • Indexing • Defining functions • Plotting and graphics • Intro to data structures • Basic looping (maybe…) • Additional pandas • data import from clipboard • time series (on your own) Notebooks: • SARP python tutorial • Prism Data • Regression loops Files: • Leaf_Angles.csv • good_12_leafangle.h5 • monthly.nc (optional) • *_spec.txt
Terminals and Prompts • We’ll use python and python tools from three different ‘prompts’: • The system prompt (i.e., cmd) • From spyder • From the iPython Notebook • Note that these will all run the same version of python, but with slightly different behaviors
Imports • Basic python is sparse… • …but we can import!
import tables
import numpy as np
from pylab import *
from numpy import ones, array
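The three import styles on this slide behave the same way for any module. Here is a small sketch using the standard library `math` module as a stand-in, so it runs without extra packages; the variable names are illustrative only.

```python
import math                 # whole module; names are reached as math.<name>
import math as m            # same module under a shorter alias
from math import pi, sqrt   # pull specific names into the current namespace

radius = 2.0

# All three spellings refer to the same constant.
area_1 = math.pi * radius ** 2
area_2 = m.pi * radius ** 2
area_3 = pi * radius ** 2

# A name imported with "from ... import" is used without a prefix.
hypotenuse = sqrt(3.0 ** 2 + 4.0 ** 2)
```

The aliased form (`import numpy as np`) is usually the easiest to read later, because every call shows which package it came from.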
Example time: Using the iPython notebook • Notes: • Commands that start with ‘%’ (“magics”) are specific to iPython and its notebooks; they won’t work in the plain python interpreter or at your system prompt • We’ll use: %pylab inline • Inline plotting doesn’t work for 3D or interactive plots (yet) • Use spyder or ipython (without the notebook) to access interactive graphics.
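Imports • Python searches the directories listed in sys.path, in order, and uses the first match • The directory of the running script (or the current directory, in an interactive session) is searched first • The python install directories come later, e.g., lib/python2.7/site-packages • A local file with the same name as an installed module will therefore shadow it • Conflicting imported names are replaced by the last import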
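A short sketch of the "last import wins" rule, using two standard library modules that both define `sqrt`. It also copies `sys.path`, the list of directories python searches for modules, so you can inspect the order on your own machine. The variable names are made up for this example.

```python
import sys

from math import sqrt       # sqrt now refers to math.sqrt (real numbers only)
real_root = sqrt(4.0)

from cmath import sqrt      # same name: this import replaces the earlier one
complex_root = sqrt(-4.0)   # cmath.sqrt accepts negative input

# The module search path, in the order python tries it.
search_path = list(sys.path)

print(real_root, complex_root)
```

This is one reason `from module import *` can surprise you: a later star import can silently replace names that an earlier one defined.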
Defining a function
from scipy.constants import *   # physical constants: h, c, k, ...
from numpy import exp

def Xwave(wavelength, temp, unit=1):
    X_wave = (h*c)/(k*(wavelength*unit)*temp)
    return X_wave

def Lwave(wavelength, temp, unit=1):
    """Calculates L given wavelength and Temp
    To get M, multiply by pi
    Note that units are: W * m**-2 * sr**-1 * m**-1
    I.e., units are given in meter of spectrum
    multiply by nm to get: W * m**-2 * sr**-1 * nm**-1"""
    X_funct = Xwave(wavelength, temp, unit)
    L = (2*h*(c**2))/(((wavelength*unit)**5)*(exp(X_funct)-1))
    return L

• Multiline comment (a docstring, in triple quotes) • Keyword arguments • can use default values • Definition syntax • return is optional • Constants defined at the top of the script • Top line brings in physical constants, so we don’t have to define them ourselves…
ang = 1E-10
nm = 1E-9
um = 1E-6
Cm = 1E-2
hH = 1.0
kH = 1E3
mH = 1E6
gH = 1E9
tH = 1E12
Defining Functions • Functions are defined using ‘def’, the function name, ‘()’s enclosing the parameters, and an ending ‘:’ • The function body is demarcated using white space (as in for loops) • Functions are called using the function name, ‘()’s, and input arguments • Note that the names of the variables you pass in don’t have to match the parameter names in the definition…
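The points above can be tried with a self-contained sketch. It is a simplified stand-in for the Planck function on the previous slide, not the tutorial's own code: the constants are typed in by hand (SI values) so it needs only the standard library, and the names `planck_radiance`, `peak_wl`, and `surface_temp` are hypothetical.

```python
import math

# Physical constants in SI units, entered by hand for this sketch.
H = 6.62607015e-34   # Planck constant, J s
C = 299792458.0      # speed of light, m / s
K = 1.380649e-23     # Boltzmann constant, J / K

nm = 1e-9            # unit factor: nanometers to meters


def planck_radiance(wavelength, temp, unit=1.0):
    """Spectral radiance in W * m**-2 * sr**-1 * m**-1.

    'unit' converts 'wavelength' to meters; it defaults to 1.0 (meters).
    """
    lam = wavelength * unit
    x = (H * C) / (K * lam * temp)
    return (2.0 * H * C ** 2) / (lam ** 5 * (math.exp(x) - 1.0))


# Positional call, relying on the default value of 'unit'.
in_meters = planck_radiance(500e-9, 5800.0)

# Keyword call, overriding the default.
in_nanometers = planck_radiance(500.0, 5800.0, unit=nm)

# The caller's variable names need not match the parameter names.
peak_wl, surface_temp = 500.0, 5800.0
renamed = planck_radiance(peak_wl, surface_temp, nm)
```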
Looping in python
my_list = ['a', 'b', 'c']
for i in range(len(my_list)):   # NO…
    print(my_list[i])           # executes, but is not idiomatic
for item in my_list:            # YES
    print(item)
• ‘item’ is a variable name; it is not declared in advance; it is arbitrary (i.e., could be ‘i’, ‘point’, or some other label). There could also be more than one variable here (see next slide)… • ‘for’ and ‘in’ are syntactically required; they bracket our variables. • ‘my_list’ could be another data structure (dictionary, array, etc.), or could be a function in advanced use. • Note: else and elif are not required, but can be used • Note: the white space is required; the loop body must be indented consistently (four spaces is the convention) • Note: we don’t need to know how many items are in the data structure
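A sketch of the looping patterns described above, using made-up data (`species`, `heights_m`): looping with more than one variable, using `if` / `elif` / `else` inside the body, and looping over a dictionary rather than a list.

```python
# Hypothetical sample data for illustration only.
species = ["oak", "pine", "fir"]
heights_m = [12.5, 18.0, 4.0]

# Two loop variables: enumerate yields (index, item) pairs.
labels = []
for i, name in enumerate(species):
    labels.append("%d:%s" % (i, name))

# zip walks two lists in step; if / elif / else are optional in the body.
size_class = {}
for name, height in zip(species, heights_m):
    if height > 15.0:
        size_class[name] = "tall"
    elif height > 5.0:
        size_class[name] = "medium"
    else:
        size_class[name] = "short"

# Looping over a dictionary: items() yields (key, value) pairs.
report = []
for name, label in sorted(size_class.items()):
    report.append(name + " is " + label)

print(labels)
print(report)
```

None of these loops needs to know the length of the data in advance.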
A more advanced loop:
from liblas import file
import numpy as np
f = file.File('/Users/grigsbye/Downloads/Alameda_park_trees_pts.las', mode='r')
treeData = np.ones((len(f), 3))
for i, p in enumerate(f):
    treeData[i,0], treeData[i,1], treeData[i,2] = p.x, p.y, p.z
• First line imports a special module to read .las files • Third line opens a .las file as a python object we can loop over • Fourth line creates a numpy array to hold our data • ‘enumerate’ returns an index number along with each point that we are looping over • The last line assigns the x, y, and z values to our array; the index increases by one with each loop • For a more complete guide to looping, see: http://nedbatchelder.com/text/iter.html
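If you don't have liblas or a .las file handy, the same pattern can be tried with stand-in data. In this sketch the `Point` tuple and the `points` list are hypothetical substitutes for the objects a .las reader would yield, and a list of lists stands in for the numpy array.

```python
from collections import namedtuple

# Hypothetical stand-in for the point objects yielded by a .las reader.
Point = namedtuple("Point", ["x", "y", "z"])
points = [Point(1.0, 2.0, 3.0), Point(4.0, 5.0, 6.0), Point(7.0, 8.0, 9.0)]

# Pre-allocate one row of three values per point (filled with ones).
tree_data = [[1.0, 1.0, 1.0] for _ in points]

# enumerate supplies the row index i alongside each point p.
for i, p in enumerate(points):
    tree_data[i][0], tree_data[i][1], tree_data[i][2] = p.x, p.y, p.z

# The same result built in one step, without pre-allocating.
tree_data_short = [[p.x, p.y, p.z] for p in points]
```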