280 likes | 288 Views
Lecture 8. Some further unix & python tips Good programming habits Catalog cross-matching Bayesian variability test. More Unix tips and tricks: redirection. How to redirect your output: If I do I get etc. But if I do I apparently don’t get anything. But! Look inside file99: etc. $ ls.
E N D
Lecture 8 • Some further unix & python tips • Good programming habits • Catalog cross-matching • Bayesian variability test
More Unix tips and tricks: redirection. How to redirect your output: • If I do • I get • etc. But if I do • I apparently don’t get anything. But! Look inside file99: • etc. $ ls benne.html Desktop Examples pic26867.jpg $ ls > file99 $ more file99 benne.html Desktop
More Unix tips and tricks: redirection. • The > character redirects the output of ls (or any other command) from the screen to the named file. • Careful! It will overwrite any pre-existing file of that name. • If you just want to add stuff to the end of an existing file, use >> instead of >. • To redirect errors, you have to be more elaborate. But we won’t worry about that. • If you want to send the output to another command, use | (called a ‘pipe’). Eg a2ps | lp -dljuct
Dots in Unix. A dot . in Unix has several different functions, depending on context. • Here it is just part of a filename: • Here is the first character in the name of a ‘hidden’ file (ls won’t list these – but ls –a will): • Here it means run the script called fred: • Here it means the present working directory: bignob.py .bashrc $ . fred Note the space! $ export PATH=/usr/bin:. $ ls ./*
Double dots in Unix. Two dots in a row means the directory 1 higher up the tree from here. It can save you some typing. Eg the following: • If I want to cd back to comp_astron, there’s two ways to do it. The long way: • Or the short way: $ pwd /home/ims/doc/teaching/comp_astron $ ls code latex figures $ cd code $ pwd /home/ims/doc/teaching/comp_astron/code $ cd /home/ims/doc/teaching/comp_astron $ cd ../
The tilde ~ • This is another symbol which has a meaning which depends on context. • Here it means the file is a backup. The shell makes these under certain conditions. They record the file before your last change. Obviously this can be useful! • Here it is shorthand for your home directory. • As an aside, cd with no arguments always takes you direct to home: some_file_name~ $ cd ~/doc/teaching $ cd
Another few cm of the python. • The range function – it is a bit tricky. Note the following: • Function returns. Note: • This is a tuple. It is better to be explicit... >>> print range(3) [0, 1, 2] >>> print range(0,3) [0, 1, 2] >>> def some_func(arg1,arg2): ... a = arg1 + arg2 ... b = arg1 – arg2 ... return a, b ... >>> c = some_func(42, 2) >>> print c (44, 40)
Another few cm of the python. • Also in the function call: • is better than • You can also force it to return a list: >>> def some_func(arg1,arg2): ... a = arg1 + arg2 ... b = arg1 – arg2 ... return (a, b) >>> (summ, diff) = some_func(42, 2) >>> c = some_func(42, 2) >>> def anuther_func(arg1,arg2): ... a = arg1 + arg2 ... b = arg1 – arg2 ... return [a, b] ... >>> c = anuther_func(20, 19) >>> print c [39, 1]
Another few cm of the python. • The for statement - most of you know this way: • But you can also do this: • Saves a bit of typing. >>> mylist = [‘peas’,’beans’,’dead dog’] >>> for i in range(len(mylist)): ... item = mylist[i] ... print item ... peas beans dead dog >>> mylist = [‘peas’,’beans’,’dead dog’] >>> for item in mylist: ... print item ... peas beans dead dog
Good vs.EVIL programming habits. • The good: • Make your code as unspecialized as possible. • Chunk tasks into functions and modules. • The bad: • GOTO – python has abolished this statement. • Equality testing against reals. Eg • Only do this against zero. • Leaving some values untested, eg: if (myvar == 7.2): if (nuddervar < 5.9): # do something elif (nuddervar > 5.9): # do something else
Good vs.EVIL programming habits. • The bad continued: • Changing function arguments. Eg • This is actually only ugly...BAD is doing it with lists. Try the following: • Using reserved words for variable names. • The ugly: • Too-short variable names. def hobbit_state(bilbo, sam): sam = ‘hungry’ >>> def moonshine(list_of_moons): ... list_of_moons.append(‘half’) ... >>> mondlicht = [‘new’,’blue’] >>> moonshine(mondlicht) >>> print mondlicht
Catalog cross-matching • It sometimes happens that you have 2 lists of objects, which you want to cross-match. • Maybe the lists are sources observed at different frequencies. • The situation also arises in simulations. • I’ll deal with the simulations situation first, because it is easier. • So: we start with a bunch of simulated sources. Let’s keep it simple and assume they all have the same brightness. • We add noise, then see how many we can find.
Catalog cross-matching • In order to know how well our source-detection machinery is working, we need to match each detection with one of the input sources. • How do we do this? • How do we know the ‘matched’ source is the ‘right one’? CAVEAT: ...I haven’t done a rigorous search of the literature yet – these are just my own ideas.
Catalog cross-matching Black: simulated sources Red: 1 of many detections (with 68% confidence interval). This case seems clear.
Catalog cross-matching But what about these cases? No matches inside confidence interval. Too many matches inside confidence interval.
Catalog cross-matching Or these? Is any a good match? Which is ‘nearest’?
Catalog cross-matching • My conclusion: • The shape of the confidence intervals affects which source is ‘nearest’. • The size of the confidence intervals has nothing to do with the probability that the ‘nearest’ match is non-random. • ‘Nearest neighbour’ turns out to be a slipperier concept than we at first think. To see this, imagine that we have now 1 spatial dimension and 1 flux dimension:
Catalog cross-matching Which is the best match??? Source 5? Or source 8? This makes more sense. Let’s then define r as: 9 8 7 6 5 4 S S 9 8 2 3 6 7 5 1 4 2 3 1 1 1 x x
Catalog cross-matching • As for the probability... well, what is the null hypothesis in this case? • Answer: that the two catalogs have no relation to each other. • So, we want the probability that, with a random distribution of the simulated sources, a source would lie as close or closer to the detected source than rnearest. • This is given by: • where ρ is the expected density of sim sources and V is the volume inside rnearest. Pnull = 1 – exp(-ρV)
Catalog cross-matching • So the procedure for matching to a simulated catalog is: • For each detection, find the input source for which r is smallest. • Calculate the probability of the null hypothesis from Pnull = 1 – exp(-ρV). • Discard those sources for which Pnull is greater than a pre-decided (low) cutoff. • What about the general situation of matching between different catalogs?
Catalog cross-matching ASCA data – M Akiyama et al (2003)
Bayesian variability analysis J D Scargle (1998)