HaploReg, RegulomeDB and more on Python programming
Lin Liu, Yang Li
HaploReg retrieves the ENCODE annotation for the selected SNP, as well as for other SNPs in LD with it
• Using the “Set Options” tab, the user can configure values such as the LD threshold and the 1000 Genomes population used to calculate LD
Python programming wrap-up
• if / else
• for and while loops
• Indexing starts from 0, unlike R (which starts from 1)
• Four important data structures:
  • list: a = [1, 2, 3, 4]; a.append(5)
  • tuple: a = ('cat', 'dog'); tuples are immutable, so swap by building a new tuple: a = (a[1], a[0])
  • dictionary: a = {'chr1': {10254: 'G', 13257: 'T'}}; a.keys()
  • set:
    • Python 3 has a built-in set type; "from sets import Set" only works in Python 2
    • species = {'hs', 'mm', 'chimp'}
    • zoos = {'mm', 'wolf', 'chimp'}
    • zoos | species (union)
    • zoos & species (intersection)
    • zoos - species (difference)
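The four data structures listed above can be tried out in a few lines. This is a minimal sketch assuming Python 3 (built-in `set`); variable names such as `pets` and `snps` are illustrative and not from the slides.

```python
# List: mutable, ordered, indexed from 0
a = [1, 2, 3, 4]
a.append(5)
first = a[0]  # the first element sits at index 0

# Tuple: immutable, so a "swap" builds a new tuple instead of assigning to a[0]
pets = ('cat', 'dog')
pets = (pets[1], pets[0])

# Dictionary: nested mapping of chromosome -> position -> allele
snps = {'chr1': {10254: 'G', 13257: 'T'}}
chroms = list(snps.keys())
allele = snps['chr1'][10254]

# Sets: union, intersection, difference
species = {'hs', 'mm', 'chimp'}
zoos = {'mm', 'wolf', 'chimp'}
union = zoos | species
common = zoos & species
only_zoos = zoos - species

print(a, pets, chroms, allele)
print(sorted(union), sorted(common), sorted(only_zoos))
```

Sets are unordered, so the sketch sorts them before printing to get a stable display.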
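Some tricky facts:
• Assignment, shallow copy and deep copy
  • Assignment does not copy: a = [1, 2, 3]; b = a; b[2] = 4; print(a) prints [1, 2, 4], because a and b are the same list
  • Shallow copy: b = a[:] or copy.copy(a) creates a new outer list, but nested objects are still shared
  • Deep copy:
    • from copy import deepcopy
    • a = [1, 2, 3]; b = deepcopy(a); b[2] = 4; print(a) prints [1, 2, 3]
• List comprehension:
  • As in R, explicit loops are slow; a comprehension is usually faster and more concise
  • a = [1, 2, 3]; a = [b + 1 for b in a]; print(a) prints [2, 3, 4]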
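The difference between plain assignment, a shallow copy and a deep copy only becomes visible with nested data. The sketch below uses a nested list (an assumption for illustration; the slide uses a flat list) together with `copy.copy` and `copy.deepcopy` from the standard library.

```python
from copy import copy, deepcopy

nested = [[1, 2], [3, 4]]

alias = nested            # same object: no copy is made
shallow = copy(nested)    # new outer list, inner lists are shared
deep = deepcopy(nested)   # fully independent copy

nested[0][0] = 99         # mutate an inner list
nested.append([5, 6])     # mutate the outer list

print(alias)    # [[99, 2], [3, 4], [5, 6]]  sees both changes
print(shallow)  # [[99, 2], [3, 4]]          sees only the inner change
print(deep)     # [[1, 2], [3, 4]]           sees neither change

# List comprehension from the slide
a = [1, 2, 3]
a = [b + 1 for b in a]
print(a)        # [2, 3, 4]
```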
How to read BAM (binary) files in Python?
• import pybedtools
How to perform numerical computation in Python?
• import numpy as np
• Provides array and matrix calculations; very useful
How to do shell-style tasks in Python?
• Get all files in a folder:
  • import os
  • os.listdir("yourdirectory")
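As a small illustration of the `os.listdir` call above, the following sketch lists the files in a folder and filters them by extension. The helper `list_files` and the file names are hypothetical; a temporary directory stands in for "yourdirectory" so the example is self-contained.

```python
import os
import tempfile


def list_files(directory, suffix=""):
    """Return the sorted names of regular files in `directory` ending with `suffix`."""
    return sorted(
        name
        for name in os.listdir(directory)
        if name.endswith(suffix)
        and os.path.isfile(os.path.join(directory, name))
    )


# Build a throwaway folder with a few empty files to list
with tempfile.TemporaryDirectory() as tmp:
    for name in ("sample1.bam", "sample2.bam", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    bam_files = list_files(tmp, ".bam")
    all_files = list_files(tmp)

print(bam_files)
print(all_files)
```

`os.listdir` returns names in arbitrary order and without the directory prefix, which is why the sketch sorts the result and joins paths explicitly.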
Object-oriented programming
• Classes and objects in Python

class HMM:
    # constructor
    # transition_probs[i, j] is the probability of transitioning to state i from state j
    # emission_probs[i, j] is the probability of emitting emission j while in state i
    # both arguments are expected to be 2-D NumPy arrays
    def __init__(self, transition_probs, emission_probs):
        self._transition_probs = transition_probs
        self._emission_probs = emission_probs

    # accessors
    def emission_dist(self, emission):
        # probability of this emission in each state (one column of the matrix)
        return self._emission_probs[:, emission]

    @property
    def num_states(self):
        return self._transition_probs.shape[0]

    @property
    def transition_probs(self):
        return self._transition_probs
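The class above relies on NumPy-style indexing (`[:, emission]` and `.shape`). To show how such an object is created and used without any extra dependency, here is a reduced variant backed by nested lists. The name `ListHMM` and the probability values are made up for illustration and are not part of the original slides.

```python
class ListHMM:
    """List-backed variant of the HMM class, for illustration only."""

    def __init__(self, transition_probs, emission_probs):
        # transition_probs[i][j]: probability of moving to state i from state j
        # emission_probs[i][j]: probability of emitting j while in state i
        self._transition_probs = transition_probs
        self._emission_probs = emission_probs

    def emission_dist(self, emission):
        # Column `emission`: probability of that emission in each state
        return [row[emission] for row in self._emission_probs]

    @property
    def num_states(self):
        return len(self._transition_probs)


hmm = ListHMM(
    transition_probs=[[0.9, 0.2], [0.1, 0.8]],
    emission_probs=[[0.7, 0.3], [0.1, 0.9]],
)

n = hmm.num_states           # a property is read without parentheses
dist = hmm.emission_dist(1)  # a method is called with its argument

# A property without a setter is read-only
try:
    hmm.num_states = 3
    read_only = False
except AttributeError:
    read_only = True

print(n, dist, read_only)
```

The leading underscore in `_transition_probs` is only a naming convention that marks the attribute as internal; the `@property` accessors are what give callers a read-only view.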
Interfacing with other programming languages
• Rpy: R and Python interface
• Cython: Python and C interface
When to use Python?
• Text manipulation
• Simple machine learning implementations (much as you would use MATLAB)
• Well-written packages are available: PyStan (Bayesian MCMC sampler), matplotlib, pybedtools, etc.
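Text manipulation is where Python is most convenient. As a sketch, the function below parses tab-delimited SNP records into the nested dictionary layout shown earlier (chromosome, then position, then allele). The three-column input format and the name `parse_snps` are assumptions made for this example, not a standard file format.

```python
def parse_snps(lines):
    """Build {chrom: {position: allele}} from tab-delimited text lines."""
    snps = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and header/comment lines
        if not line or line.startswith("#"):
            continue
        chrom, pos, allele = line.split("\t")[:3]
        # setdefault creates the inner dictionary the first time a chromosome is seen
        snps.setdefault(chrom, {})[int(pos)] = allele
    return snps


example = [
    "#chrom\tpos\tallele",
    "chr1\t10254\tG",
    "chr1\t13257\tT",
    "",
    "chr2\t500\tA",
]
snps = parse_snps(example)
print(snps)
```

Because `parse_snps` only iterates over lines, an open file handle can be passed in place of the list.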
When not to use Python:
• Large-scale simulation: most often you cannot get rid of loops, and Python loops are slow
• Statistical analysis: R is much better and its packages are well curated
• Best strategy: write the performance-critical code in C and call it from Python
Some good reference code for Python
• Check the MACS14 Python scripts
• From MACS14 you can learn how to turn a Python script into an executable command-line program
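For orientation, a command-line script is commonly organized as an argument parser, a function that does the work, and a `main` entry point. The sketch below follows that pattern with the standard-library `argparse` module. It is not taken from MACS14, and the line-counting task and all names in it are hypothetical.

```python
#!/usr/bin/env python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Count non-empty lines in text files."
    )
    parser.add_argument("files", nargs="+", help="input text files")
    parser.add_argument(
        "--skip-comments",
        action="store_true",
        help="ignore lines starting with '#'",
    )
    return parser


def count_lines(lines, skip_comments=False):
    """Count non-empty lines, optionally ignoring '#' comment lines."""
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if skip_comments and line.startswith("#"):
            continue
        total += 1
    return total


def main(argv=None):
    # argv=None makes argparse read the real command line (sys.argv[1:])
    args = build_parser().parse_args(argv)
    for path in args.files:
        with open(path) as handle:
            print(f"{path}\t{count_lines(handle, args.skip_comments)}")
    return 0


# When saved as a script, the file would end with the usual guard,
# shown commented out so the sketch can also be pasted into a session:
# if __name__ == "__main__":
#     import sys
#     sys.exit(main())
```

Keeping the logic in `count_lines` and the argument handling in `main` means the same file can be run as a program or imported as a module.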