1 / 7

Python Lab Pandas Library - II

Python Lab Pandas Library - II. Proteomics Informatics, Spring 2014 Week 8 18 th Mar, 2014 Himanshu.Grover@nyumc.org. From last w eek…. Today. Differential Gene Expression example leukemia dataset ( Golub et. a l. ) http://www.biolab.si/supp/bi-cancer/projections /. HW-3a.

denali
Download Presentation

Python Lab Pandas Library - II

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Python LabPandas Library - II Proteomics Informatics, Spring 2014 Week 8 18th Mar, 2014 Himanshu.Grover@nyumc.org

  2. From last week…

  3. Today • Differential Gene Expression example • leukemia dataset (Golub et. al. ) • http://www.biolab.si/supp/bi-cancer/projections/

  4. HW-3a • For top-5 differentially expressed genes (according to p-value) • compute pairwise correlation matrix (it will actually be a data frame) • Check out the documentation of <DataFrame>.corr() • (exprDF.corr? in our example) • Hint: • Use exprDF.ix[:, <columns>] to get dataframe with only top-5 genes • Apply corr to this • Save it to another variable • Write it to a file: • Check out <DataFrame>.to_csv function

  5. HW-3b • For top-5 genes output summary statistics to a file • Hint: • Check out <DataFrame>.describe function • This outputs a dataframe. Save it to a variable and then to a file • Similar ideas from previous part

  6. HW-3c • Add a new function for preprocessing all gene expression values before applying t-test • For every gene (x), perform: • (x-mean(x))/std(x, ddof=1) • Return this value • Use apply() as discussed in the case of applying t-test • Note that this is similar to the “perform_tTest_two()” case, since this operation on each column returns back the whole column (Series) of values back.

  7. References • Book: Python for Data Analysis

More Related