Understand the concept of problem formulation in bioinformatics projects through examples involving microarray data analysis. Explore how to frame an intriguing question to engage readers and editors effectively.
Project Work: Problem formulation
What is a problem formulation?
A short description of an objective, typically formulated as a question.
The problem formulation is distinct from the frames of reference and the outline.
Problem formulation: what do we mean?
A well-defined 'question' that can be addressed in a project (6-7 days).
It must involve microarray data in some way.
It must be 'exciting'.
Exciting?!!! That's very subjective
True, and we will be the judges of that.
Why: we got too bored reading boring project reports about differentially expressed genes ... and so have most editors of scientific journals.
An example from real life Dear Dr Nielsen Thank you very much for submitting your manuscript ”xxx" for publication in Nucleic Acids Research. However, it has become the policy not to publish manuscripts that focuses on newly discovered differential genes … I am sorry that my decision cannot be more positive at this time Sincerely XXXX XXXXXXX
Questions that are general are generally more interesting
Two simple cases:
• What kind of genes are differentially expressed under condition X?
• Do two or more different conditions, e.g. diseases, result in a similar transcriptional response?
- Can we predict the outcome of a disease/treatment based on transcriptional profiles?
Technical questions may also be interesting
Examples:
• Which statistical method is best?
• Do mismatch probes help?
Biological questions are nice
Examples:
• Are proteins primarily produced when needed during the cell cycle?
• Is gene X epistatic to gene Y?
• Are the genes influenced by factor X evenly distributed across the chromosomes?
Example from last year
Problem formulation: The treatment of paediatric T-cell leukaemia is largely successful, yet a subgroup of patients experience relapse after treatment. We want to determine if there are any similarities between gene expression of the recovered/relapsed patients and, based on this, to build a classifier to predict the prognosis of newly diagnosed patients. Furthermore, we will benchmark this classifier against a classifier built on a feature list containing genes involved in drug resistance.
Example II from last year
Problem formulation: Are the targets of human miRNAs co-regulated within different tissue types? Furthermore, are the target genes functionally related?
Example III from last year
Problem formulation: Are lung expression profiles reversed when one stops smoking?
Outline: The overall goal of the project will be to build a classifier based on the dataset available for the 34 current smokers and the 23 never smokers. This prediction model will be used to classify the 18 former smokers accordingly. Clinical data will be included as necessary in order to characterize biological differences.
Data from: Spira A, et al. (2004) Effects of cigarette smoke on the human airway epithelial cell transcriptome. PNAS 101(27):10143-8.
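To make the outline in Example III concrete, here is a minimal sketch of the train-then-classify idea. It is not the method used in that project: the nearest-centroid rule, the helper names (`centroid`, `nearest_centroid`, `simulate`), the number of genes and the expression values are all assumptions made for illustration, and the data are randomly simulated rather than taken from Spira et al. Only the group sizes (34, 23, 18) come from the outline.

```python
import random
import statistics

def centroid(samples):
    """Per-gene mean across a list of expression profiles."""
    return [statistics.fmean(gene) for gene in zip(*samples)]

def nearest_centroid(profile, centroids):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist(label):
        return sum((x - c) ** 2 for x, c in zip(profile, centroids[label]))
    return min(centroids, key=dist)

# Synthetic stand-in data (assumption): each sample is a list of N_GENES values.
rng = random.Random(0)
N_GENES = 50  # hypothetical; a real microarray has far more probes

def simulate(n_samples, shift):
    """Simulate n_samples profiles with Gaussian noise around a common shift."""
    return [[rng.gauss(shift, 1.0) for _ in range(N_GENES)]
            for _ in range(n_samples)]

current = simulate(34, 1.0)  # training group: current smokers
never = simulate(23, 0.0)    # training group: never smokers
former = simulate(18, 0.3)   # group to classify: former smokers

# "Train" on the two labelled groups, then classify the former smokers.
centroids = {"current": centroid(current), "never": centroid(never)}
calls = [nearest_centroid(p, centroids) for p in former]

n_never_like = calls.count("never")
print(f"{n_never_like} of {len(former)} former smokers look like never smokers")
```

The fraction of former smokers assigned to the never-smoker group is one possible way to phrase an answer to the problem formulation. A real project would also need gene selection and cross-validation on the 57 training samples before trusting the calls.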
Some data sets
Were presented this morning and will be discussed on Friday and Tuesday.