Shared Genomics: Developing an accessible integrated analysis platform for Genome-Wide Association Studies
David Hoyle, Mark Delderfield, Lee Kitching, Gareth Smith, Iain Buchan
North West Institute for BioHealth Informatics, University of Manchester
Outline
• Introduction
  – Scientific task
  – Shared Genomics objectives
• Overview of the workbench
• Demo of key features
• Future development options
• Download link and chance to give feedback
Scientific Task
Identify genetic variations associated with disease outcomes, and the plausible biological mechanisms, taking into account gene-environment interactions.
• Manchester Asthma and Allergy Study – Birth Cohort
• Salford Diabetes – Electronic Patient Records
For example: 1000 subjects, 0.6M SNPs and over 1000 clinical variables.
[Figure: χ² comparison of "No disease" and "Disease" groups]
Shared Genomics Platform
To design, build and implement an information system to help researchers efficiently analyze large-scale genetic data.
For example, genome-wide SNP pair associations = 0.6M × 0.6M × 10K / 2 tests
• Solution:
  – Deploy parallelised analysis algorithms on a High Performance Cluster
  – Provide an accessible workbench for clinicians
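The scale of the pairwise example above can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted on the slide; the variable names are illustrative, and the slide does not say what the "10K" multiplier stands for, so it is carried through as an unexplained factor.

```python
# Rough count of pairwise SNP tests, using the figures quoted on the slide.
n_snps = 600_000       # 0.6M SNPs
n_factor = 10_000      # the "10K" multiplier from the slide (meaning not stated there)

# The slide's approximation: n * n / 2 SNP pairs.
approx_pairs = n_snps * n_snps // 2
# Exact number of unordered pairs of distinct SNPs: n * (n - 1) / 2.
exact_pairs = n_snps * (n_snps - 1) // 2

approx_tests = approx_pairs * n_factor
exact_tests = exact_pairs * n_factor

print(f"approximate tests: {approx_tests:.2e}")
print(f"exact tests:       {exact_tests:.2e}")
```

Either way the total is on the order of 10^15 tests, which is the motivation for running the analysis on a cluster.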
Shared Genomics Workbench
• Support pre-processing & QC process
• Run large scale analyses quickly
• ‘One-click’ annotation of your bio-markers (SNPs)
• Tool to explore automatically tracked annotations
Support pre-processing & QC
• Simplifies pre-processing – generates analysis input files
• Data can be filtered on:
  – SNPs, e.g. Hardy-Weinberg Equilibrium, Minor Allele Frequency (MAF), missing rate per SNP
  – Individuals, e.g. missing rate per person
  – Covariates, e.g. gender, ethnicity
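The per-SNP filters listed above can be sketched in plain Python. This is an illustration, not the workbench's or PLINK's implementation: the function names and the default thresholds are assumptions chosen for the example, and the Hardy-Weinberg check uses a simple 1-df chi-square goodness-of-fit approximation.

```python
import math

def maf(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    n_alleles = 2 * (n_aa + n_ab + n_bb)
    p = (2 * n_aa + n_ab) / n_alleles
    return min(p, 1 - p)

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) p-value for departure from Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(n_aa, n_ab, n_bb, n_missing,
              min_maf=0.01, max_missing=0.05, min_hwe_p=1e-4):
    """Apply the three SNP-level filters; thresholds are hypothetical defaults."""
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        return False
    missing_rate = n_missing / (n_called + n_missing)
    return (missing_rate <= max_missing
            and maf(n_aa, n_ab, n_bb) >= min_maf
            and hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= min_hwe_p)
```

For example, `passes_qc(360, 480, 160, 10)` keeps a SNP whose genotypes match Hardy-Weinberg proportions, while `passes_qc(0, 1000, 0, 0)` rejects a SNP where every subject is heterozygous.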
Run large scale analyses quickly
• Based on PLINK from Shaun Purcell (Broad Institute)
• Modified algorithms to run on a high performance cluster:
  – Basic association tests: χ²
  – Basic model-based calculations: Cochran-Armitage (CA) trend tests
  – Basic epistasis calculations: pair-wise
  – Basic test for association with non-genetic factor: Cochran-Mantel-Haenszel
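Two of the tests named above, the allelic χ² test and the Cochran-Armitage trend test, can be sketched for a single SNP as follows. This is a minimal single-machine illustration under textbook formulas, not the modified PLINK code that runs on the cluster; the function names and the additive weights (0, 1, 2) are assumptions made for the example.

```python
import math

def chi2_sf_1df(x):
    """P(X > x) for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

def allelic_chi2(cases, controls):
    """Pearson chi-square on the 2x2 allele-count table.

    cases/controls are genotype counts (AA, AB, BB). Returns (chi2, p).
    """
    a = 2 * cases[0] + cases[1]        # A alleles in cases
    b = cases[1] + 2 * cases[2]        # B alleles in cases
    c = 2 * controls[0] + controls[1]  # A alleles in controls
    d = controls[1] + 2 * controls[2]  # B alleles in controls
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    return chi2, chi2_sf_1df(chi2)

def armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test across the three genotypes. Returns (chi2, p)."""
    totals = [r + s for r, s in zip(cases, controls)]
    n, n_cases = sum(totals), sum(cases)
    n_controls = n - n_cases
    sum_xr = sum(x * r for x, r in zip(weights, cases))
    sum_xn = sum(x * t for x, t in zip(weights, totals))
    sum_x2n = sum(x * x * t for x, t in zip(weights, totals))
    denom = n_cases * n_controls * (n * sum_x2n - sum_xn ** 2)
    if denom == 0:
        return 0.0, 1.0
    chi2 = n * (n * sum_xr - n_cases * sum_xn) ** 2 / denom
    return chi2, chi2_sf_1df(chi2)

if __name__ == "__main__":
    cases, controls = (30, 10, 0), (0, 10, 30)
    print("allelic:", allelic_chi2(cases, controls))
    print("trend:  ", armitage_trend(cases, controls))
```

Each SNP is tested independently, which is what makes this kind of calculation straightforward to split across cluster nodes.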
‘One-click’ annotation of SNPs
• Right click on ‘SNP’ provides menu of further biological annotations
• Automatic capture for future review – with option to add comments
Future Development Options
• Usability testing underway
• Possible options based on feedback so far:
  – Offer support for full WTCCC Sanger QC Protocol
  – Provide LD Heat Map Plots (drawn by R)
  – Offer a standalone annotation capture tool
  – Expand analysis options, e.g. IMPUTE
• Your ideas?
Thanks & Download link
• Microsoft External Research
• OMII-UK & myGrid - workflows
• Our clinical collaborators:
  – Prof. Adnan Custovic, Dr. Angela Simpson – University Hospital of South Manchester
  – Dr. John New, Dr. Martin Gibson – Salford Royal NHS Foundation Trust
• Download from: www.nibhi.org.uk/sharedgenomics
• Any questions?