LINCS Fall Consortia Meeting
Broad Institute U54 Team
Todd Golub, co-PI
Wendy Winckler, co-PI
Aravind Subramanian, Team Leader
October 27, 2011
[Diagram: from BASIC DISCOVERIES to THERAPEUTIC IMPACT. Labels: connections; pathways; disease states; tool compounds; drugs; diagnostics; genetic (GWAS, TCGA, RNAi); chemical (screens, nat'l products). Annotations: SLOW (SOME NEVER START); DOES NOT SCALE; NO LEVERAGE]
LINCS as a Solution
• perturbations scalable to genome
• high information content read-outs (e.g. gene expression)
• inexpensive
• mechanism to query database
Toward a reduced representation of the transcriptome: gene expression is correlated [Heatmap axes: samples, genes]
Reduced Representation of Transcriptome
reduced representation transcriptome ('landmarks') -> computational inference model -> genome-wide expression profile
[Plot: % connections vs. number of landmarks measured; ~100,000 profiles; annotated values: 80%, 1000]
A. Subramanian, R. Narayan
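The landmark-to-genome inference step can be illustrated with a deliberately simplified sketch. The slide does not describe the actual inference model, so the version below is an assumption: each non-measured gene is predicted from the single landmark it correlates with best, using an ordinary least-squares line. The gene names (`LM_1`, `TARGET_1`, ...) and the synthetic training data are hypothetical.

```python
import random

def _stats(x, y):
    """Return (mean_x, mean_y, sxx, syy, sxy) for two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return mx, my, sxx, syy, sxy

def r_squared(x, y):
    """Squared Pearson correlation; 0.0 if either vector is constant."""
    _, _, sxx, syy, sxy = _stats(x, y)
    return 0.0 if sxx == 0 or syy == 0 else sxy * sxy / (sxx * syy)

def fit_line(x, y):
    """Least-squares fit y ~ slope * x + intercept."""
    mx, my, sxx, _, sxy = _stats(x, y)
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx

def train(landmarks, targets):
    """Map each target gene to (best landmark, slope, intercept).

    landmarks, targets: dict of gene name -> expression values across
    the same training samples (full profiles from a reference compendium).
    """
    model = {}
    for gene, y in targets.items():
        best = max(landmarks, key=lambda lm: r_squared(landmarks[lm], y))
        model[gene] = (best, *fit_line(landmarks[best], y))
    return model

def infer(model, measured):
    """Predict non-measured genes from landmark values of one new sample."""
    return {g: slope * measured[lm] + icpt
            for g, (lm, slope, icpt) in model.items()}

# Synthetic training data (hypothetical): two landmarks, two dependent genes.
rng = random.Random(0)
lm1 = [rng.gauss(8, 2) for _ in range(50)]
lm2 = [rng.gauss(8, 2) for _ in range(50)]
landmarks = {"LM_1": lm1, "LM_2": lm2}
targets = {"TARGET_1": [2 * v + 1 for v in lm1],
           "TARGET_2": [-0.5 * v + 3 for v in lm2]}

model = train(landmarks, targets)
inferred = infer(model, {"LM_1": 4.0, "LM_2": 2.0})
print(model["TARGET_1"][0], inferred)
```

A model that uses many landmarks per gene would need a multivariate fit; the single-landmark version is kept here only to show the train-then-infer shape of the approach.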
1000-plex Luminex bead profiling
Luminex beads (500 colors, 2 genes/color)
[Schematic: mRNA (5' ... AAAA 3') -> RT -> ligation (5'-PO4 probe) -> PCR -> hybridization]
Reagent cost: $3/sample
Validation of L1000 approach
Gene-level validation (1000-plex Luminex vs. Affymetrix): 92% with R² > 0.6
Similar to AFFX vs. ILMN
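A gene-level comparison of this kind can be sketched as follows. The slide does not define R² or say how the 92% was computed, so this assumes R² is the squared Pearson correlation of a gene's values across matched samples on the two platforms, and that the reported figure is the fraction of genes above the threshold. Gene names and values are toy data.

```python
def r_squared(x, y):
    """Squared Pearson correlation; 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return 0.0 if sxx == 0 or syy == 0 else sxy * sxy / (sxx * syy)

def fraction_validated(platform_a, platform_b, threshold=0.6):
    """Fraction of shared genes whose cross-platform R^2 exceeds threshold.

    platform_a, platform_b: dict of gene -> values across the same samples.
    Returns (fraction, sorted list of passing genes).
    """
    genes = sorted(set(platform_a) & set(platform_b))
    passing = [g for g in genes
               if r_squared(platform_a[g], platform_b[g]) > threshold]
    return len(passing) / len(genes), passing

# Toy measurements for three hypothetical genes on five matched samples.
luminex = {"GENE_A": [1, 2, 3, 4, 5],
           "GENE_B": [2, 4, 1, 5, 3],
           "GENE_C": [5, 4, 3, 2, 1]}
affymetrix = {"GENE_A": [1.1, 2.0, 2.9, 4.2, 5.0],
              "GENE_B": [3, 1, 4, 2, 5],
              "GENE_C": [4.8, 4.1, 3.2, 1.9, 1.0]}

fraction, passing = fraction_validated(luminex, affymetrix)
print(f"{fraction:.0%} of genes pass: {passing}")
```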
Putting it all together Illustration: Bang Wong
Cell Types
• GTEx
• Primary hTERT-immortalized cells
• Patient-derived iPS cells*
• Banked primary cells* (T-cells, macrophages, hepatocytes, myocytes, adipocytes)
• Cancer cell lines
* in assay optimization
[Schematic: cell repository (e.g. Coriell) / somatic cell isolation -> fibroblasts -> reprogramming [Oct4, Sox2, Klf4, Myc] -> neural differentiation -> neural progenitors -> neuron, astrocyte, oligodendrocyte. Stage durations shown: 2-3 weeks, 3-4 weeks, 4-6 weeks]
Perturbagens Small-molecules (n=4,000) Genes (n=3,000)
Automated Quality Control Measures Overall failure rate ~ 8%
LINCS Proposal (~600,000 profiles)
• 4,000 compounds
  • 1,300 off-patent FDA-approved drugs
  • 700 bioactive tool compounds
  • 2,000 screening hits (MLPCN + others)
• 2,000 genes (shRNA + cDNA)
  • known targets of FDA-approved drugs (n=150)
  • drug-target pathway members (n=750)
  • candidate disease genes (n=600)
  • community nominations (n=500)
• 20 cell lines
  • emphasis on reproducibility and availability
  • cancer and primary, non-cancer
  • some 'doubling down' to assess intra-lineage diversity
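The proposal's numbers can be checked with a few lines of arithmetic. The category counts come from the slide; the assumption that every perturbagen is profiled in every cell line is ours, and the resulting "profiles per pair" figure is derived, not stated (the slide does not say how it would split across doses, time points, or replicates). The reagent estimate uses the $3/sample figure from the bead-profiling slide and covers reagents only.

```python
# Category counts as listed on the proposal slide.
compounds = {
    "off-patent FDA-approved drugs": 1300,
    "bioactive tool compounds": 700,
    "screening hits (MLPCN + others)": 2000,
}
genes = {
    "known targets of FDA-approved drugs": 150,
    "drug-target pathway members": 750,
    "candidate disease genes": 600,
    "community nominations": 500,
}
cell_lines = 20
target_profiles = 600_000
reagent_cost_per_sample = 3  # USD, from the bead-profiling slide

n_compounds = sum(compounds.values())          # 4000
n_genes = sum(genes.values())                  # 2000

# Assumption: every perturbagen is tested in every cell line.
pairs = (n_compounds + n_genes) * cell_lines   # 120000
profiles_per_pair = target_profiles / pairs    # 5.0

# Reagent-only estimate for the full proposal.
reagent_total = target_profiles * reagent_cost_per_sample  # 1800000

print(n_compounds, n_genes, pairs, profiles_per_pair, reagent_total)
```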
Progress to date
[Chart legend: proposed, actual, projected]
DATA RELEASE (BETA): http://www.broadinstitute.org/lincs_beta/
Signature of p53 ORF
p53 vs. empty vector
• p53 is NOT a Landmark Gene
• p53 pathway is #1 pathway of 512 in MSigDB (P < 0.001)
Ramnik Xavier
Making connections in primary macrophages
LPS: NF-kB pathway genes (all INFERRED); pathway rank: 1/512
pathways curated from literature (n=512)
Jens Lohr
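Ranking a pathway against a collection (such as "1 of 512") and attaching a P value can be sketched as below. The slides do not state which enrichment statistic was used, so this is a stand-in under stated assumptions: a pathway's score is the mean signature value of its member genes, and the P value comes from comparing that score with random gene sets of the same size. All gene and pathway names are hypothetical.

```python
import random

def pathway_score(signature, members):
    """Mean signature value over pathway members present in the signature."""
    vals = [signature[g] for g in members if g in signature]
    return sum(vals) / len(vals) if vals else 0.0

def rank_pathways(signature, pathways):
    """Pathway names ordered from highest to lowest score."""
    scores = {name: pathway_score(signature, m) for name, m in pathways.items()}
    return sorted(scores, key=scores.get, reverse=True)

def permutation_p(signature, members, n_perm=1000, seed=0):
    """Empirical P value: how often a random same-size gene set scores as high."""
    rng = random.Random(seed)
    genes = list(signature)
    k = sum(1 for g in members if g in signature)
    observed = pathway_score(signature, members)
    hits = sum(
        1 for _ in range(n_perm)
        if pathway_score(signature, rng.sample(genes, k)) >= observed
    )
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Toy signature: 100 genes, the first five strongly induced.
signature = {f"G{i}": (3.0 if i < 5 else ((i % 7) - 3) * 0.1) for i in range(100)}
pathways = {
    "TARGET_PATHWAY": [f"G{i}" for i in range(5)],
    "OTHER_A": [f"G{i}" for i in range(10, 15)],
    "OTHER_B": [f"G{i}" for i in range(50, 55)],
}

ranked = rank_pathways(signature, pathways)
p_value = permutation_p(signature, pathways["TARGET_PATHWAY"])
print(ranked, p_value)
```

With 1,000 permutations the smallest reportable value is 1/1001, which is why a result stated as P < 0.001 implies at least that many permutations under this scheme.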
Prioritizing human genetics candidates Ramnik Xavier, MGH
Signatures of genetic variants connect to disease genesets Ramnik Xavier, MGH
Disease variants connect to pathways e.g. CD40 to ATG16L1 (both regulators of autophagy) Ramnik Xavier, MGH
ERG transcription factor important in hematopoietic stem cells, prostate cancer ERG-binding small-molecules
Defining a gene expression signature of ERG activity: integrating experimental and clinical data
• Gain of Function: Primary prostate + hTERT + ST + AR +/- ERG
• Loss of Function: VCaP cells +/- ERG shRNA
• Patient Samples: Physician's Health Study
[Figure label: 120]
L1000 as primary small-molecule screen read-out 12,985 compounds screened for ERG signature
Analytical and software challenges
• Infrastructure: data and compute server
• Optimization of connectivity metrics and statistics
• Optimization of inference models (context-aware)
• UI: query tools and results visualization
• Addressing off-target effects of perturbagens
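To make "connectivity metric" and "query tools" concrete, here is a minimal sketch of a signature query. The deck leaves the choice of metric open, so the score below is only one simple possibility and an assumption: the mean database value of the query's up-regulated genes minus the mean value of its down-regulated genes. The perturbagen names, gene names, and values are hypothetical.

```python
def connectivity(query_up, query_down, profile):
    """Positive when the profile mimics the query, negative when it reverses it."""
    up = [profile[g] for g in query_up if g in profile]
    down = [profile[g] for g in query_down if g in profile]
    if not up or not down:
        return 0.0
    return sum(up) / len(up) - sum(down) / len(down)

def query(database, query_up, query_down, top=None):
    """Rank perturbagen profiles by connectivity to the query signature.

    database: dict of perturbagen name -> {gene: differential expression}.
    Returns a list of (name, score), highest score first.
    """
    scored = [(connectivity(query_up, query_down, profile), name)
              for name, profile in database.items()]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored][:top]

# Toy database of three hypothetical perturbagen signatures.
database = {
    "cmpd_mimic":   {"A": 2.0, "B": 1.5, "C": -2.0, "D": -1.0},
    "cmpd_reverse": {"A": -2.0, "B": -1.0, "C": 1.5, "D": 2.0},
    "cmpd_neutral": {"A": 0.1, "B": -0.1, "C": 0.0, "D": 0.1},
}

results = query(database, query_up=["A", "B"], query_down=["C", "D"])
for name, score in results:
    print(f"{name}\t{score:+.2f}")
```

In a screen for compounds that suppress a signature, the entries of interest would be those at the bottom of this ranking (most negative scores).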
Aravind Subramanian, Wendy Winckler, Justin Lamb
Computational: Rajiv Narayan, Josh Gould
Laboratory: Dave Peck, Willis Reed-Button, Xiaodong Lu
RNAi Platform, Chemical Biology Platform, Genetic Analysis Platform, Broad Program Scientists