This guide offers an overview of microarray technology, gene expression studies, and software tools for data analysis in molecular biology. It covers the types of microarrays, their fabrication, and experimental processes. Learn about the central dogma of molecular biology, techniques for measuring RNA, DNA, proteins, and metabolites, and how microarrays have revolutionized biological research. Explore the uses of microarrays in gene expression analysis, comparative genomic hybridization, and more.
Outline
• Introduction
• Microarrays Technology
• Types and Uses of Microarrays
• Microarrays for the Study of Gene Expression
• Fabrication
  1. Spotted microarrays
  2. Oligonucleotide microarrays
• Experiments with Microarrays
• Flowchart of an experiment with microarrays
• Software for Microarray Data Analysis
Introduction (1) Brief review of molecular biology...
• Most life forms are made of cells. Each individual has a very large, indefinite number of cells.
• Each cell contains chromosomes (e.g. human cells contain 23 pairs of chromosomes). These organized structures of DNA and proteins are the carriers of inherited information.
• A chromosome is a single piece of coiled DNA containing many genes, regulatory elements and other nucleotide sequences.
Introduction (2) What else?
• The genome of an organism is inscribed in DNA (or in RNA in some viruses).
• A gene is the basic unit of heredity in a living organism and is the portion of the DNA that codes for a protein or an RNA.
• Each protein-coding gene is transcribed into mRNA molecules in some cells and, in turn, the mRNA is translated into at least one protein.
Introduction (3) The Central Dogma of Molecular Biology
• Information flow from DNA to RNA to protein occurs in four stages
– Replication: The DNA replicates its information in a process that involves many enzymes.
– Transcription: The DNA codes for the production of messenger RNA (mRNA).
– Splicing: In eukaryotic cells, the mRNA is processed and migrates from the nucleus to the cytoplasm.
– Translation: Messenger RNA carries coded information to ribosomes. The ribosomes read this information and use it for protein synthesis.
Introduction (4) Techniques in Molecular Biology
• Molecular biology has developed multiple techniques to measure levels of RNA, DNA, proteins or metabolites, such as
– Southern Blot
– Northern Blot
– Differential display
– SAGE
– …
• The post-genomics era is characterized by its capability to perform and to analyze data sets from large-scale experiments simultaneously
Introduction (5) The paradigm shift
• With the same resources we obtain a picture with lower resolution but with a view of the whole context
Based on "The paradigm shift" slide from J. Dopazo (CNIO)
Introduction (6) To draw an analogy with the pre-genomics era
• Biology used to "spy" on genes in depth and individually (i.e. gene by gene)
Introduction (7) To draw an analogy with the post-genomics era
• Nowadays, a lot of genes can be "spied" on at the same time... but...
• …How can we separate the wheat from the chaff?
Microarrays Technology (1) Broadly speaking...
• Microarrays are a variety of platforms in which high-density assays are performed in parallel on a solid support.
• This technology has changed the way biologists approach problems and introduces new challenges for statisticians because of the huge quantity of data generated in each experiment.
• They have been used for all kinds of biological problems.
• The literature contains almost 8000 papers with the word "microarray" in the title.
[Figure: Publications in PubMed with the word "microarray" in the title. x-axis: year (1998 to 2010); y-axis: number of publications]
Microarrays Technology (2) The biological principle involved in microarrays...
• It is the same one that allows DNA double helices to provide the basis for heredity
• Sequences of DNA or RNA molecules containing complementary base pairs have a natural tendency to bind together.
  ...AAAAAGCTAGTCGATGCTAG...
  ...TTTTTCGATCAGCTACGATC...
• If we know the mRNA sequence, we can build a probe for it using the complementary sequence.
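The complementarity principle can be illustrated with a short sketch. It is only a toy example: the function names `complementary_strand` and `probe_for_mrna` are hypothetical, and the complement is written position by position, aligned with the target as on the slide, rather than as a reverse complement.

```python
# Watson-Crick base pairing for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(sequence: str) -> str:
    """Return the base-by-base complement of a DNA sequence.

    The result is aligned with the input (not reversed), matching the
    way the two paired strands are drawn on the slide.
    """
    return "".join(COMPLEMENT[base] for base in sequence.upper())


def probe_for_mrna(mrna: str) -> str:
    """Hypothetical helper: DNA probe complementary to an mRNA fragment.

    Uracil (U) in the mRNA is treated as thymine (T) before complementing.
    """
    return complementary_strand(mrna.upper().replace("U", "T"))


target = "AAAAAGCTAGTCGATGCTAG"
print(target)
print(complementary_strand(target))
```

Real probe design also has to consider strand orientation, melting temperature and cross-hybridization, none of which this sketch attempts.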
Microarrays Technology (3) But... What is a microarray?
• It consists of a large set (thousands to tens of thousands) of specific sequences (known as probes or features) attached in order (array) to microscopic spots on a solid support (nylon, silicon, glass, ...).
• A probe (that can be a gene, a protein, a metabolite, ...) is used to hybridize a molecule of a nucleic acid sample (called target) under high-stringency conditions.
• Probe-target hybridization is used to determine the relative abundance of nucleic acid sequences in the targets (e.g. to determine sequences, to detect variations in gene sequences, expression levels, gene mapping, ...).
[Figure: Microarray layout. Probes 1 to k are arranged in spots (e.g. gene 1 to gene 4) and hybridized with molecule samples 1 to r]
Types and Uses of Microarrays (1) Types of microarrays
• Microarrays spatially arranged on a solid surface are the most widely used.
• Bead arrays are created by
– either impregnating beads with different concentrations of fluorescent dye,
– or some type of barcoding technology.
• The beads are addressable and are used to identify specific binding events that occur on their surface.
Types and Uses of Microarrays (2) Uses of Microarrays (1)
Expression analysis
– The process of measuring gene expression via RNA (or cDNA after reverse transcription) is called expression analysis or expression profiling.
– In these experiments the expression levels of thousands of genes are simultaneously monitored to study the effects of certain treatments, diseases, and developmental stages on gene expression.
Comparative Genomic Hybridization
– Comparative genomic hybridization (CGH) or Chromosomal Microarray Analysis (CMA) is used for the analysis of copy number changes (increases or decreases) of the important chromosomal fragments harboring genes involved in diseases.
Mutation analysis
– A single base difference between two sequences is known as a Single Nucleotide Polymorphism (SNP), and detecting such differences is known as SNP detection.
– Using gDNA, this kind of array tries to detect genes that might differ from each other by as little as a single nucleotide base.
Types and Uses of Microarrays (2) Uses of Microarrays (2)
• Protein Array
• Tissue Array
• CGH Arrays
• SNP Array Affymetrix
• CNV Array Illumina
• Expression Arrays
• cDNA Nylon Membrane Array
• GeneChip Affymetrix Array
• cDNA Agilent Array
Types and Uses of Microarrays (3) Application of Microarrays
• Gene discovery: Identification of new genes, and learning about their function and expression levels under different conditions.
• Molecular classification of complex diseases: To classify the types of cancer on the basis of the patterns of gene activity in the tumor cells, and to develop more effective drugs.
• Drug discovery: Comparative analysis of the genes from a diseased and a normal cell helps to identify the biochemical constitution of the proteins synthesized by the diseased genes. This information can be used to synthesize drugs that combat these proteins and reduce their effect.
• Toxicological research: Microarray technology provides a robust platform for research on the impact of toxins on cells and on how these effects are passed on to the progeny.
Microarrays for the Study of Gene Expression (1) What is gene expression?
• Gene expression is the presence of the products of a gene, in the form of mRNA (or protein), in a cell
• To put it plainly: since cells contain the same genetic information, what makes brain cells different from heart cells is gene expression.
Microarrays for the Study of Gene Expression (2) Finding Differentially Expressed Genes (DEG)
• To find genes that display a large difference in gene expression between two conditions and are homogeneous within them
– Typically statistical tests (t-test, Wilcoxon test) are used
– If there are more than two conditions, or if conditions are nested, the appropriate statistical method is ANOVA
– p-values from these tests have to be corrected for multiple testing
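The slide does not say which multiple-testing correction to use. As an illustration only, the sketch below implements two common choices, Bonferroni and Benjamini-Hochberg (false discovery rate), in plain Python. Selecting these two methods is an assumption of this example, and the p-values are made up.

```python
def bonferroni(pvalues):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up per-gene p-values, e.g. from one t-test per gene.
raw_p = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(raw_p))
print(benjamini_hochberg(raw_p))
```

With thousands of genes tested at once, Bonferroni is usually very conservative, which is why FDR-type procedures are often preferred for microarray data.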
Microarrays for the Study of Gene Expression (3) Exploratory data analysis (1)
• To find groups that are not defined yet (e.g. novel disease subtypes)
• Methods from this field
– were the first to be used for microarray data
– should be used only if no prior knowledge exists that could be incorporated
– find patterns in the data, but any patterns, whether they are meaningful or not
– include
  • Clustering (hierarchical and partitioning)
  • Projection (PCA, MDS)
Alizadeh et al. Nature 403:503–511 (2000)
Microarrays for the Study of Gene Expression (4) Time series, partitioning clustering and correlation
• Usually used to find patterns of co-expressed genes
• The meaning of time series is different for
– biologists: 2-10 time points
– statisticians: >200 time points
• "Non-optimal" solution: to use clustering methods to find such patterns
• Note that they are by no means exhaustive, and that no significance measure can be attached to them
• In contrast to Estimation of Distribution Methods (EDA), partitioning cluster methods are more popular (e.g. K-means or Self-organizing maps)
• To seek genes whose expression profile is similar to that of a paradigmatic gene, correlations can be calculated and the genes sorted by them. There is no need for clustering.
• Special methods exist for periodic changes (⇒ cell cycle), e.g. Fourier analysis
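The correlation-based search mentioned on this slide can be sketched in a few lines. This is a minimal illustration, assuming Pearson correlation as the similarity measure (the slide does not name one); the gene names and expression profiles are invented toy data, and `rank_by_correlation` is a hypothetical helper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long profiles.

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rank_by_correlation(profiles, reference):
    """Sort all other genes by decreasing correlation with the reference gene."""
    ref_profile = profiles[reference]
    scores = {
        gene: pearson(ref_profile, profile)
        for gene, profile in profiles.items()
        if gene != reference
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy time series: expression of four genes at five time points.
profiles = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 10],
    "geneC": [5, 4, 3, 2, 1],
    "geneD": [1, 3, 2, 5, 4],
}
for gene, r in rank_by_correlation(profiles, "geneA"):
    print(gene, round(r, 2))
```

Genes at the top of the list are the candidates for co-expression with the reference gene; as the slide notes, no clustering step is required.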
Microarrays for the Study of Gene Expression (5) Classification
• When information about the grouping of the samples is available, it can (and should) be used to get improved results
• Groupings may be:
– Treatment and Control
– Disease and Normal
– Disease stage 1, 2, 3
– Mutant and Wild Type
– Good and Poor Outcome, Therapy success or failure
– ...
• One learns characteristic patterns from a training set and evaluates them by predicting the classes of a test set
Microarrays for the Study of Gene Expression (6) Survival Analysis
• To find patterns that are associated with prolonged patient survival time
• Instead of treating outcome as a binary variable, one can use
– the overall survival time or
– the event-free survival time
  as continuous variables, and try to estimate them by regression
• Since the risk of suffering a relapse decreases with time, linear regression models are almost always inappropriate; specialized models would be better
– Cox regression
– Regression trees
Microarrays for the Study of Gene Expression (7) Pharmacogenomics
• To find molecular predictors that tell about the probable success (or failure) of a certain therapy, e.g.
– estrogen receptor status for tamoxifen (antihormone) therapy
– HER2/NEU status for herceptin therapy in breast cancer
• One may regard treatment outcome as a discrete variable and use classification methods
• Sometimes it is convenient not to wait for the final endpoint (which may be years away), but to use surrogate variables, e.g.
– the drop of the blood level of a certain protein
– reduction in tumor volume
Fabrication: Two main technologies
• There are many types of technologies, but the principles are the same
• The most used are spotted arrays and in situ arrays
• Spotted arrays (aka cDNA arrays or Stanford arrays)
– Previously synthesized cDNAs or oligonucleotides are deposited on the chip
– Based on "printing-like" technologies
• In situ arrays (aka oligo arrays or Affy arrays)
– Probes are synthesized directly on the chip
– Based on photolithographic techniques
– Affymetrix arrays are the best-known... but not the only ones!
Spotted Arrays (1) From the chips to the images
• Chip Design and Production
• Sample Preparation
• Hybridization
• Scanning and Capturing Images
• Image Analysis
• Quantification
Spotted Arrays (2) Chip design and construction
• Production begins with the selection of the "probes" to be printed on the array
• In general, they are chosen from
– GenBank (http://www.ncbi.nlm.nih.gov/)
– dbEST (http://www.ncbi.nlm.nih.gov/UniGene/index.html)
• cDNAs are printed on the array
• Each spot can contain unique sequences
• "Printing" means adhering sequences to the spots
A movie of the printing process is available here
Spotted Arrays (3) Sample preparation
1. RNA is extracted from the samples
2. This RNA is converted to fluorescently labeled cDNA by reverse transcription in the presence of fluorescently labeled nucleotide precursors
3. RNA from each sample is labeled with a different fluorescent dye (e.g. Cy-3, Cy-5) to allow direct comparison
4. After labeling, the samples are mixed and hybridized with the sequences on the array (probes)
Spotted Arrays (4) Hybridization with probes
• Targets are labeled and combined
A movie of the hybridization process is available here
Spotted Arrays (5) Scanning and capturing the image
• After hybridization each DNA spot is illuminated and fluorescence measurements are taken for each dye separately
• These measurements will be used, after the appropriate quality controls, to determine the relative abundance of the sequence of each specific gene in the two mRNA or DNA samples
Spotted Arrays (6) Image analysis (1)
• TIFF images are processed by image analysis programs
– SPOT
– GenePix
– ...
  to acquire intensity values for each spot
• These measures will be used, after the appropriate quality controls, to determine the relative abundance of the sequence of each specific gene in the two mRNA or DNA samples
Spotted Arrays (7) Image analysis (2) Steps in Image Processing
1. Addressing: Estimate the location of the spot centers
2. Segmentation: Classify each spot as
● foreground (signal)
● background (noise)
3. Information extraction (quantification): For each spot g on the array, and each dye, obtain
– Signal measurements ($R_g$, $G_g$)
– Background measurements ($bgR_g$, $bgG_g$)
– Quality indicators
Spotted Arrays (8) Quantification
● Gene expression is measured from the intensity measures as the relative (corrected) intensity of one dye vs the (corrected) relative intensity of the other

$$M = \frac{R_g}{G_g}, \qquad M_{\text{corrected}} = \frac{R_g - bgR_g}{G_g - bgG_g}$$

● Background correction may be needed, or not, according to the array quality
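The ratio on this slide is simple to compute per spot. The following is a minimal sketch under stated assumptions: the function name `corrected_ratio`, the choice to return `None` for spots whose corrected intensity is not positive, and the example intensities are all illustrative and not taken from the slides. The log2 transform at the end is not on the slide either; it is added because two-color ratios are very often analyzed on the log2 scale.

```python
from math import log2


def corrected_ratio(red, green, bg_red=0.0, bg_green=0.0):
    """Ratio of background-corrected red and green intensities for one spot.

    With the default backgrounds of 0 this is the uncorrected ratio R/G.
    Returns None when a corrected intensity is not positive, since the
    ratio is then not meaningful (an assumption of this sketch).
    """
    red_corrected = red - bg_red
    green_corrected = green - bg_green
    if red_corrected <= 0 or green_corrected <= 0:
        return None
    return red_corrected / green_corrected


# Invented intensities for a single spot.
spot = {"R": 1200.0, "G": 300.0, "bgR": 100.0, "bgG": 80.0}
raw = corrected_ratio(spot["R"], spot["G"])
corrected = corrected_ratio(spot["R"], spot["G"], spot["bgR"], spot["bgG"])
print("raw ratio:", raw)
print("corrected ratio:", corrected)
print("log2 corrected ratio:", log2(corrected))
```

How to treat spots where the background exceeds the signal is a design choice; flagging them, as done here, is only one option.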
Spotted Arrays (8) Overview of the process
A movie of the whole process is available here
In situ Chips (1) From the chips to the images
• Main Concepts
• Synthesis of Oligos on the Chip
• Sample Preparation
• Hybridization Process
• Scanning Images
• Output Images
• Quantification and Expression Measures
In situ Chips (1) Main concepts (1)
• More advanced design than spotted cDNA arrays
– They are NOT based on competitive hybridization. That is, one chip, one sample
– Probes are NOT added to the chip after being synthesized in vitro
• Main idea: Probes are synthesized in situ (on the chip)
• Sequences are built up on the chip surface by sequentially elongating a growing chain with a single nucleotide using photolithography
• The chemical yield of the stepwise elongation is limited
– Sequences can NOT grow to more than 25-mer length (oligo)
– 16-20 different 25-mer sequences are needed to uniquely characterize a gene
  • Probe = Individual 25-mer sequence
  • Probe set = Set of 25-mers corresponding to a particular gene/EST
In situ Chips (2) Main concepts (2)
• Affymetrix (http://www.affymetrix.com) is the leading company for this kind of chip. They call them GeneChips
• Each gene is represented by a set of short sequences
• Some of these chips contain whole genomes, that is, >50,000 probe sets
• A probe set (usually denoted probeset) is used to measure the mRNA levels of a unique gene
• Each probe set is made up of multiple probe cells
– with millions of copies of one oligo (25 bp)
– organized in probe pairs with
  • a Perfect Match (PM): matches perfectly with a piece of a gene
  • a Mismatch (MM): the same as the PM but with the central nucleotide changed to its complement
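The PM/MM relationship described above can be made concrete with a toy example. This is a sketch only: `mismatch_probe` is a hypothetical helper, and the 25-mer used here is an invented sequence, not a real probe.

```python
# Watson-Crick base pairing for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_probe(perfect_match: str) -> str:
    """Build the MM probe: the PM with its central base complemented.

    For a 25-mer the central base is the 13th one (index 12).
    """
    middle = len(perfect_match) // 2
    return (
        perfect_match[:middle]
        + COMPLEMENT[perfect_match[middle]]
        + perfect_match[middle + 1:]
    )


pm = "ACGTACGTACGTACGTACGTACGTA"  # invented 25-mer
mm = mismatch_probe(pm)
print(pm)
print(mm)
```

The two probes differ at exactly one position, so the MM signal serves as an estimate of non-specific hybridization for its PM partner.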
In situ Chips (4) GeneChip® expression array design
In situ Chips (5) One gene, one probe set
• Probes are selected to be specific to the represented gene
• They must have good hybridization properties
In situ Chips (6) Synthesis of oligos on the chip (1)
• GeneChip® probe arrays are manufactured through a unique and robust process, a combination of photolithography and combinatorial chemistry
Image courtesy of Affymetrix
In situ Chips (7) Synthesis of oligos on the chip (2)
[Figure: Stepwise photolithographic synthesis of oligos on a GeneChip using a series of masks]
Image from a course of Dan Nettleton
In situ Chips (8) Synthesis of oligos on the chip (3)
• Several copies of a single feature are deposited in each cell
Image courtesy of Affymetrix
In situ Chips (9) Sample preparation
In situ Chips (8) Hybridization process
• Once the oligos have been synthesized, hybridization is performed by adding mRNA from the tissue to be analyzed onto the chip
Image courtesy of Affymetrix
In situ Chips (9) Scanning Images
• Scanning of tagged and un-tagged probes on an Affymetrix GeneChip® microarray
Image courtesy of Affymetrix
In situ Chips (10) Output Image
• Data from an experiment showing the expression of thousands of genes on a single GeneChip® probe array
Image courtesy of Affymetrix
In situ Chips (11) Quantification
• Intensities from each element are extracted
• Quantitative analysis of the hybridization results is performed by analyzing the hybridization pattern of the set of PM and MM probes of every gene
• In contrast with spotted chips, the expression measures used here are absolute ones. That is, each chip is hybridized with only one tissue at a time
In situ Chips (12) Absolute expression measures
• Measures to determine the quantitative RNA abundance, i.e. the expression level, based on the average of the differences PM minus MM for each probe family

$$\mathrm{Avg.Diff} = \frac{1}{|A|} \sum_{j \in A} \left( PM_j - MM_j \right)$$

• Many alternatives have been introduced
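The average difference can be computed directly from the PM and MM intensities of one probe set. In this sketch, A is assumed to be the set of probe pairs used for the gene (the slide does not define it further), the function name is hypothetical, and the intensities are invented.

```python
def average_difference(pm_intensities, mm_intensities):
    """Average of PM minus MM over the probe pairs of one probe set."""
    if len(pm_intensities) != len(mm_intensities):
        raise ValueError("PM and MM lists must have the same length")
    if not pm_intensities:
        raise ValueError("at least one probe pair is required")
    differences = [pm - mm for pm, mm in zip(pm_intensities, mm_intensities)]
    return sum(differences) / len(differences)


# Invented intensities for a probe set with four probe pairs.
pm = [1200.0, 950.0, 1100.0, 800.0]
mm = [300.0, 250.0, 400.0, 350.0]
print(average_difference(pm, mm))
```

Note that the result can be negative when MM intensities exceed PM intensities, which is one of the reasons alternative expression measures were proposed.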
Spotted vs In situ Arrays: PROs and CONs
cDNA microarrays
PROs
• Cheaper
• Flexibility with the experimental design
• High signal intensity (large sequences)
CONs
• Low reproducibility
• Cross-hybridization (low specificity)
• High manipulation (possibility of contamination)
Oligo microarrays
PROs
• Quick manufacture (automated)
• High reproducibility
• High specificity
• A lot of probes/genes
CONs
• Requires more specialized equipment
• Expensive
• Low flexibility
In situ Chips (13) Overview of the process
A movie of the whole process is available here
Image courtesy of Affymetrix