Genes
Outline • Genes: definitions • Molecular genetics - methodology • Genome Content • Molecular structure of mRNA-coding genes • Genetics • Gene regulation • Genetics • Molecular biology • Arrays • Issues • Genetic and misexpression approaches
Gene Definitions • Gene • Molecular definition: stretch of DNA that encodes: • Functional RNAs - tRNA, rRNA • Functional proteins - mRNA • All sequences necessary for proper function (genetic) – includes regulatory elements and transcription unit • Generally excludes other types of genomic sequences • Centromeres, telomeres, origins of DNA replication, transposons • Genetic definition: element required for proper organismal function
Molecular Genetics • Genetics [mutant phenotype] • Molecular Biology [gene: sequence, expression (arrays)] • Biochemistry [activities, interactions] • Cell biology [structure, dynamics]
Genomic Content • Calf thymus DNA sheared to a size of ~300 bp, denatured, and reannealed • 3 kinetic classes: • Highly repetitive – 10% of DNA – anneals very rapidly • Middle repetitive – 30% of DNA – C₀t₁/₂ = 0.04 • Non-repetitive (unique) – 60% of DNA – C₀t₁/₂ = 4000
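The three kinetic classes above can be illustrated with a minimal sketch, assuming ideal second-order reassociation, in which a component's single-stranded fraction is 1 / (1 + C0t / C0t1/2). The class fractions and two of the half-reassociation values come from the slide. The value used for the highly repetitive class is a placeholder assumption, since the slide only says it "anneals very rapidly". The function name is hypothetical.

```python
# Sketch of ideal second-order reassociation (C0t) kinetics for a DNA mixture.
# Each entry: (class name, fraction of genome, C0t value at half reassociation).
COMPONENTS = [
    ("highly repetitive", 0.10, 1e-4),    # ASSUMED C0t1/2; not given on the slide
    ("middle repetitive", 0.30, 0.04),    # from the slide
    ("unique", 0.60, 4000.0),             # from the slide
]


def fraction_single_stranded(c0t, components=COMPONENTS):
    """Return the fraction of DNA still single-stranded at a given C0t.

    Each class reanneals independently as 1 / (1 + C0t / C0t_half),
    weighted by its share of the genome.
    """
    return sum(frac / (1.0 + c0t / half) for _, frac, half in components)


if __name__ == "__main__":
    # Print the remaining single-stranded fraction across a range of C0t values.
    for c0t in (1e-4, 0.04, 4.0, 4000.0):
        print(f"C0t = {c0t:g}: {fraction_single_stranded(c0t):.3f} single-stranded")
```

At a C0t equal to a class's half-reassociation value, half of that class has reannealed, which is why the repetitive classes are essentially complete long before the unique class begins to anneal.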
Highly Repetitive Simple Sequence DNA • Clusters of tandemly-linked 5-10 bp repeats • Can have > 10^6 copies/genome • Not transcribed • Example: Drosophila virilis satellite DNAs, where > 95% of each satellite consists of a predominant sequence
Intermediately Repetitive DNA – Mobile Elements • Repetitive elements interspersed among unique DNA • Most are transposons – mobile DNA • Many are no longer able to transpose • Dispersed throughout the genome • Different classes • Transpose as DNA or RNA intermediates • [Figure: repeats interspersed with unique DNA]
Unique DNA – Coding Sequence Genes • Slow kinetic class corresponds mainly to protein-coding genes • Average gene size (transcribed region only) by organism • E. coli 1.2 kb • Yeast 1.7 kb • Drosophila 11.3 kb • Human 27.0 kb • As organismal complexity increases, so does gene size
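Overview of Gene Expression-1 • [Figure: a gene consists of a regulatory region and a transcription unit; DNA (A, C, G, T) is transcribed into RNA (A, C, G, U) in the nucleus, and the RNA is then transported to the cytoplasm]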
Overview of Gene Expression-2 • [Figure: translation of mRNA into protein; first amino acid (aa1) = methionine; example protein: myoglobin]
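The two overview slides can be sketched in code: transcription swaps T for U, and translation starts at the first AUG (methionine) and reads codons until a stop. This is a minimal illustration using the standard genetic code. The function names and the toy input sequence are hypothetical, and it ignores introns, splicing, and transport.

```python
# Minimal sketch of DNA -> RNA -> protein using the standard genetic code.
BASES = "UCAG"
# Amino acids for all 64 codons, ordered with U, C, A, G at each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def transcribe(coding_strand):
    """Return the mRNA matching a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG to the first stop codon (one-letter codes)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for pos in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    toy_dna = "CCATGGGTTAACC"  # hypothetical sequence, not a real gene
    mrna = transcribe(toy_dna)
    print(mrna, "->", translate(mrna))
```

Any protein produced this way begins with M (methionine), matching the "aa1 = methionine" label on the slide.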
DNA and Clones • Genomic or chromosomal DNA – genomic clones (exons, introns, spacer, etc.) • Transcription unit – entire region of gene transcribed (exons + introns) • mRNA – cDNA clones (exonic sequences) • ESTs – expressed sequence tags • Oligonucleotides – small stretches of DNA (~20-50 nt)
Human Genome Project: Gene Number • Size 3,200 Mb • Predicted gene number • Celera – 39,114 • Public consortium – 29,691 • Refseq (known genes) – 11,015 • Non-identity • ~64% novel genes don’t overlap • > 80% novel genes expressed • Indicates they are real • Estimate ~50,000 genes • Estimate ~ 64 kb/gene • Transcribed region = 27 kb • Spacer DNA = 37 kb • Repeats + control elements • Human – large number of transcripts/gene exist because of alternative splicing
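The per-gene estimates above follow from simple division. The short sketch below reproduces that arithmetic using only figures from the slide; the variable names are illustrative.

```python
# Reproduce the slide's per-gene estimates from its own numbers.
GENOME_SIZE_MB = 3200      # human genome size, from the slide
ESTIMATED_GENES = 50_000   # estimated gene number, from the slide
TRANSCRIBED_KB = 27        # average transcribed region per gene, from the slide

# Average genomic DNA per gene, in kb (1 Mb = 1000 kb).
kb_per_gene = GENOME_SIZE_MB * 1000 / ESTIMATED_GENES

# Whatever is not transcribed is spacer (repeats + control elements).
spacer_kb = kb_per_gene - TRANSCRIBED_KB

print(f"{kb_per_gene:.0f} kb/gene, of which {spacer_kb:.0f} kb is spacer")
```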
Completed Genomic Sequencing Projects • Human – disease genes • Drosophila – model system for animal development and gene control • Strength - genetics • Nematode - model system for development and behavior • Strength - genetics • Fly and human are more closely related than worm and human • Arabidopsis – weed: model plant genetic system • Crop plants – rice, maize • Yeast – typical eukaryotic cell • E. coli • Many pathogenic bacteria - disease
Genome Projects in Progress • Multiple Humans – SNPs : disease genes and predispositions • Mouse – model system to study human/mammalian gene function • Strength – knockout mutants • Zebrafish - model vertebrate genetic system • Strength – large-scale genetic screens • Crop plants – poplar, apple, tomato + pests • Additional Drosophila species • Identify gene control regions
Drosophila • Drosophila genome = 180 Mb • Sequenced 120 Mb euchromatic region • 60 Mb heterochromatic region unsequenced (few genes) • Annotation – 13,601 predicted genes • Genie – predicts ORFs/exons • Compare to Expressed Sequence Tags (ESTs-cDNAs) • Blast searches – sequence identity to known genes
Complications of Gene Prediction by Computer: Cranky Example • RT-PCR of embryonic RNA • [Figure: 33 kb region showing exons 1-9, ESTs LP05454-5' and LP05454-3', and Genie predictions labeled CG14554, CG12561 (1-3), CG14552 (1-4), CG14553 (1-3)]
Drosophila Gene Functions • 14,113 predicted transcripts with different coding sequences • By biochemical function: • 2,081 Transcription factors • 2,422 Enzymes • 665 Transporters • 622 Signal transduction • 303 Structural proteins • 216 Cell adhesion • 7,576 Unknown • By process: • 2,274 Metabolism • 530 Cell communication • 486 Development • 201 Physiology • 118 Sensation & behavior • 8,884 Unknown