Ch2. Genome Organization and Evolution (continued). 阮雪芬, Jan 02, 2003, NTUST.
Pick out Genes in Genomes • Open reading frames (ORFs) • Start codon ------------------ stop codon • A potential protein-coding region • Approaches to identify protein-coding regions • Detection of regions similar to known coding regions from other organisms • Ab initio methods • Gene identification is more complete and accurate for bacteria than for eukaryotes
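The ORF definition on this slide (start codon, then an in-frame stop codon) can be sketched in a few lines of Python. This is a minimal illustration, not a real gene finder: the function names, the `min_codons` parameter, and the toy sequence are assumptions made for the example. It assumes the standard start codon ATG and the stop codons TAA, TAG, and TGA, and scans all six reading frames.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_codons=2):
    """List ORFs as (strand, start, end, sequence) over all six frames.

    Coordinates are 0-based, end-exclusive, and relative to the scanned
    strand. min_codons (a hypothetical parameter) is the minimum number
    of codons before the stop codon, start codon included.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START:
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOPS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs


toy = "CCATGAAATTTTAGGG"  # made-up sequence for illustration
print(find_orfs(toy))  # [('+', 2, 14, 'ATGAAATTTTAG')]
```

An ORF found this way is only a potential protein-coding region. Eukaryotic genes interrupted by introns need the more elaborate frameworks mentioned on the next slide.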
Pick out Genes in Genomes • A framework for ab initio gene identification in eukaryotic genomes
Genomes of Prokaryotes • Most prokaryotic cells contain • A large single circular piece of double-stranded DNA (< 5 Mb) • Plasmids • In E. coli, only ~11% of the DNA is non-coding.
The Genome of the Bacterium E. coli • Strain K-12 contains 4639221 bp in a single circular DNA molecule, with no plasmids. • An inventory reveals • 4285 protein-coding genes • 122 structural RNA genes • Non-coding repeat sequences • Regulatory elements • Transcription/translation guides • Transposase • Prophage remnants • Insertion sequence elements • Patches of unusual composition
The Genome of the Bacterium E. coli • The average size of an ORF is 317 amino acids. • 630-700 operons; operons vary in size, although few contain more than five genes. Genes within operons tend to have related functions.
The Genome of the Bacterium E. coli • Several features of E. coli • It can synthesize all components of proteins and nucleic acids, and cofactors. • It has metabolic flexibility • A wide range of transporters • Even for specific metabolic reactions there are many cases of multiple enzymes. • It does not possess a complete range of enzymatic capacity.
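As a rough consistency check, the figures quoted on these slides (4639221 bp, 4285 protein-coding genes, average ORF of 317 amino acids) can be combined to estimate how much of the genome codes for protein. The calculation below is a back-of-the-envelope sketch. It assumes three bases per amino acid and ignores stop codons, RNA genes, and any overlap between genes.

```python
genome_bp = 4_639_221   # E. coli K-12 genome size, from the slide
protein_genes = 4285    # protein-coding genes, from the slide
mean_orf_aa = 317       # average ORF length in amino acids, from the slide

# Assumption: 3 bases per amino acid, stop codons not counted.
coding_bp = protein_genes * mean_orf_aa * 3
coding_fraction = coding_bp / genome_bp

print(f"{coding_bp} coding bp, {coding_fraction:.1%} of the genome")
```

The estimate comes out at roughly 88% protein-coding, which is broadly consistent with the statement that only ~11% of the E. coli DNA is non-coding.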
The genome of the archaeon Methanococcus jannaschii • Methanococcus jannaschii was collected from a hydrothermal vent 2600 m deep off the coast of Baja California, Mexico, in 1983. • Thermophilic organism • The genome was sequenced in 1996 by The Institute for Genomic Research (TIGR). It was the first archaeal genome sequenced.
The genome of the archaeon Methanococcus jannaschii • It contains a large chromosome consisting of a circular double-stranded DNA molecule 1664976 bp long. • 1743 predicted coding regions. • Some RNA genes contain introns. • As in other prokaryotic genomes, there is little non-coding DNA. • In archaea, proteins involved in transcription, translation, and regulation are more similar to those of eukaryotes. • Archaeal proteins involved in metabolism are more similar to those of bacteria.
The genome of one of the simplest organisms: Mycoplasma genitalium • An infectious bacterium. • Its genome was sequenced in 1995 by TIGR, The Johns Hopkins University and The University of North Carolina. • The gene repertoire includes genes that encode proteins for • DNA replication • Transcription • Translation • Adhesins • Other molecules for defence against the host’s immune system • Transport proteins
Genomes of Eukaryotes • In eukaryotic cells, the majority of DNA is in the nucleus, separated into bundles of nucleoproteins, the chromosomes. • Each chromosome contains a single double-stranded DNA molecule. • Nuclear genomes of different species vary widely in size. • Eukaryotic species vary in the number of chromosomes and distribution of genes among them. • Human chromosome 2 is equivalent to a fusion of chimpanzee chromosomes 12 and 13.
Genomes of Eukaryotes • Saccharomyces cerevisiae (baker’s yeast) • Protein-protein interaction • Yeast two-hybrid system
Yeast Two-hybrid System • Useful in the study of various interactions • The technology was originally developed during the late 1980s in the laboratory of Dr. Stanley Fields (see Fields and Song, 1989, Nature).
Yeast Two-hybrid System • [Figure labels: GAL4 DNA-activation domain, GAL4 DNA-binding domain. Source: Nature, 2000]
Yeast Two-hybrid System • Library-based yeast two-hybrid screening method [Figure. Source: Nature, 2000]
Protein-protein Interactions on the Web • Yeast http://depts.washington.edu/sfields/yplm/data/index.html http://portal.curagen.com http://mips.gsf.de/proj/yeast/CYGD/interaction/ http://www.pnas.org/cgi/content/full/97/3/1143/DC1 http://dip.doe-mbi.ucla.edu/ http://genome.c.kanazawa-u.ac.jp/Y2H • C. elegans http://cancerbiology.dfci.harvard.edu/cancerbiology/ResLabs/Vidal/ • H. pylori http://pim.hybrigenics.com • Drosophila http://gifts.univ-mrs.fr/FlyNets/Flynets_home_page.html
Yeast Protein Linkage Map Data • New protein-protein interactions in yeast List of interactions with links to YPD Stanley Fields Lab http://depts.washington.edu/sfields/yplm/data
Genomes of Eukaryotes • Caenorhabditis elegans • The genome was completed in 1998 • The first full DNA sequence of a multicellular organism • XX genotype: a self-fertilizing hermaphrodite. • XO genotype: a male.
Genomes of Eukaryotes • Drosophila melanogaster • Its genome sequence was announced in 1999 by a collaboration between Celera Genomics and the Berkeley Drosophila Genome Project. • Although insects are not very closely related to mammals, the fly genome is useful in the study of human disease. • It contains homologues of 289 human genes implicated in various diseases: • Cancer • Cardiovascular disease, etc.
Genomes of Eukaryotes • Arabidopsis thaliana • A flowering plant • ~125 Mbp DNA
Genomes of Eukaryotes-Human • In Feb 2001, the International Human Genome Sequencing Consortium and Celera Genomics separately published drafts of the human genome. • 22 chromosome pairs + X, Y • Protein-coding genes • ~32000 genes in all
Genomes of Eukaryotes-Human • Human protein-coding genes, by functional category [figure labels]: Nucleic acid binding; Transcription factor binding; Cell cycle regulator; Chaperone; Motor; Actin binding; Defense/immunity protein; Enzyme; Enzyme activator; Enzyme inhibitor; Apoptosis; Signal transduction; Storage protein; Cell adhesion; Structural protein; Transporter; Ligand binding or carrier; Tumour suppressor; Unclassified
Genomes of Eukaryotes-Human • Repeat sequences • 50% of the genome • Contain • Transposable elements • Retroposed pseudogenes • Simple “stutters” • Segmental duplications • Blocks of tandem repeats
Genomes of Eukaryotes-Human • RNA • 497 transfer RNA genes • Genes for 28S and 5.8S ribosomal RNAs • Small nucleolar RNAs • Spliceosomal snRNAs
SNPs • Single-nucleotide polymorphisms (SNPs) • A genetic variation between individuals, limited to a single base pair, which can be substituted, inserted or deleted. • Sickle-cell anaemia is an example of a disease caused by a specific SNP • An A→T mutation in the beta-globin gene changes a Glu→Val
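The sickle-cell example can be reproduced with the standard genetic code. The sketch below builds the codon table from the usual TCAG ordering and applies a single-base substitution to one codon. The codon GAG and the function name `snp_effect` are not on the slide: GAG→GTG is the usual textbook description of the sickle-cell mutation, and the helper is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def snp_effect(codon, pos, new_base):
    """Substitute one base (0-based pos) and report the amino acid change."""
    mutant = codon[:pos] + new_base + codon[pos + 1:]
    return mutant, CODON_TABLE[codon], CODON_TABLE[mutant]


# A -> T at the middle position of a glutamate codon
print(snp_effect("GAG", 1, "T"))  # ('GTG', 'E', 'V'), i.e. Glu -> Val
```

The same helper shows why many SNPs are silent: `snp_effect("GAG", 2, "A")` gives GAA, which still encodes Glu.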
SNPs • Single-nucleotide polymorphisms (SNPs) • Nearly 1.8 million SNPs • Occurring on average every 2000 base pairs. • Not all SNPs are linked to disease • The A, B, and O alleles of the gene for blood groups illustrate these possibilities. • A and B alleles differ by four SNP substitutions.
ABO Blood Groups • The human ABO blood groups illustrate the effect of glycosyltransferases. • [Figure labels: N-acetylgalactosamine, Galactose]
Evolution of Genomes • Synonymous nucleotide substitution • Non-synonymous nucleotide substitution • Ka = the number of non-synonymous nucleotide substitutions • Ks = the number of synonymous nucleotide substitutions • Ka/Ks: a high ratio indicates mostly non-synonymous changes, and so possible functional changes
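The distinction between synonymous and non-synonymous substitutions can be illustrated by comparing two aligned coding sequences codon by codon. The sketch below only produces raw counts. Published Ka/Ks estimators also normalize by the number of synonymous and non-synonymous sites and correct for multiple hits, which is not attempted here. The function name and the two toy sequences are made up for the example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def count_substitutions(seq_a, seq_b):
    """Return (non_synonymous, synonymous, skipped) raw counts.

    Assumes gap-free, in-frame, equal-length coding sequences. Codons
    differing at more than one position are skipped, because the order
    of the changes (and hence their classification) is ambiguous.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be aligned and in frame")
    nonsyn = syn = skipped = 0
    for i in range(0, len(seq_a), 3):
        ca, cb = seq_a[i:i + 3], seq_b[i:i + 3]
        diffs = sum(x != y for x, y in zip(ca, cb))
        if diffs == 0:
            continue
        if diffs > 1:
            skipped += 1
        elif CODON_TABLE[ca] == CODON_TABLE[cb]:
            syn += 1
        else:
            nonsyn += 1
    return nonsyn, syn, skipped


# Toy alignment: GAG->GTG (Glu->Val), CTT->CTC (Leu), AAA->AAG (Lys)
print(count_substitutions("ATGGAGCTTAAA", "ATGGTGCTCAAG"))  # (1, 2, 0)
```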
Example: The Effect of RGD Mimetic Peptides in the Breast Cancer Cell Line MCF-7
Introduction • RGD has been used as an inhibitor of integrin-ligand interactions. • Loss of integrin-mediated signaling will induce apoptosis.
Introduction • RGD (Arg-Gly-Asp) is the smallest motif that binds the integrin receptor on the cell surface, and it plays an important role in the cell cycle. • [Figure labels: Aggregation, Cell Death, Control]
Our Study • Human breast cancer cell line MCF-7 • RGD mimetic peptides • Genomic study • Proteomics • Bioinformatics • Cell apoptosis
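The Structures of RGD Mimetic Peptides • [Figure: RGD mimetic peptide structures, including cyclic-RGD; residue labels: Gly, Asp, Arg, Trp, Asp, Tpa, Pro, Arg, Gly, Cys]
[Figure: cells treated with RGD and cRGD, each shown with a control; concentration labels: 1 mM, 0.5 mM, 5 mM, 1 mM]
cDNA Microarray • [Figure panels: cRGD treatment for 6 hr, 24 hr, 48 hr, and 72 hr]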
Apoptosis • 34 genes in total; only 19 remain after filtering • 11 genes show a more than 2-fold change in expression (up or down)
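A "more than 2-fold, up or down" criterion is commonly applied symmetrically on a log2 scale, so that a ratio of 0.5 counts the same as a ratio of 2. The sketch below shows that filter on invented expression ratios (treated/control). The gene names, the values, and the function name are placeholders, not data from this study.

```python
import math


def changed_genes(ratios, fold=2.0):
    """Keep genes whose treated/control ratio changes more than `fold`,
    in either direction, by comparing |log2(ratio)| with log2(fold)."""
    cutoff = math.log2(fold)
    return {g: r for g, r in ratios.items() if abs(math.log2(r)) > cutoff}


# Hypothetical ratios, for illustration only
ratios = {"geneA": 2.5, "geneB": 0.4, "geneC": 1.3, "geneD": 0.6}
print(sorted(changed_genes(ratios)))  # ['geneA', 'geneB']
```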
Caspase Pathway in cRGD-treated MCF7 Cells • [Figure labels: Caspase 10, Caspase 3, Caspase 9, Caspase 8 and FADD, Caspase 4, Caspase 7]
Searching and Clustering of RGD-containing Proteins in the Swiss-Prot Database • In the Swiss-Prot database, there are 541 human RGD-containing proteins, including 5 caspases. • Caspase 8 clustered with integrin beta4 • Caspase 1, caspase 2, caspase 3 and caspase 7 clustered together.
Please pass the genes: horizontal gene transfer • Horizontal gene transfer is the acquisition of genetic material by one organism from another. • Direct uptake • Via a viral carrier
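Finding RGD-containing proteins amounts to a substring search for the tripeptide Arg-Gly-Asp (one-letter code RGD) in each sequence. The sketch below is a minimal version of that search on invented sequences. It does not query Swiss-Prot, and the protein names and sequences are placeholders rather than real entries.

```python
def motif_positions(seq, motif="RGD"):
    """Return 1-based start positions of every occurrence of motif."""
    positions = []
    i = seq.find(motif)
    while i != -1:
        positions.append(i + 1)
        i = seq.find(motif, i + 1)
    return positions


# Hypothetical protein sequences, for illustration only
proteins = {
    "protein_1": "MKRGDSLLRGDA",
    "protein_2": "MAAAGGGLLK",
}
hits = {name: motif_positions(seq) for name, seq in proteins.items()
        if motif_positions(seq)}
print(hits)  # {'protein_1': [3, 9]}
```

The clustering step reported on the slide would need a separate similarity measure between the matching proteins, which is not sketched here.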
Genome Databases • PIR
Genome Databases • Entrez Genomes
Exercises • Weblem 2.1 • Weblem 2.9 • Weblem 3.1 Deadline: Jan 16