On computational analysis of quantitative, 3D spatial expression in Drosophila blastoderm

Soile V. E. Keränen1, Angela DePace2, Ann Hammonds1, Bill Fisher1, Oliver Rübel1, Gunther Weber1, Clara Henriquez1, Charless Fowlkes3, Cris L. Luengo Hendriks4, Lisa Simirenko5, E. Wes Bethel1, Hans Hagen6, Bernd Hamann7, Jitendra Malik6, Susan E. Celniker1, David W. Knowles1, Michael B. Eisen5, Mark D. Biggin1

1) Lawrence Berkeley National Laboratory, Berkeley, CA, USA; 2) Harvard Medical School, Boston, MA, USA; 3) UC Riverside, Riverside, CA, USA; 4) Uppsala University, Uppsala, Sweden; 5) UC Berkeley, Berkeley, CA, USA; 6) University of Kaiserslautern, Kaiserslautern, Germany; 7) UC Davis, Davis, CA, USA

[Figure: relative expression intensity (%) along the D-V-D axis for eve stripe 1 and eve stripe 5 in D. melanogaster and D. pseudoobscura, plotted per stage 5 cohort (4-8%, 9-25%, 26-50%, 51-75%, 76-100%). Cohort sample sizes shown in the legends: n = 126, 192, 390, 362, 289 and n = 54, 194, 174, 55, 77.]

[Figure panels: Predicted regulators, Target pattern, Output pattern. CRM-8493, eve stripe 2, corr = 97.31%; CRM-8198, eve stripe 7, corr = 88.47%; CRM-8331, eve stripe 2+7, corr = 89.44%.]

Computational embryology can reveal subtle differences between species

With cellular resolution data, new morphological features and morphogenetic phenomena can be discovered, which will also lead to new insights in comparative embryology.

[Figure: nuclear density maps (nuclei/µm²) for D. melanogaster and D. pseudoobscura; axes in % egg length.]

Analysis of Drosophila melanogaster blastoderm (left) shows significant anterior-posterior (AP) and dorsal-ventral (DV) nuclear density differences. Similar analyses on D. pseudoobscura blastoderm (right) show analogous but different density patterns. The density differences between the species are likely to depend on subtle quantitative differences in the expression patterns of developmental regulators and in the cellular responsiveness to developmental signaling between the species.

3D cellular resolution quantitation of spatial gene expression

[Figure: eve and Kr expression in the stage 5 cohorts: stage 5:0-3%, 4-8%, 9-25%, 26-50%, 51-75%, 76-100%.]

The development of species-specific morphologies results from complex, quantitative action of gene expression networks. The analysis of such networks requires computationally analyzable, cellular resolution datasets of spatial gene expression. The methods to construct them should be as robust and automated as possible to increase the speed of analysis and to decrease the error. For Drosophila blastoderm stage embryos, we now have data for the protein and mRNA expression of over 100 genes, 7 patterning mutant strains, 23 transgenic promoter constructs, and 3 Drosophila species.

BDTNP data (estimated):
- >6400 PointClouds for stages 4 and 5
- ~3.6 TB raw image data
- ~8.7 GB stage 4 and 5 PointCloud data
- >130 genes including:
  - Drosophila melanogaster wild type + 7 mutants
  - D. pseudoobscura and D. yakuba wild type
  - 23 CRM-lines in D. melanogaster background
(http://bdtnp.lbl.gov/)

[Figure: 300 MB image stack; nuclear segmentation; 1 MB text file for computational analyses.]

Excerpt of the text file as shown on the poster (several rows are cut off at the figure edge):

    id,x,y,z,Nx,Ny,Nz,Da,Db,Vn,Vc,eve,ftz,hb,kni,kr,rho,slp1,sna,tll,croc,fkh,twi,trn,gt
    ******
    1009,142.45,124.288,38.4751,-0.2286,0.37424,-0.89871,0,0,317.559,864.635,13.7058,81.2576,14.1843,21.9375,4.9696,7.9223,10.912,6.5325,18.3249,17.3787,18.499,10.9727,38.7896,80.
    2018,159.302,123.878,164.709,-0.10815,0.3775,0.91967,0,0,308.755,865.85,70.6029,10.0973,25.2076,17.8543,8.8861,33.3451,13.8321,58.1229,17.7753,27.252,20.3735,70.5197,10.4731,2
    3027,111.273,115.159,158.502,-0.23647,0.3794,0.8945,0,0,268.984,685.818,12.2596,5.7213,8.0693,69.3157,6.614,28.261,70.3689,58.5122,15.6545,22.3175,15.5599,63.354,45.7422,40.88
    4036,341.931,30.2893,75.7972,0.18502,-0.94545,-0.26812,0,0,231.338,537.361,19.9463,53.7362,65.8337,21.3702,8.2404,19.4855,8.0923,5.6487,25.3556,18.5914,22.0013,10.0438,22.6887
    5045,168.5,25.747,67.1179,-0.10967,-0.90025,-0.42134,0,0,224.052,630.261,51.517,8.0767,22.2996,23.0654,11.2735,46.5546,12.3596,5.6873,15.8975,24.598,25.5458,9.9865,16.8356,20.
    6054,119.682,108.526,168.746,-0.19491,0.21631,0.95667,0,0,170.923,505.484,67.6004,6.7196,10.9492,24.8757,7.5595,58.568,66.6675,36.9671,23.4779,15.9494,18.4064,59.0062,17.4559,
    1,175.229,155.285,55.1238,-0.14639,0.80211,-0.57896,0,0,285.682,582.293,10.2466,89.2554,23.3279,18.3,10.1307,16.224,13.789,5.0749,15.5772,21.7618,22,15.5339,34.8946,16.5044
    ...

Different views of 6 genes in a D. melanogaster VirtualEmbryo displayed in PointCloudXplore (see Rübel et al., 2006). Top, a 3D embryo view. Center, a cylindrical projection. Bottom, a cylindrical projection with a height map showing gene expression levels.

The Berkeley Drosophila Transcription Network Project (BDTNP) is developing cellular resolution, quantitative expression maps of Drosophila embryos in a computationally analyzable format. Files representing data from one embryo are called PointClouds.
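Because a PointCloud is distributed as comma-separated text, it can be read with generic tools. The following is a minimal Python sketch, assuming the layout visible in the poster's excerpt: one header row of column names followed by one row per nucleus. Real BDTNP files may contain additional lines (the excerpt's `******` line is left out here), the function name `parse_pointcloud` is hypothetical, and the interpretation of `x`, `y`, `z` as nuclear coordinates is an assumption based on the column names.

```python
import csv
import io

# Header and the one complete row are copied from the poster's excerpt.
# Assumption: a header line, then one comma-separated row per nucleus.
SAMPLE = """id,x,y,z,Nx,Ny,Nz,Da,Db,Vn,Vc,eve,ftz,hb,kni,kr,rho,slp1,sna,tll,croc,fkh,twi,trn,gt
1,175.229,155.285,55.1238,-0.14639,0.80211,-0.57896,0,0,285.682,582.293,10.2466,89.2554,23.3279,18.3,10.1307,16.224,13.789,5.0749,15.5772,21.7618,22,15.5339,34.8946,16.5044
"""

def parse_pointcloud(text):
    """Return a list of dicts, one per nucleus, with numeric values."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        row = {name: float(value) for name, value in record.items()}
        row["id"] = int(row["id"])  # keep the nucleus identifier as an integer
        rows.append(row)
    return rows

nuclei = parse_pointcloud(SAMPLE)
first = nuclei[0]
# Presumably the nuclear position; this reading of x, y, z is an assumption.
position = (first["x"], first["y"], first["z"])
print(first["id"], position, first["eve"], first["kr"])
```

Keeping each nucleus as a dictionary keyed by column name makes it easy to pull out one expression channel (for example, every `eve` value) without relying on column order.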
Multiple PointClouds are then aligned to a common framework, termed a VirtualEmbryo, to allow modeling and simulation of the regulation of multiple genes in a 3D environment. The atlas data can be used for computational analyses, e.g., to predict putative regulators of spatial gene expression domains (Fowlkes et al., 2008).

3D analyses of cis-regulatory output show subtle quantitative differences to the wild type pattern

To analyze how sequence is read into expression pattern, the BDTNP is testing suspected or known cis-regulatory modules (CRMs), identified by sequence analyses, in expression constructs. Visual examination of 2D photographs shows that many of these putative CRMs partially recapitulate the wild type blastoderm patterns. To make cellular resolution comparisons between the expression output of a CRM-transgene and the wild type expression, we have measured the 3D expression patterns of a set of CRM-transgenes and aligned them to a VirtualEmbryo containing wild type gene expression patterns. We found that most of the studied CRM-transgenes show quantitative differences in expression, some subtle and some not, compared to intact genes. While discrepancies between CRMs and intact gene patterns have been noted in a few cases before (e.g., Schroeder et al., 2004), more often the similarities have been emphasized. In contrast, our results suggest that quantitative and qualitative differences are so common that the current gene regulatory models based on CRM sequences need to be calibrated against actual experimental data.

CRMs showing weak to moderate blastoderm expression are expressed at much higher levels in late embryos, often at multiple stages and in multiple cell types.

Three example D. melanogaster CRMs with a weak early pattern and a strong late pattern in D. melanogaster embryos.

We believe that many CRMs that drive early gene expression are multifunctional.
This is likely to impose extra evolutionary constraints on the sequence and structure of the CRM-promoter combinations. It is also possible that the early expression patterns are largely non-functional and simply tolerated because of the later gene function. Such patterns would be free to acquire new, essential functions, thus imposing further constraints on the CRM.

Computational analyses of gene expression show that the qualitatively similar eve stripe 1 and eve stripe 5 differ in relative intensity between D. melanogaster and D. pseudoobscura. The error bars show 95% confidence intervals. We think that small quantitative expression differences are very common even between closely related species and that they play a significant role in the evolution of species-specific morphologies and in speciation in general.

Cellular resolution data can also be used for rapidly measuring, in large numbers of embryos, features that could in principle be counted by hand, such as the total number of nuclei and egg sizes (left). With a sufficiently large sample size, we can show that nuclear numbers at stage 5 scale with egg size in both Drosophila species, and that the distributions of nuclear numbers are different, though individual embryos may fall within the parameters of another species. Further basic measurements using large data sets are likely to reveal further developmental rules, or constraints on morphological evolution.

Three example D. melanogaster CRM-transgenes with blastoderm patterns, 2D photographs vs 3D expression intensity maps: 8001 (above) accurately replicates the gt posterior band; 8086 (right) has some similarity to the kni pattern but with clear spatial and quantitative a/p and d/v differences; 8010 (far right) replicates odd stripes 3 and 6 but with differences in the d/v intensity profile.
(red: CRM-LacZ; green: wild type expression pattern)

Future prospects

Uses of cellular resolution embryo atlases:
- Digital reference maps of gene expression and fine level anatomical detail from anatomy to systems genomics (“computational histology”)
- Novel form of computational biology resources for data mining and rapid multidimensional analyses of morphogenesis, gene expression and potential modes of microevolution and speciation
- Computational platforms for modeling and simulating gene regulatory network function and pattern formation in a realistic environment (“virtual embryos as in silico experimental model organisms”)

We are currently:
- Analyzing transcription factor binding in putative CRMs to combine ChIP-chip and ChIP-seq data with the spatial pattern data
- Expanding the expression pattern data set for wild type target gene mRNAs, wild type regulator proteins and CRM-LacZ reporter strains
- Expanding the methods for generating cellular resolution embryo atlases to later stage Drosophila embryos
- Developing approaches for computational analyses of spatial PointCloud data

Future needs:
- PointCloud maps of multiple late embryonic stages and more species
- Dissemination of new computational tools and 3D spatial datasets to the Drosophila community and developmental biologists in general
- Development of an interactive ontology map for the Drosophila embryo describing each cell’s organ and tissue identity, expression profile, and lineage
- More sophisticated algorithms and computational strategies for handling and analysis of spatial expression data and for integrating it with other forms of genomic data (sequence, microarray, protein-interaction, etc.)

Interactive displays in computational modeling and mining the PointCloud data: PointCloudXplore/Matlab-interface
The spatial patterns of regulator availability (network input) may have multiple implications for the function and evolution of regulatory sequences. However, a VirtualEmbryo represents the spatial expression profile of a blastoderm as a text file tens of MB in size, which can be nonintuitive to an average biologist. To facilitate the use of these quantitative cellular resolution embryo atlases we have provided a visualization and analysis tool, PointCloudXplore, which can display the expression data both as an embryo and as an abstract view. To increase the flexibility of PointCloudXplore, we have now added a Matlab interface to it, which can be used for exporting the PointCloud data for Matlab analyses, running the analysis scripts and/or importing the results back into PointCloudXplore (Rübel et al., submitted). This enables versatile use of various Matlab functions for rapid development of new analysis and data plotting scripts without having to alter PointCloudXplore itself.

As an example of a newly added data mining capability, we selected the early and late hb expression patterns (above right) with the PointCloudXplore/Matlab-interface. An associated short Matlab script then computed their difference, which was returned to PointCloudXplore to be viewed as an artificial expression channel (below right). Sharing Matlab scripts for PointCloud data manipulation, for example via a curated databank, would be of further advantage to the user community.

The PointCloudXplore/Matlab-interface can also be used for more complicated simulation and analysis programs. We have used it to run a genetic algorithm to see if we can identify potential regulators of individual eve stripes. We found that while the simulations were quite successful in identifying the known regulators of eve stripe 2, the results for eve stripe 7 were less promising. We assume that this is due to an insufficient gene regulatory model in our optimization algorithm.
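Both Matlab-interface examples above come down to per-nucleus array operations: subtracting one expression channel from another, and scoring how closely a model output matches a target pattern. The sketch below uses Python rather than Matlab purely for illustration and runs on made-up values, not BDTNP measurements. The function names are hypothetical, and Pearson correlation is only an assumed stand-in for the similarity score, since the poster does not define how output and target patterns were compared.

```python
import math

def difference_channel(early, late):
    """Per-nucleus difference between two expression channels (late minus early)."""
    if len(early) != len(late):
        raise ValueError("channels must cover the same nuclei")
    return [l - e for e, l in zip(early, late)]

def pearson(a, b):
    """Pearson correlation between two per-nucleus expression patterns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Illustrative values only: four nuclei, two hb time points.
hb_early = [80.0, 60.0, 20.0, 5.0]
hb_late = [70.0, 65.0, 10.0, 30.0]
hb_diff = difference_channel(hb_early, hb_late)  # could be viewed as an artificial channel

# Hypothetical target stripe and model output over the same four nuclei.
target = [0.0, 10.0, 90.0, 10.0]
output = [5.0, 20.0, 80.0, 15.0]
score = pearson(target, output)
print(hb_diff, round(score, 4))
```

A score of this kind could serve as the fitness of a candidate regulator set in an optimization loop, but a high correlation alone does not show that the candidate regulators are the biologically correct ones.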
We also found that eve stripes 2 and 7 could theoretically be co-regulated, as also suggested by some earlier authors (Hare et al., 2008; Janssens et al., 2006). Since differences between ab initio modeling and biological experiments can be useful both for revealing weaknesses in theoretical models and for discovering new phenomena, we believe that cellular resolution computer atlases of embryos and organs, and in silico experimentation with them, will become an important tool in biology, parallel to in vitro and in vivo experiments.

The schematic view of the PointCloudXplore/Matlab-interface: 1) The various expression views and the built-in data filtering capabilities in PointCloudXplore allow easy selection of genes of interest. 2) The exported data is analyzed using a Matlab script. 3) The output is returned to PointCloudXplore for easy visualization or further data filtering.

The input data (above right) used for the ab initio attempt to find potential regulators of individual eve stripes, and the results (right). The predictions that agree with experimental data are shown in green, the predictions that disagree are shown in red, and the results that neither agree nor disagree with experimental data are shown in black, as are the stripe 2+7 results. The stripe 7 results show that the output of a regulatory network can be mimicked by a set of regulators that is different from the experimentally verified one.

References:
Rübel et al., 2006. Proc. Eurographics/IEEE-VGTC Symp. on Visualization.
Schroeder et al., 2004. PLoS Biol. 2(9):E271.
Hare et al., 2008. PLoS Genet. 4(6):e1000106.
Janssens et al., 2006. Nature Genet. 38(10):1159-65.