NETWORK ANALYSIS: A FOCUS ON GENETIC NETWORKS. Pasquale Lucio Scandizzo, Alessandra Imperiali. Paper prepared for presentation at the 16th ICABR Conference – 128th EAAE Seminar “The Political Economy of the Bioeconomy: Biotechnology and Biofuel”, Ravello, Italy, June 24-27, 2012.
Outline • Introduction • Literature and review • Data and variables • Empirical analysis • Concluding remarks
Introduction Aim of this paper • Investigate the characteristics of genetic networks. • Account for the role of desirable traits for modern agriculture. • Explore the implications of a model in which the desirable traits depend not only on the properties of the individual genes, but also on their connections and the architecture of the network.
Introduction Key Questions • Is biotechnology research in agriculture making substantial progress along new lines? • Are there new paradigms for research in the sector? • What is the role of network theory? • Does it promise to yield revolutionary results?
Introduction Results overview • Part 1 • Consistent with empirical research on large networks, our findings point to a scale-free relationship between the number of links among genes and the number of genes, with a significant and proportional or more than proportional percentage increase in links in response to a percentage increase in nodes inside each co-expression network. • This relationship can be interpreted as the result of information exchanges, i.e. as a relationship between the information contained and the information exchanged by the genes. • Part 2 • Our findings point to a positive, but less than proportional, scale-free relationship between the number of co-expressed genes and the number of genes inside each co-expression network.
Literature and review An overview of a rice co-expression network with 4,495 genes and 32,544 edges (Pearson’s correlation r ≥ 0.93). Source: A. Fukushima et al., “Characterizing gene coexpression modules in Oryza sativa based on a graph-clustering approach”, 2009.
Literature and review From multiplicity of characters to co-expression In the past 15 years, intense research activity has been directed towards biological networks, where network theory finds its natural application. These studies have produced remarkable progress in understanding the topological and chemical structures of the genes, and promise to make spectacular improvements to agricultural crops. Today genes can be modified and recombined into the cells of living organisms to improve crop productivity or to make crops more resistant to stress, diseases and chemical treatments (Steven D. Tanksley and Susan R. McCouch, 1997). This new technique is known as recombinant DNA technology, and has allowed scientists to carry out procedures using genes and DNA that are extremely advanced and innovative.
Literature and review From multiplicity of characters to co-expression • Recombinant DNA (rDNA) consists of DNA sequences resulting from laboratory methods that bring together genetic material from multiple sources. • Scientists succeeded in isolating the genes responsible for the main adaptive and improvement traits and were able to determine their chemical structure, together with their function. • This acquired knowledge was then used to develop the potential of our wild and cultivated germplasm resources for improving agricultural crops. • After spending decades disassembling nature, and having provided a wealth of knowledge about the individual components, scientists developed a theory of complexity where nothing happens in isolation and most of the characteristics of living beings derive from the interactions among their constituents.
Literature and review From multiplicity of characters to co-expression Understanding and unraveling the interactions and the orchestrated activity of many interacting components constitutes a major goal for biologists of the genome era. Network approaches are used to integrate various types of genomics data in order to increase the reliability of predicted interactions. • One increasingly important method used to identify interacting gene sets is the construction of gene co-expression networks, where traits are the result of the cooperative expression (co-expression) of genes, organized according to the topology of networks. • Co-expression networks include genes involved in related biological pathways, which are expressed cooperatively for their functions. • A co-expression network is constructed by determining the tendency of m transcripts to exhibit similar expression patterns across a set of n microarrays (P. Ficklin and F. Alex Feltus, 2011).
Literature and review Gene co-expression network • In gene co-expression networks, each gene corresponds to a node. • Two genes are connected by an edge if their expression values are highly correlated. • Definition of “high” correlation is somewhat tricky • One can use statistical significance… • Alternatively, one can use a threshold parameter: scale free topology criterion.
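The threshold rule on this slide can be illustrated with a minimal Python sketch. It is not taken from any of the cited studies: the function name `coexpression_edges`, the toy expression values and the default cut-off of 0.93 (borrowed from the rice network caption earlier in the deck) are assumptions made only for illustration.

```python
import numpy as np

def coexpression_edges(expr, threshold=0.93):
    """Build a co-expression network from a genes x samples matrix.

    Hypothetical helper: two genes are linked when the Pearson
    correlation of their expression profiles is >= threshold.
    Returns the boolean adjacency matrix and the number of edges.
    """
    corr = np.corrcoef(expr)             # gene-by-gene Pearson correlations
    adj = corr >= threshold              # edge wherever correlation is "high"
    np.fill_diagonal(adj, False)         # a gene is not linked to itself
    n_edges = int(np.triu(adj, k=1).sum())  # count each undirected edge once
    return adj, n_edges

# Toy expression data (made up): 4 genes measured in 5 conditions.
expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],    # gene 0
    [2.0, 4.0, 6.0, 8.0, 10.5],   # gene 1, almost proportional to gene 0
    [5.0, 3.0, 4.0, 1.0, 2.0],    # gene 2, negatively related to gene 0
    [1.0, -1.0, 1.0, -1.0, 1.0],  # gene 3, unrelated oscillation
])

adj, n_edges = coexpression_edges(expr)
degrees = adj.sum(axis=1)  # number of edges incident with each gene
```

In this toy matrix only genes 0 and 1 pass the cut-off, so the network has a single edge. Lowering `threshold` adds edges, which is why the choice of cut-off matters for the resulting topology.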
Literature and review From multiplicity of characters to co-expression By exploring several large databases describing the topology of large networks, Albert and Barabasi (1999) found that the degree distribution follows a power law for large k: $P(k) \sim k^{-\gamma}$, where k stands for the degree of a node i, i.e. the number of edges incident with the node, and P(k) stands for the probability that a node chosen uniformly at random has degree k. The value of the exponent $\gamma$ varies between 2 and 3. Following this approach, the literature indicates that the intricate interwoven relationships that govern cellular functions follow a universal law. They are “scale-free, modular, hierarchical, small worlds of short paths and their connections are highly clustered” (Albert-Laszlo Barabasi and Zoltan N. Oltvai, 2004).
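As a rough illustration of how the power-law degree distribution described above can be checked on data, the sketch below estimates the exponent from a list of node degrees by fitting a straight line to the empirical distribution in log-log space. The function name `powerlaw_exponent` and the synthetic degree counts are assumptions for illustration; a log-log least-squares fit is a simple but crude estimator, and maximum-likelihood methods are generally preferred for real networks.

```python
import numpy as np

def powerlaw_exponent(degrees):
    """Estimate the power-law exponent of a degree distribution.

    Hypothetical helper: fits log P(k) = c - gamma * log k by least
    squares over the observed positive degrees and returns gamma.
    """
    degrees = np.asarray(degrees)
    ks, counts = np.unique(degrees[degrees > 0], return_counts=True)
    pk = counts / counts.sum()                     # empirical P(k)
    slope, _intercept = np.polyfit(np.log(ks), np.log(pk), 1)
    return -slope

# Synthetic degree counts (made up) chosen so that P(k) is proportional to k**-2.
degree_counts = {1: 400, 2: 100, 4: 25, 5: 16, 10: 4, 20: 1}
degrees = np.repeat(list(degree_counts.keys()), list(degree_counts.values()))

gamma = powerlaw_exponent(degrees)  # close to 2 for this synthetic example
```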
Literature and review Distribution of connections per node in the co-expression network. Source: V. van Noort et al., “The yeast coexpression network has a small-world, scale-free architecture and can be explained by a simple model”, 2004.
Data and variables Meta-analysis • In the paper we discuss recombinant DNA techniques and their application to agricultural crops, focusing our attention on the regulatory mechanism in gene interactions. • We aim to study the interactions underlying expressed traits, using three crop species: Arabidopsis thaliana, maize (Zea mays) and rice (Oryza sativa). • We selected 57 studies that aimed to identify the gene co-expression networks among these crops by examining the co-expression patterns of genes over a large number of experimental conditions. • We collected both the data and the results presented by the studies on 101 networks.
Empirical analysis Estimates Aim: Identify the correspondence between the underlying genes and the observed traits. • We analyzed the effect of the number of genes on two different characteristics: the number of edges (L) and the number of co-expressed genes (C). • Using the data assembled from the studies, we estimated two different relations, one for each characteristic, by means of ordinary least squares (OLS).
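The estimated equations are not preserved in this transcript. The sketch below shows one common way to estimate a scale-free relation by OLS, under the assumption of a log-log specification in which the slope is the elasticity of L (or C) with respect to the number of genes. The function name `loglog_ols`, the variable names and all numbers are made up for illustration and are not the paper's data or estimates.

```python
import numpy as np

def loglog_ols(x, y):
    """OLS fit of log(y) = alpha + beta * log(x).

    Hypothetical helper: returns (alpha, beta), where beta is the
    elasticity of y with respect to x.
    """
    X = np.column_stack([np.ones(len(x)), np.log(x)])  # intercept + log regressor
    coef, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
    return coef[0], coef[1]

# Made-up network sizes and outcomes, NOT the 101 networks of the study.
n_genes = np.array([50.0, 200.0, 800.0, 3200.0])   # number of genes per network
n_edges = 0.5 * n_genes ** 1.2                     # edges (L), elasticity above 1
n_coexpressed = 3.0 * n_genes ** 0.8               # co-expressed genes (C), elasticity below 1

_, beta_edges = loglog_ols(n_genes, n_edges)
_, beta_coexp = loglog_ols(n_genes, n_coexpressed)
```

Under this specification, a slope of 1 or more means the outcome grows proportionally or more than proportionally with the number of genes, and a slope between 0 and 1 means it grows less than proportionally.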
Empirical analysis Variables used for our estimates
Empirical analysis The Estimates
Empirical analysis The Estimates
Concluding remarks Conclusion • Our analysis confirms the existence of a scale-free relationship, which has been found to be ubiquitous in complex networks. • We also found that as the number of genes increases inside the biological networks considered, the number of co-expressed genes increases less than proportionally. • This finding suggests that a research strategy aimed at identifying relevant clusters of co-expression may be more successful than one aimed at identifying single traits or groups of traits and the corresponding gene determinants. • Although this conclusion seems to hold for stress response, metabolic pathways and biosynthesis, which have a lower influence on the number of edges, the intensity of seed development appears to be associated instead with an increase in the connectivity of the network.