Finding Modules in Networks with Non-modular Regions

Sharon Bruckner, Bastian Kayser, Tim Conrad (Freie Universität Berlin)

What are networks with non-modular regions?
Can every network be fully partitioned into dense clusters? We introduce NCC networks.
[Figure: a network with a modular region and a transition region]

Where do NCC networks occur?
• The network is actually fully partitionable, but contains noise.
• The network structure is not strictly modular:
  • Overlaps
  • Outliers
  • Paths and nodes connecting modules
• Example: A protein-protein interaction network

Formalizing this notion of modularity
Goal: A score that quantifies how modular an NCC network is, analogous to Newman's modularity.
Then: What's wrong with simply taking Newman modularity?
Answer: Trees and very sparse graphs have high Newman modularity.
[Figure label: 0.75]
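The point about trees can be checked numerically. The sketch below is an illustration rather than anything from the slides: it computes the standard Newman modularity of a path graph (a tree) cut into contiguous blocks. The function names and the choice of 100 nodes and 10 blocks are assumptions made for the example.

```python
import numpy as np

def newman_modularity(adjacency, labels):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    adjacency = np.asarray(adjacency, dtype=float)
    labels = np.asarray(labels)
    degrees = adjacency.sum(axis=1)
    two_m = degrees.sum()  # twice the number of edges
    q = 0.0
    for c in np.unique(labels):
        members = labels == c
        # Sum over the block counts every internal edge twice.
        internal = adjacency[np.ix_(members, members)].sum()
        q += internal / two_m - (degrees[members].sum() / two_m) ** 2
    return q

def path_graph(n):
    """Adjacency matrix of a path on n nodes (a tree with no dense regions)."""
    a = np.zeros((n, n))
    idx = np.arange(n - 1)
    a[idx, idx + 1] = 1
    a[idx + 1, idx] = 1
    return a

# Illustrative sizes (assumed): 100 nodes cut into 10 contiguous blocks.
n_nodes, n_blocks = 100, 10
path_adjacency = path_graph(n_nodes)
block_labels = np.arange(n_nodes) // (n_nodes // n_blocks)
path_q = newman_modularity(path_adjacency, block_labels)
print(round(path_q, 3))
```

Although a path contains no dense clusters at all, this partition scores above 0.8, which is why Newman modularity alone is not a suitable measure here.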

Transition matrices and gaps
• In a discrete random walk on a network, the random walker moves at each step to a randomly chosen neighbor. This is encoded in the transition matrix.
• Clustering algorithms often rely on the gap in the eigenvalue spectrum of the transition matrix.
• The presence of eigenvalues close to 1, followed by a gap, indicates the presence of modules.
• For NCC networks, this is not enough. Therefore we introduce a transition matrix P_t that comes from a continuous random walk.
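A minimal sketch of these objects, under stated assumptions: the discrete transition matrix is taken as P = D^-1 A, and the continuous-time matrix is assumed to be P_t = exp(t (P - I)), since the slides do not state the exact generator. The helper names and the two-clique toy graph are hypothetical.

```python
import numpy as np

def two_cliques(size):
    """Toy graph: two cliques of `size` nodes joined by a single edge."""
    n = 2 * size
    a = np.zeros((n, n))
    a[:size, :size] = 1
    a[size:, size:] = 1
    np.fill_diagonal(a, 0)
    a[size - 1, size] = a[size, size - 1] = 1
    return a

def transition_matrix(adjacency):
    """Discrete random walk: row i holds the probabilities of leaving node i."""
    adjacency = np.asarray(adjacency, dtype=float)
    return adjacency / adjacency.sum(axis=1)[:, None]

def walk_spectrum(adjacency):
    """Eigenvalues of P, largest first, via the similar symmetric matrix
    S = D^-1/2 A D^-1/2 (same spectrum, numerically better behaved)."""
    adjacency = np.asarray(adjacency, dtype=float)
    d_sqrt = np.sqrt(adjacency.sum(axis=1))
    sym = adjacency / np.outer(d_sqrt, d_sqrt)
    return np.sort(np.linalg.eigvalsh(sym))[::-1]

def continuous_time_matrix(adjacency, t):
    """ASSUMPTION: P_t = exp(t (P - I)); the slides do not give the generator."""
    adjacency = np.asarray(adjacency, dtype=float)
    d_sqrt = np.sqrt(adjacency.sum(axis=1))
    sym = adjacency / np.outer(d_sqrt, d_sqrt)
    eigvals, eigvecs = np.linalg.eigh(sym)
    heat = (eigvecs * np.exp(t * (eigvals - 1.0))) @ eigvecs.T
    # Undo the symmetrization: f(P) = D^-1/2 f(S) D^1/2.
    return heat * d_sqrt[None, :] / d_sqrt[:, None]

adjacency = two_cliques(5)
P = transition_matrix(adjacency)
spectrum = walk_spectrum(adjacency)
P_t = continuous_time_matrix(adjacency, t=2.0)
print(np.round(spectrum[:3], 3))
```

For the two-clique graph, two eigenvalues lie close to 1 and the rest are far below, matching the two modules.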

First try: the gap score
Exploit the connection between the number of eigenvalues close to 1 of the transition matrix and the number of modules.
• Compute [formula missing].
• Find the largest difference between two consecutive eigenvalues.
• Return [formula missing] as score.
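The steps above might be sketched as follows. This is an illustration only: the quantities computed and returned on the slide were not preserved, so the sketch assumes the spectrum of P_t = exp(t (P - I)) and returns the size of the largest consecutive gap together with the number of eigenvalues above it. The function names and the parameter `t` are hypothetical.

```python
import numpy as np

def gap_score(adjacency, t=1.0):
    """Largest difference between consecutive eigenvalues of the assumed P_t.

    Returns (score, k), where k is the number of eigenvalues above the gap.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    d_sqrt = np.sqrt(adjacency.sum(axis=1))
    sym = adjacency / np.outer(d_sqrt, d_sqrt)          # same spectrum as P
    eigvals = np.sort(np.linalg.eigvalsh(sym))[::-1]    # largest first
    eigvals_t = np.exp(t * (eigvals - 1.0))             # spectrum of exp(t (P - I))
    gaps = eigvals_t[:-1] - eigvals_t[1:]
    position = int(np.argmax(gaps))
    return float(gaps[position]), position + 1

def two_cliques(size):
    """Toy modular graph: two cliques joined by a single edge."""
    n = 2 * size
    a = np.zeros((n, n))
    a[:size, :size] = 1
    a[size:, size:] = 1
    np.fill_diagonal(a, 0)
    a[size - 1, size] = a[size, size - 1] = 1
    return a

def path_graph(n):
    """Toy sparse graph without modules."""
    a = np.zeros((n, n))
    idx = np.arange(n - 1)
    a[idx, idx + 1] = a[idx + 1, idx] = 1
    return a

clique_score, clique_k = gap_score(two_cliques(5))
path_score, _ = gap_score(path_graph(50))
print(clique_k, clique_score > path_score)
```

On the two-clique graph the largest gap follows the second eigenvalue, while the path graph has an evenly spread spectrum and only a small gap.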

Gap score: Analysis and results
• Sparse networks do not have a high gap score.
• Example: road networks, with Newman modularity ~ 0.95 and gap score of ~ 0.002.
• Drawback 1: Arbitrariness. There is no one "true" gap in the spectrum.
• Drawback 2: clear gap ⇒ existence of modules, but existence of modules ⇏ clear gap.
Main Advantage: This is a global score, not dependent on a partition.

Second try: metastability score
• Motivated by the concept of metastability of physical systems.
• A metastable partition of a network into a transition region T and modules [symbols missing] satisfies: [formula missing]
So, the time spent inside the modules should be long, while the time spent inside the transition region should be short.

Metastability score: Analysis
For a given network and a given partition we can then define the metastability score as [formula missing].
Main Drawback: This is a score for a given partition, not global!
• The score depends on the number of modules m.
• Cannot be optimized: for every partition where [condition missing] we will have [formula missing] and therefore an optimal score.
Main Advantage: Explicitly takes into account the transition region, so it is more suited for NCC networks.

Experimental results on the two scores
For networks with two modules of size 100 with increasing density and a transition region of 1000 nodes.

Intermediate Conclusion
• Both scores have some major disadvantages.
• Currently developing an improved score that:
  • takes the transition region into account
  • is provably "good", at least on some well-defined network classes
  • is dependent on the partition (so, not global), but the partition can be optimized.

Algorithms for identifying modules in NCC networks
We compare the behavior of 3 algorithms on a benchmark dataset of NCC networks:
• MSM: The Markov State Model clustering algorithm first identifies and removes the transition region, and then deterministically clusters the remaining nodes.
• SCAN: Clusters nodes together based on neighborhood similarity and reachability. Assigns nodes the roles of hubs and outliers.
• MCL: The Markov Clustering algorithm simulates random walks on the network and identifies modules as regions where the random walker stays for a long time. Returns a full partition.
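To make the MCL description concrete, here is a minimal sketch of the textbook expansion/inflation iteration. It is not the implementation used in the experiments; the parameter defaults, the 1e-6 support threshold, and the toy graph are assumptions for illustration.

```python
import numpy as np

def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-8):
    """Textbook Markov Clustering; returns one module label per node."""
    n = len(adjacency)
    m = np.asarray(adjacency, dtype=float) + np.eye(n)  # add self-loops
    m /= m.sum(axis=0, keepdims=True)                   # column-stochastic
    for _ in range(max_iter):
        expanded = m @ m                                # expansion: two walk steps
        inflated = expanded ** inflation                # inflation: favor strong flow
        inflated /= inflated.sum(axis=0, keepdims=True)
        converged = np.abs(inflated - m).max() < tol
        m = inflated
        if converged:
            break
    # Nodes whose columns are supported on the same attractor rows share a module.
    seen, labels = {}, []
    for j in range(n):
        key = tuple(np.flatnonzero(m[:, j] > 1e-6))
        labels.append(seen.setdefault(key, len(seen)))
    return np.array(labels)

def two_cliques(size):
    """Toy graph: two cliques joined by a single edge."""
    n = 2 * size
    a = np.zeros((n, n))
    a[:size, :size] = 1
    a[size:, size:] = 1
    np.fill_diagonal(a, 0)
    a[size - 1, size] = a[size, size - 1] = 1
    return a

mcl_labels = mcl(two_cliques(5))
print(mcl_labels)
```

Because every node ends up in some module, the output is a full partition, which is why MCL needs a post-processing step before it can report a transition region.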

Adjusting the algorithms
• For SCAN, outliers and hubs are additionally assigned to the transition region.
Main Adjustment: Nodes in modules smaller than a threshold of 1% of all nodes are assigned to the transition region.
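The adjustment could look like the following sketch. The label `-1` for the transition region, the function name, and the `extra_transition` argument (for example SCAN hubs and outliers) are hypothetical choices made for this example.

```python
import numpy as np

TRANSITION = -1  # hypothetical label for the transition region

def reassign_small_modules(labels, min_fraction=0.01, extra_transition=()):
    """Move nodes of modules smaller than min_fraction * n into the transition region.

    extra_transition: indices of nodes to move unconditionally
    (e.g. hubs and outliers reported by SCAN).
    """
    labels = np.array(labels, copy=True)
    extra = np.asarray(list(extra_transition), dtype=int)
    labels[extra] = TRANSITION
    n = labels.size
    module_ids, counts = np.unique(labels[labels != TRANSITION], return_counts=True)
    small = module_ids[counts < min_fraction * n]
    labels[np.isin(labels, small)] = TRANSITION
    return labels

# 1000 nodes: two large modules and one module of 5 nodes (below the 1% threshold).
raw_labels = np.array([0] * 600 + [1] * 395 + [2] * 5)
adjusted = reassign_small_modules(raw_labels, extra_transition=[0, 1])
print(np.unique(adjusted, return_counts=True))
```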

How can we evaluate the results? Benchmark networks:
• A parameterized random graph model where:
  • modules: ER (Erdős-Rényi) graphs having the same size and connection probability
  • the transition region is an ER graph with [parameters missing]
  • the nodes in M and T are then connected w.p. [symbol missing]
• Vary the ratio of the size of M to T and the density of M to T.
• Networks with 1000 nodes, 5 modules.
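A generator for such a benchmark might be sketched as below. The parameter names and default values (`p_module`, `p_transition`, `p_between`, module and transition sizes) are hypothetical, because the slide's actual symbols and values were not preserved. The sketch also assumes that different modules are not connected to each other directly, which the slide does not state.

```python
import numpy as np

def ncc_benchmark(n_modules=5, module_size=100, transition_size=500,
                  p_module=0.3, p_transition=0.01, p_between=0.005, seed=0):
    """Random NCC-style network; returns (adjacency, labels), label -1 = transition."""
    rng = np.random.default_rng(seed)
    n = n_modules * module_size + transition_size
    labels = np.full(n, -1)
    for k in range(n_modules):
        labels[k * module_size:(k + 1) * module_size] = k

    in_t = labels < 0
    same_module = (labels[:, None] == labels[None, :]) & ~in_t[:, None]
    both_transition = in_t[:, None] & in_t[None, :]
    module_to_transition = in_t[:, None] != in_t[None, :]

    # Edge probability per node pair; pairs in different modules stay at 0 (assumption).
    prob = np.zeros((n, n))
    prob[same_module] = p_module
    prob[both_transition] = p_transition
    prob[module_to_transition] = p_between

    # Sample the upper triangle only, then mirror it to get an undirected graph.
    upper = np.triu(rng.random((n, n)) < prob, k=1)
    adjacency = (upper | upper.T).astype(int)
    return adjacency, labels

bench_adjacency, bench_labels = ncc_benchmark()
print(bench_adjacency.shape, int(bench_adjacency.sum()) // 2)
```

Varying `module_size` against `transition_size`, and `p_module` against `p_transition`, corresponds to the two ratios the slide says are varied.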

How can we evaluate the results? Evaluation scores:
• Compare the "ground truth" partition from the construction of the benchmark set with the partition found by the algorithm.
• Construct 3 scores based on the well-known Rand Index:
  • [symbol missing] evaluates how well the algorithm separates the transition region from the modular region
  • [symbol missing] measures the quality of the clustering within the modular region
  • [symbol missing] is a combined score.
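Since the exact definitions of the three scores were not preserved, the following is only one plausible reading built on the plain Rand Index. The function names, the `-1` transition label, and the choice to restrict the within-module score to nodes that both partitions place in modules are assumptions. The combined score is omitted because its formula is unknown.

```python
import numpy as np

TRANSITION = -1  # hypothetical label for the transition region

def rand_index(labels_a, labels_b):
    """Fraction of node pairs on which two partitions agree (same/different group)."""
    a = np.asarray(labels_a)
    b = np.asarray(labels_b)
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    pairs = np.triu_indices(a.size, k=1)
    return float(np.mean(same_a[pairs] == same_b[pairs]))

def separation_score(truth, found):
    """Rand Index of the two-way split 'transition region vs. modular region'."""
    truth, found = np.asarray(truth), np.asarray(found)
    return rand_index(truth == TRANSITION, found == TRANSITION)

def within_modules_score(truth, found):
    """Rand Index restricted to nodes that both partitions place in a module."""
    truth, found = np.asarray(truth), np.asarray(found)
    keep = (truth != TRANSITION) & (found != TRANSITION)
    if keep.sum() < 2:
        return float("nan")
    return rand_index(truth[keep], found[keep])

truth = [0, 0, 0, 1, 1, 1, -1, -1]
found = [0, 0, 1, 1, 1, 1, -1, 0]
sep = separation_score(truth, found)
within = within_modules_score(truth, found)
print(sep, within)
```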

Experiment 1: varying sizes
Comparing the score [symbol missing] for SCAN, MCL and MSM on networks with varying sizes of T.
• MSM performs best.
• The metastability score behaves similarly to the algorithm score.

Experiment 2: varying densities of modules and transition region
Plotting [symbol missing] for the 3 algorithms with different combinations of [symbol missing] and [symbol missing], along with the gap score.

Experiment 3: A PPI network
We identified modules in the yeast FYI network. Our modules correspond to known protein complexes from the CYC2008 database. We are working on assigning roles to the nodes in the transition region.

Summary and Outlook
• Clustering networks such that not all nodes are assigned to modules is useful.
• We presented two scores to quantify how modular a network is, and showed that there is room for improvement.
• We compared the performance of 3 algorithms on the task of identifying modules in NCC networks.
