CIS 4930/6930 – Recent Advances in Bioinformatics, Spring 2014
Network construction from RNAi data
Tamer Kahveci
Signaling Networks MAPK network
Signal reachability Luciferase Reporter Receptor
Signaling and RNA Interference Luciferase X Not critical X Reporter Receptor Critical
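The knockdown idea on this slide can be sketched in a few lines of Python. This is only an illustration, assuming the network is a list of directed edges and that a gene counts as critical when removing it blocks every receptor-to-reporter path; the names `reachable`, `is_critical`, and the toy network are hypothetical and not from the slides.

```python
from collections import deque

def reachable(edges, source, target, removed=frozenset()):
    """Return True if target can be reached from source while skipping removed genes."""
    if source in removed or target in removed:
        return False
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen and nxt not in removed:
                seen.add(nxt)
                queue.append(nxt)
    return False

def is_critical(edges, receptor, reporter, gene):
    """Assumed definition: knocking the gene down stops the signal at the reporter."""
    return not reachable(edges, receptor, reporter, removed={gene})

# Hypothetical toy network: two parallel branches (a, b) merge at c before the reporter T.
toy_edges = [("R", "a"), ("a", "c"), ("R", "b"), ("b", "c"), ("c", "T")]
```

In the toy network, knocking down `c` silences the reporter, while knocking down `a` alone does not because the signal still travels through `b`.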
Signaling Network Reconstruction from RNAi data Reporter Receptor Not critical Critical
RNAi data and Reference Network Reference network Insert Reporter Receptor Not consistent ! Consistent ! Delete Not critical Critical
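Overview
Goal: Minimize the number of edit operations (edge insertions and deletions) needed to make the reference network consistent with the RNAi data. The problem is NP-complete.
• Given: reference network GR = (VR, ER) and the RNAi constraints (a binary vector, e.g. 1 0 0 1 0 1 0)
• Find: target network GT = (VT, ET)
• Two algorithms: SiNeC (Signal Network Constructor) and S-SiNeC (Scalable Signal Network Constructor)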
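The objective being minimized can be made concrete with a short sketch. It assumes both networks are sets of directed edges over the same genes, so the number of edit operations is simply the size of the symmetric difference between ER and ET; the function name `edit_operations` is hypothetical.

```python
def edit_operations(reference_edges, target_edges):
    """Return (insertions, deletions) that turn the reference edge set into the target."""
    reference, target = set(reference_edges), set(target_edges)
    insertions = target - reference   # edges that must be added to the reference
    deletions = reference - target    # edges that must be removed from the reference
    return insertions, deletions

def edit_cost(reference_edges, target_edges):
    """Number of edit operations, assuming each insert or delete costs one."""
    insertions, deletions = edit_operations(reference_edges, target_edges)
    return len(insertions) + len(deletions)
```

The hard part of the problem is choosing the target network, not counting the edits: many target networks are consistent with the constraints, and the goal is the one with the smallest cost.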
SiNeC algorithm
Three steps:
• Order the critical genes left to right based on the topology of GR [Sloan, 1986], giving v1, v2, …, vc
• Edge deletion phase
• Edge insertion phase
Step 1: Order critical genes
Prioritize based on distance to the reporter + degree. (Figure: critical genes numbered 1, 2, 3 between Receptor and Reporter)
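A rough sketch of such an ordering follows. The slide does not say how distance and degree are combined, so this version makes an assumption: critical genes are sorted by decreasing distance to the reporter (farthest gene leftmost), and ties are broken by lower degree. It is not the procedure from [Sloan, 1986], and the names `distance_to` and `order_critical` are hypothetical.

```python
from collections import deque

def distance_to(edges, target):
    """Shortest directed distance from every gene to the target (BFS on reversed edges)."""
    reverse = {}
    for u, v in edges:
        reverse.setdefault(v, []).append(u)
    dist = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for prev in reverse.get(node, []):
            if prev not in dist:
                dist[prev] = dist[node] + 1
                queue.append(prev)
    return dist

def order_critical(edges, reporter, critical):
    """Order critical genes left to right under the assumed priority rule."""
    dist = distance_to(edges, reporter)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Genes with no path to the reporter are treated as infinitely far (placed leftmost).
    return sorted(
        critical,
        key=lambda g: (-dist.get(g, float("inf")), degree.get(g, 0), g),
    )
```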
Step 2: Edge deletion
Purpose: Eliminate detours around critical genes, i.e., paths on which a critical gene is bypassed. (Figure labels: Receptor, Reporter, vk, vi, vj)
• Find all undesirable paths between non-consecutive critical genes, i.e., paths that go through only noncritical genes.
• Weight each edge with the number of such paths it belongs to.
• Remove edges greedily, starting from the largest weight, until all such paths are disrupted.
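The three bullets above can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes the ordered critical list includes the receptor and reporter as its first and last entries, it only looks at forward bypasses (from vi to vj with j > i + 1), and it enumerates simple paths exhaustively, which is only practical for small networks. The names `bypass_paths` and `greedy_delete` are hypothetical.

```python
def bypass_paths(edges, ordered_critical):
    """All simple paths from v_i to v_j (j > i + 1) whose inner genes are all noncritical."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
    position = {gene: i for i, gene in enumerate(ordered_critical)}
    paths = []

    def extend(path):
        for nxt in adjacency.get(path[-1], []):
            if nxt in path:
                continue  # keep paths simple
            if nxt in position:
                # A critical gene ends the path; record it only if it skips a critical gene.
                if position[nxt] > position[path[0]] + 1:
                    paths.append(path + [nxt])
            else:
                extend(path + [nxt])

    for start in ordered_critical:
        extend([start])
    return paths

def greedy_delete(edges, ordered_critical):
    """Repeatedly delete the edge shared by the most remaining bypass paths."""
    remaining = [set(zip(p, p[1:])) for p in bypass_paths(edges, ordered_critical)]
    removed = []
    while remaining:
        weight = {}
        for path_edges in remaining:
            for edge in path_edges:
                weight[edge] = weight.get(edge, 0) + 1
        best = max(sorted(weight), key=weight.get)  # sorted() makes ties deterministic
        removed.append(best)
        remaining = [pe for pe in remaining if best not in pe]
    return removed
```

Because the weights are recomputed after each deletion, an edge shared by many detours is removed once instead of cutting each detour separately.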
Step 3: Edge insertion
Purpose: Make sure that critical genes are connected and noncritical genes are consistent. (Figure labels: Receptor, vi-1, vi, vi+1, Reporter)
• Insert an edge from vi-1 to vi if either of the following holds:
• There is no path from vi-1 to vi.
• There is a noncritical gene on all paths from vi-1 to vi.
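A minimal sketch of the insertion rule, under stated assumptions: the two conditions are treated as alternatives (either one triggers an insertion), "a noncritical gene on all paths" is tested by checking whether removing that single gene disconnects vi-1 from vi, and paths are allowed to pass through any gene. The names `reachable` and `insert_edges` are hypothetical.

```python
def reachable(edges, source, target, removed=()):
    """Depth-first reachability from source to target, skipping removed genes."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
    stack, seen = [source], {source}
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen and nxt not in removed:
                seen.add(nxt)
                stack.append(nxt)
    return False

def insert_edges(edges, ordered_critical, noncritical):
    """Return the edges inserted between consecutive critical genes."""
    edges = list(edges)
    inserted = []
    for prev, cur in zip(ordered_critical, ordered_critical[1:]):
        if (prev, cur) in edges:
            continue  # a direct edge already satisfies both conditions
        no_path = not reachable(edges, prev, cur)
        # A noncritical gene lies on every path if removing it alone disconnects the pair.
        bottleneck = any(
            not reachable(edges, prev, cur, removed={g}) for g in noncritical
        )
        if no_path or bottleneck:
            edges.append((prev, cur))
            inserted.append((prev, cur))
    return inserted
```

The second condition matters because a noncritical gene that sits on every path would, if knocked down, block the signal, which contradicts its noncritical label.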
Overview Reference network GR = (VR, ER) Target network Find GT = (VT, ET) Given Constraints Finding all the paths can be too time consuming for large networks 1 0 0 1 0 1 0 SiNeC (Signal Network Constructor) S-SiNeC (Scalable Signal Network Constructor)
S-SiNeC algorithm Reference network Right reachable Left reachable vs vt Critical vi Edge insertion Edge deletion
S-SiNeC: Edge insertion (A1) Purpose: Make sure that noncritical genes are consistent Reference network vs vt vi L R
S-SiNeC: Edge insertion (A2) Purpose: Make sure that critical genes are left reachable Reference network vs vt vi R L
S-SiNeC: Edge insertion (A3) Purpose: Make sure that critical genes are right reachable Reference network vs vt vi R L
S-SiNeC: Edge deletion (A4) Purpose: Make sure that no detours exist around critical genes Solve minimum cut between L & R Reference network vs vt vi R L
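The minimum cut step can be illustrated with a small unit-capacity max-flow routine (Edmonds-Karp). The slides do not define how L and R are built, so this sketch simply takes two disjoint sets of genes as input and returns a smallest set of edges whose removal disconnects every gene in L from every gene in R. The function name `min_cut_edges` and the helper source/sink nodes are hypothetical.

```python
from collections import deque

def min_cut_edges(edges, left, right):
    """Smallest set of directed edges separating `left` from `right` (unit capacities)."""
    if set(left) & set(right):
        raise ValueError("left and right must be disjoint")
    INF = float("inf")
    SRC, SNK = ("_source",), ("_sink",)  # helper nodes; tuples cannot clash with gene names
    cap, adj = {}, {}

    def add(u, v, c):
        cap[(u, v)] = cap.get((u, v), 0) + c
        cap.setdefault((v, u), 0)  # residual (reverse) edge
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    for u, v in set(edges):
        add(u, v, 1)
    for node in left:
        add(SRC, node, INF)
    for node in right:
        add(node, SNK, INF)

    def bfs():
        """Parents of all nodes reachable from SRC in the residual graph."""
        parent = {SRC: None}
        queue = deque([SRC])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    parent = bfs()
    while SNK in parent:
        # Walk back from the sink to collect the augmenting path.
        path, v = [], SNK
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[e] for e in path)
        for u, v in path:
            cap[(u, v)] -= flow
            cap[(v, u)] += flow
        parent = bfs()

    source_side = set(parent)  # nodes still reachable from SRC after max flow
    return sorted(
        (u, v) for u, v in set(edges) if u in source_side and v not in source_side
    )
```

Unlike path enumeration, this runs in polynomial time, which is consistent with the slide's motivation for a scalable variant.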
Dataset • Reference networks are obtained by random edge shuffling at 5% to 40% mutation rates. • 200 references per target network & per mutation rate.
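One way to generate such perturbed references is sketched below. The slides do not spell out the shuffling procedure, so this version assumes that a mutation rate r means replacing a fraction r of the target network's edges with the same number of random new edges that were not present before. The function name `shuffle_edges` and the `seed` parameter are hypothetical.

```python
import random

def shuffle_edges(edges, nodes, rate, seed=0):
    """Replace round(rate * |E|) edges with random edges absent from the original network."""
    rng = random.Random(seed)          # seeded for reproducible references
    edges = sorted(set(edges))
    nodes = sorted(nodes)
    n_mutate = round(rate * len(edges))
    free_slots = len(nodes) * (len(nodes) - 1) - len(edges)
    if n_mutate > free_slots:
        raise ValueError("not enough absent edges to insert")
    removed = set(rng.sample(edges, n_mutate))
    kept = [e for e in edges if e not in removed]
    present = set(edges)
    added = set()
    while len(added) < n_mutate:
        u, v = rng.sample(nodes, 2)    # two distinct genes, so no self-loops
        if (u, v) not in present:
            added.add((u, v))
    return kept + sorted(added)
```

Calling this with 200 different seeds at each rate from 0.05 to 0.40 would mirror the dataset layout described on the slide.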
Accuracy based on edge class Hot vs vt Cold
Running time results SiNeC > 1 hour per reference network.
Last Remarks • Constructing very large signaling networks from RNAi data is possible in practical running time. • Both SiNeC and S-SiNeC are robust to errors in reference network. • We recommend • S-SiNeC for very large OR dense networks. • SiNeC otherwise.
Acknowledgements CCF - 0829867 IIS - 0845439