Clemson NextNet SDN Use Cases for Life Sciences Research
Kuang-Ching “KC” Wang, Associate Professor, Clemson University
Sponsored by NSF grant OCI‐1245936
July 17, 2013
Clemson NextNet: An NSF CC-NIE Project
Objectives:
• Direct access to the Internet2 (I2) 100G Innovation Platform
• Science DMZ from anywhere, without manual plumbing
• Campus production, end-to-end support
• Flexible, optimized 10~40G access to resources on campus and at other universities
• Software defined network (SDN)
What Is the Fuss About SDN?
[Diagram: Traditional Network vs. SDN Network]
Researchers: Industry:
The traditional network is getting unmanageable (and the problem is not bandwidth)!
What Do Our (Life Sciences) Folks Need?
Two Clemson life sciences researchers in attendance today:
• Alex Feltus
  • Associate Professor in Genetics & Biochemistry
  • Faculty Consultant in Clemson University Genomics Institute
  • Research: Rapid crop design with massive gene interaction networks
• David Kwartowitz
  • Assistant Professor in Bioengineering
  • Research: Rapid processing of stereo laparoscopic data for real-time pre- and intra-surgery support
[Diagram labels: Data Store N, Palmetto HPC Cluster, …, Real-time medical imaging]
The Feltus Lab Builds Massive Gene Interaction Networks Using RNA Expression Profiles From Next-Generation Sequencing (NGS) and Microarray Experiments.
[Figure label: Rice (Oryza sativa)]
Goal: Rapidly design new crop varieties for a specific environment, including “old” environments with a changed climate…
Personalized Agriculture
Slide prepared by Alex Feltus
Massive Amounts of DNA/RNA/Genetic Data in Databases
1.64 quadrillion base pairs in 5 years!
Source: http://www.ncbi.nlm.nih.gov/Traces/sra/
Slide prepared by Alex Feltus
NGS Biomarker Example Datasets A

RAW DATA (uncompressed)
5.7G  Sample_Feltus1_L006_R1.cat.fastq
5.7G  Sample_Feltus1_L006_R2.cat.fastq
5.8G  Sample_Feltus1_L007_R1.cat.fastq
5.8G  Sample_Feltus1_L007_R2.cat.fastq
6.7G  Sample_Feltus2_L006_R1.cat.fastq
6.7G  Sample_Feltus2_L006_R2.cat.fastq
6.8G  Sample_Feltus2_L007_R1.cat.fastq
6.8G  Sample_Feltus2_L007_R2.cat.fastq
6.5G  Sample_Feltus3_L006_R1.cat.fastq
6.5G  Sample_Feltus3_L006_R2.cat.fastq
6.6G  Sample_Feltus3_L007_R1.cat.fastq
6.6G  Sample_Feltus3_L007_R2.cat.fastq
7.3G  Sample_Feltus4_L006_R1.cat.fastq
7.3G  Sample_Feltus4_L006_R2.cat.fastq
7.4G  Sample_Feltus4_L007_R1.cat.fastq
7.4G  Sample_Feltus4_L007_R2.cat.fastq
5.6G  Sample_Feltus5_L006_R1.cat.fastq
5.6G  Sample_Feltus5_L006_R2.cat.fastq
5.7G  Sample_Feltus5_L007_R1.cat.fastq
5.7G  Sample_Feltus5_L007_R2.cat.fastq
8.8G  Sample_Feltus6_L006_R1.cat.fastq
8.8G  Sample_Feltus6_L006_R2.cat.fastq
8.9G  Sample_Feltus6_L007_R1.cat.fastq
8.9G  Sample_Feltus6_L007_R2.cat.fastq

PROCESSED DATA (compressed)
2.4G  Sample_Feltus1_L007_R1.MERGED.BAM
2.4G  Sample_Feltus1_L007_R1.MERGED.BAM
2.7G  Sample_Feltus2_L006_R1.MERGED.BAM
2.7G  Sample_Feltus2_L007_R1.MERGED.BAM
2.6G  Sample_Feltus3_L006_R1.MERGED.BAM
2.6G  Sample_Feltus3_L007_R1.MERGED.BAM
3.0G  Sample_Feltus4_L006_R1.MERGED.BAM
3.0G  Sample_Feltus4_L007_R1.MERGED.BAM
2.2G  Sample_Feltus5_L006_R1.MERGED.BAM
2.2G  Sample_Feltus5_L006_R1.MERGED.BAM
2.9G  Sample_Feltus6_L006_R1.MERGED.BAM
2.9G  Sample_Feltus6_L007_R1.MERGED.BAM

6 RNA samples in duplicate
163.6 GB (raw) + 31.8 GB (processed) = 195.4 GB of critical data files (<6 hours to process on cluster)
Does not include:
• Intermediate processing files
• Reference genome (0.72 GB)
Slide prepared by Alex Feltus
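As a rough cross-check of the totals on this slide, the listed file sizes can be summed directly. The sketch below is an illustration only: the dictionary and function names are hypothetical, and the sizes are the one-decimal values shown in the listing. Because those values are rounded, the sum need not match the slide exactly. The raw files add up to the quoted 163.6 GB, while the listed processed files add up to about 31.6 GB, slightly under the quoted 31.8 GB, presumably because of rounding in the listing.

```python
# Sizes in GB as shown in the slide listing (rounded to one decimal there).
# Raw: four FASTQ files per sample (two lanes x two reads).
RAW_GB = {
    "Feltus1": [5.7, 5.7, 5.8, 5.8],
    "Feltus2": [6.7, 6.7, 6.8, 6.8],
    "Feltus3": [6.5, 6.5, 6.6, 6.6],
    "Feltus4": [7.3, 7.3, 7.4, 7.4],
    "Feltus5": [5.6, 5.6, 5.7, 5.7],
    "Feltus6": [8.8, 8.8, 8.9, 8.9],
}
# Processed: two merged BAM files per sample.
PROCESSED_GB = {
    "Feltus1": [2.4, 2.4],
    "Feltus2": [2.7, 2.7],
    "Feltus3": [2.6, 2.6],
    "Feltus4": [3.0, 3.0],
    "Feltus5": [2.2, 2.2],
    "Feltus6": [2.9, 2.9],
}


def total_gb(sizes_by_sample):
    """Sum all file sizes across samples, rounded to one decimal place."""
    return round(sum(sum(sizes) for sizes in sizes_by_sample.values()), 1)


raw_total = total_gb(RAW_GB)
processed_total = total_gb(PROCESSED_GB)
print(f"raw: {raw_total} GB, processed: {processed_total} GB")
```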
The CUTTERS (Kwartowitz) lab is working to enable remote processing of stereo laparoscopic data for real-time feedback with surgical robot systems at partner sites (Vanderbilt, Mayo Clinic).
[Map labels: Mayo Clinic, MN; Vanderbilt, TN; Clemson, SC (Palmetto HPC Cluster)]
How Does It Work Today
[Diagram labels: R&E net 1, R&E net, ISP 1, ISP 2, Internet, …, Research Network, Campus Network, Data Center]
• Down the road:
  • Compliance
  • User-specific privileges
  • Access control
What Are We Building NOW
Porting GENI Research Prototype to Production
SOS: Seamless Large Data Transport
• Steroid OpenFlow Service (SOS)
  • by Aaron Rosen and KC Wang
  • Seamless TCP throughput upgrade, e.g., from 2.5 Mbps to 120 Mbps
  • Multipath support
  • Automatic site agent detection
• Upcoming demos of SOS:
  • NSF 12th GENI conference, Kansas City, MO
  • Supercomputing 2011, Seattle, WA
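To put the quoted throughput figures in perspective, here is a back-of-the-envelope sketch of how long the 195.4 GB example dataset from the Feltus slide would take to move at each rate. It assumes 1 GB = 10^9 bytes, a sustained rate equal to the quoted figure, and no protocol overhead; none of these assumptions come from the slides, and the function name is hypothetical.

```python
def transfer_hours(size_gb: float, rate_mbps: float) -> float:
    """Idealized transfer time in hours for size_gb gigabytes at rate_mbps megabits/s.

    Assumes decimal units (1 GB = 1e9 bytes, 1 Mbps = 1e6 bits/s) and a
    perfectly sustained rate with no overhead.
    """
    bits = size_gb * 1e9 * 8
    seconds = bits / (rate_mbps * 1e6)
    return seconds / 3600


DATASET_GB = 195.4  # total of critical data files quoted on the Feltus slide

for rate_mbps in (2.5, 120.0):
    hours = transfer_hours(DATASET_GB, rate_mbps)
    print(f"{rate_mbps:6.1f} Mbps: {hours:7.1f} hours ({hours / 24:.2f} days)")
```

Under these assumptions the transfer drops from roughly a week at 2.5 Mbps to under four hours at 120 Mbps, which is less than the "<6 hours" quoted for processing the data on the cluster.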
Condo of Condos: Connecting Campus HPC with SDN
Significance of IT Support Team to Bootstrap Researcher Use of HPC and SDN
[Chart: New Palmetto Cluster Users; axis: Number of Users; annotation: May 2010, Galen joins CITI and begins recruiting & training users]
And to Create a Transformative University
• A unique coalition among academy, IT, and industrial partners within and beyond Clemson
• Synergy with other university research centers: Cyberinstitute, ICAR, and Watts Innovation Center
Synergy with Cross-Community Momentum
[Diagram labels: Research Communities, Companies, Universities, Open Source Communities, IT Communities, . . .]
FURTHER QUESTIONS: KWANG@CLEMSON.EDU