High Performance Computing & Bioinformatics Performance
Nick Lindberg
MCW Bioinformatics User Group
“If you were plowing a field, which would you rather use: two strong oxen or 1024 chickens?”
- Seymour Cray
HPC: Computing Platforms
• Supercomputer (Oxen)
  • Fast, expensive, custom CPU/memory architecture and high-speed interconnect that facilitate parallelizing a single computation
  • Appears as a single, large computer
  • IBM “Blue Gene”
  • NCSA “Blue Waters”
• HPC Cluster (Chickens)
  • Many cheaper, commodity servers/CPUs with a loosely coupled, slower (but still fast) interconnect (Ethernet/InfiniBand)
  • Excels at ‘embarrassingly parallel’ batch computing as well as MPI-enabled parallel computing
  • Marquette’s “Pere”
  • MKEI’s “hpc01”
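To make ‘embarrassingly parallel’ concrete, here is a minimal sketch, assuming a toy task (GC content of a DNA sequence) and made-up input sequences that are not from the slides. Each task needs nothing from the others, so the work can be split across workers with no communication. Local threads stand in for cluster nodes here; on a real cluster each task would more likely be its own batch job.

```python
from concurrent.futures import ThreadPoolExecutor


def gc_content(seq):
    """Fraction of G and C bases in one sequence (one independent task)."""
    return sum(base in "GC" for base in seq) / len(seq)


# Hypothetical inputs, chosen only for illustration.
sequences = ["GGCC", "ATAT", "ATGC"]

# No task depends on another, so the order of execution does not matter;
# pool.map still returns results in input order.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(gc_content, sequences))

print(results)
```

Because the tasks share no state, adding more workers (or more nodes) scales the throughput without changing the code for a single task.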
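Matrix Multiplication
C[0,0] = A[0,0]*B[0,0] + A[0,1]*B[1,0] + A[0,2]*B[2,0]
[Figure: row 0 of A (A[0,0], A[0,1], A[0,2]) combined with column 0 of B (B[0,0], B[1,0], B[2,0]) to give C[0,0]]
// C[i, j] starts at 0; the sum runs over the shared index k
for (k = 0; k < 3; k++) {
    C[i, j] += A[i, k] * B[k, j];
}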
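The formula above can be sketched as a runnable triple loop. This is a plain-Python illustration, not an optimized routine; the function name `matmul` and the sample matrices are assumptions made for the example. Note that every entry C[i][j] is computed from one row of A and one column of B only, which is why the work can be divided among many processors.

```python
def matmul(A, B):
    """Multiply matrix A (n x m) by matrix B (m x p) using nested lists."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    for i in range(n):          # row of A
        for j in range(p):      # column of B
            for k in range(m):  # shared index being summed over
                C[i][j] += A[i][k] * B[k][j]
    return C


# Illustrative 2 x 2 inputs (not from the slides).
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = matmul(A, B)
print(C)
```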
HPC Cluster Job/Software Stack
• Internet (Remote Users)
• Head/Login Node [Job Submission]
• Scheduler (Torque/Moab)
• Software: R, Bowtie2, MATLAB
• 1 GigE/InfiniBand Internode Communication
• Compute Nodes
  • /scratch
• Shared Storage: Home Directories, Software, Scratch
  • /usr/home
  • /scratch
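As a rough illustration of the job-submission step, the sketch below builds the text of a Torque batch script that runs Bowtie2 on one compute node. The helper `torque_script`, the job name, and all file paths are hypothetical; the `#PBS` directives and Bowtie2 options shown are standard ones, but the resource limits and queue setup a given cluster expects may differ.

```python
def torque_script(job_name, index, reads, out_sam, cores=4, walltime="01:00:00"):
    """Return the text of a Torque (PBS) job script that runs Bowtie2."""
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",            # job name shown by the scheduler
        f"#PBS -l nodes=1:ppn={cores}",   # one node, `cores` processors
        f"#PBS -l walltime={walltime}",   # maximum run time
        "cd $PBS_O_WORKDIR",              # directory the job was submitted from
        f"bowtie2 -p {cores} -x {index} -U {reads} -S {out_sam}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical job name and paths, for illustration only.
script = torque_script("align01", "/scratch/index/genome", "reads.fq", "aligned.sam")
print(script)
```

The generated text would typically be saved to a file on the head/login node and handed to the scheduler with `qsub`.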
Data to Information: Where HPC Fits In
• Stages: Raw Data → Target Data → Processed Data → Transformed Information → Patterns → Knowledge
• Phases: Pattern Recognition, Data Processing, Interpretation
• Steps:
  • Sampling on customer devices, servers, end points
  • Feature extraction and content resolution, database construction/storing
  • Dimension reduction, matrix/vector building, pre-process formatting
  • Association classifications, algorithm application, simulation/modeling
  • Visualization and validation, model reconstruction, feedback