
RIKEN Genomic Science Center Fumikazu KONISHI

Empower biologists with an instant bioinformatics research workbench using Knoppix technology for high-throughput computing, enabling easy setup and collaborative projects without local system impact. Download free Knoppix image with InterProScan and Condor components included for efficient computation. Set up a cluster system for gene functional domain search and database processing, testing configurations using a Parallel File System for seamless operation.


Presentation Transcript


  1. Improving the Research Bootstrap of Condor High Throughput Computing for Non-Cluster Experts Based on Knoppix Instant Computing Technology RIKEN Genomic Science Center Fumikazu KONISHI

  2. Background • Biologists need a high-performance computing system for their research. However, they do not know how to build a cluster system by themselves. Condor Week 2006

  3. Meet Chie-san. She is a biologist with a big problem. (Slide borrowed from Condor.)

  4. Chie-san’s Application • Run a sequence sweep of InterProScan over mouse cDNAs, a total of 103,000 clones. • InterProScan takes on average 1 minute per sequence on a “typical” workstation (total = 103,000 × 1 min = 103,000 minutes ≈ 1,717 hours). • InterProScan requires a 6 GB public database set for each. http://www.ebi.ac.uk/interpro/README1.html
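
The arithmetic on slide 4 can be reproduced with a short sketch. The sequence count and per-sequence runtime come from the slide; the worker counts and the assumption of ideal, even scaling across machines are illustrative only and are not figures from the talk.

```python
# Back-of-the-envelope estimate for the InterProScan sweep on slide 4.
SEQUENCES = 103_000          # mouse cDNA clones (from the slide)
MINUTES_PER_SEQUENCE = 1.0   # average runtime on a "typical" workstation (from the slide)


def sweep_hours(workers: int) -> float:
    """Wall-clock hours if the jobs split evenly over `workers` machines.

    Assumes ideal scaling: identical machines and no scheduling overhead.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return SEQUENCES * MINUTES_PER_SEQUENCE / workers / 60


if __name__ == "__main__":
    # Worker counts below are hypothetical, chosen only to show the trend.
    for n in (1, 10, 50):
        hours = sweep_hours(n)
        print(f"{n:>3} worker(s): {hours:8.1f} h  ({hours / 24:5.1f} days)")
```

With a single workstation the sweep takes roughly 1,717 hours, which is the motivation for spreading the jobs over borrowed lab machines.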

  5. Technical Skill Barrier • Policy Barrier • “I have 103,000 sequences to search for gene functional domains, and I am not a cluster expert. Who will help me?”

  6. Getting Knoppix for InterProScan High Throughput Computing Edition • Available as a free download: search Google for “fumikazu” and download the image file. The image includes: • InterProScan 4.1 • Condor 6.6.10 • PVFS2 1.2 • Ganglia 3.0.1

  7. Chie-san can boot an Instant High Throughput Computing image, with her application included, on the lab’s machines… She can borrow the lab’s computers on the weekend without installing any software.

  8. Goal • The goal of this research is to provide an instant high-performance bioinformatics research workbench for all biology researchers, one that is easy to set up in collaborative projects and has no side effects on the local system.

  9. Instant Setup Technologies • Install-based deployment systems: RPM-based automatic configuration technology (Red Hat); NPACI Rocks toolkit (UCSD) • Image-based deployment systems: Live-CD technology (Knoppix)

  10. Key Solutions • Knoppix: a GNU/Linux distribution that brings up a machine without hard disk installation. • Parallel File System: PVFS is intended as a high-performance parallel file system for cluster computing. It provides high-bandwidth access and a large storage area.

  11. Parallel File System on RAM Disk

  12. Knoppix for InterProScan4.1 High Throughput Computing Edition

  13. System diagram (labels): Head Node • Worker Node • PXE Boot • Database download server

  14. Step 1: Booting the image • Boot the head node. The IP address leased by the DHCP server is displayed after the boot sequence.

  15. Step 2: After a successful boot, two setup options, EASY and ADVANCED, are displayed on the screen.

  16. Step 3: Boot the worker nodes • All nodes must support PXE boot. The system must automatically assess whether sufficient resources are available for the database arrangement of InterProScan 4.1.
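
The slides do not say how the resource assessment in Step 3 is implemented. As a minimal sketch, assuming the 6 GB database set from slide 4 is spread over RAM-disk space pooled from the nodes (as the "Parallel File System on RAM Disk" slide suggests), the check could amount to comparing pooled capacity against the database size. The function name and the per-node figures below are hypothetical.

```python
# Hypothetical capacity check; not the logic shipped in the Knoppix image.
DATABASE_GB = 6.0  # size of the InterProScan database set (slide 4)


def can_hold_database(node_ram_disk_gb, database_gb=DATABASE_GB):
    """Return (ok, total_gb) for a list of per-node RAM-disk contributions.

    `ok` is True when the pooled space is at least `database_gb`.
    Assumes the parallel file system can use all contributed space,
    ignoring metadata overhead.
    """
    total = float(sum(node_ram_disk_gb))
    return total >= database_gb, total


if __name__ == "__main__":
    # Illustrative clusters: each node contributes 1 GB of RAM disk.
    for nodes in (4, 8):
        ok, total = can_hold_database([1.0] * nodes)
        status = "sufficient" if ok else "insufficient"
        print(f"{nodes} nodes: {total:.1f} GB pooled, {status}")
```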

  17. Step 4: Building the cluster system

  19. Download the InterProScan database set

  20. Testing • The system submits a single test job, which completes in a few minutes. The Condor job status is displayed in the browser, and Ganglia provides detailed information on all nodes. All configurations can be tested in this phase.

  21. Results

  23. Web site http://big.gsc.riken.jp/index_html/Members/fumikazu/htc

  24. Questions
