Auger & XtremWeb: Monte Carlo computation on a Global Computing platform
O. Lodygensky, G. Fedak, V. Neri, Cordier, F. Cappello
Laboratoire de l’Accélérateur Linéaire; Laboratoire de Recherche en Informatique; CNRS, Université Paris-Sud, France
Outline
• Introduction
• XtremWeb
• Auger distributed computing
• Conclusion
Different GRID node characteristics
Two types of distributed systems:
• "GRID": traditional computing centers, clusters
  • < 100 nodes
  • Stable
  • Individually identified
  • Trusted
• "Desktop GRID" / "Internet Computing": global computing systems and peer-to-peer (P2P) systems on Windows, Linux, Mac OS
  • ~100 000 nodes
  • Volatile
  • No individual identification
  • Not trusted
CHEP2003 - O.Lodygensky
Desktop GRID
One server centralizes scheduling on volunteer PCs.
• Dedicated applications: SETI@home, distributed.net, Decrypthon
• Production projects: Folding@home, Genome@home, Folderol
• Open source / research projects: XtremWeb, BOINC
• Commercial platforms: Entropia, Datasynapse, United Devices, Grid systems
[Diagram: the client application sets parameters and gets results through the server; the server sends parameters over the Internet to the volunteer PCs; each volunteer PC loads and executes a task.]
Desktop Grid characteristics
• Scalability: up to 100 k, 1 M hosts
• Heterogeneity: different hardware and OSes
• Volatility: unpredictable participant behaviour; SETI@home, Napster, Kazaa, etc. work well despite volatility
• Sustainability: developments and upgrades must be easy
• Performance: SETI@home ~30 Tflops; Kazaa (1 M users at 100 kb/s to 1 Mb/s each: 100 Gb/s to 1 Tb/s in aggregate?)
• Security:
  • Integrity of volunteer PCs and servers
  • Prevention of application and result corruption
  • Authentication
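The Kazaa estimate on this slide is a plain multiplication of user count by per-user bandwidth. A minimal sketch of that arithmetic, assuming decimal units (1 kb/s = 10^3 bit/s) and that every user is active at the same time; the function name is illustrative and not part of any platform:

```python
def aggregate_bandwidth_bps(users: int, per_user_bps: float) -> float:
    """Total bandwidth if every user contributes per_user_bps at once."""
    return users * per_user_bps


USERS = 1_000_000  # 1 M users, as on the slide

low = aggregate_bandwidth_bps(USERS, 100e3)  # 100 kb/s per user
high = aggregate_bandwidth_bps(USERS, 1e6)   # 1 Mb/s per user

print(f"{low / 1e9:.0f} Gb/s to {high / 1e12:.0f} Tb/s")
```

Both bounds are upper limits: real aggregate bandwidth is lower whenever some participants are offline, which is exactly the volatility the slide mentions.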
XW: Architecture
• Centralized
• Global Computing (Peer to Peer)
• 3 entities: client / coordinator / worker
[Diagram labels: Hierarchical Coordinator; P2P Coordinator; Global Computing coordinator (client); PC Client/Worker; PC Worker; Internet / LAN]
XW: Technology
[Diagram labels: database, SQL; server, Java, JDBC; communication protocol, XML-RPC, SSL; HTTP server, PHP3-4; worker, client, Java; installation, GNU autotools]
Prerequisites for installation: a database (MySQL) and Java > JDK 1.2.
XW: Security
[Diagram labels: Coordinator; ssh; Worker; Sandbox (SBLSM); Client; loaded application]
XW: fault tolerance model
• Every entity is volatile by nature
• Connectionless protocols => all entities are stand-alone
[Diagram: message flows between Client, Client2, Worker1, Worker2 and the coordinators: submit task, get work, put result, retrieve result, each also shown with a "Sync/" prefix.]
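The four messages on this slide (submit task, get work, put result, retrieve result) can be illustrated with a toy in-memory coordinator in which every call is an independent request and no session is kept. This is a hedged sketch, not XtremWeb's implementation: the class, its method names, and in particular the lease timeout used to re-issue tasks from vanished workers are assumptions made for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Coordinator:
    """Toy coordinator (hypothetical); each method is one stateless request."""
    lease_seconds: float = 60.0  # assumed timeout before a task is re-issued
    pending: Dict[int, str] = field(default_factory=dict)   # task_id -> params
    running: Dict[int, Tuple[str, float]] = field(default_factory=dict)  # task_id -> (params, deadline)
    results: Dict[int, str] = field(default_factory=dict)   # task_id -> result
    next_id: int = 0

    def submit_task(self, params: str) -> int:
        """Client side: queue a task and return its identifier."""
        task_id = self.next_id
        self.next_id += 1
        self.pending[task_id] = params
        return task_id

    def get_work(self, now: Optional[float] = None) -> Optional[Tuple[int, str]]:
        """Worker side: pull the oldest available task, or None if idle."""
        now = time.monotonic() if now is None else now
        # Requeue tasks whose lease expired: the worker is presumed gone.
        for task_id, (params, deadline) in list(self.running.items()):
            if deadline <= now:
                del self.running[task_id]
                self.pending[task_id] = params
        if not self.pending:
            return None
        task_id = min(self.pending)
        params = self.pending.pop(task_id)
        self.running[task_id] = (params, now + self.lease_seconds)
        return task_id, params

    def put_result(self, task_id: int, result: str) -> None:
        """Worker side: store a result; later duplicates are ignored."""
        self.running.pop(task_id, None)
        self.pending.pop(task_id, None)
        self.results.setdefault(task_id, result)

    def retrieve_result(self, task_id: int) -> Optional[str]:
        """Client side: poll for a result; None means not ready yet."""
        return self.results.get(task_id)


coord = Coordinator(lease_seconds=10.0)
tid = coord.submit_task("shower-params")
first = coord.get_work(now=0.0)    # worker 1 takes the task, then disappears
idle = coord.get_work(now=5.0)     # lease still valid: nothing to hand out
second = coord.get_work(now=11.0)  # lease expired: worker 2 gets the same task
coord.put_result(tid, "done")
print(first, idle, second, coord.retrieve_result(tid))
```

Because workers pull work and clients poll for results, any party can disappear and come back later without the others holding a broken connection, which is the property the slide attributes to connectionless protocols.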
Pierre Auger Observatory
Understanding the origin of very high energy cosmic rays:
• Aires: Air Showers Extended Simulation
• Sequential, Monte Carlo. Time for a run: 5 to 10 hours
• Estimated number of PCs: ~5000
[Diagram labels: air shower parameter database (Lyon, France); traditional supercomputing centers: CINES (Fr), Fermi Lab (USA); XtremWeb server; PC client; PC workers running Aires; Internet and LAN]
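The two figures on this slide (5 to 10 hours per run, about 5000 PCs) give a rough bound on daily throughput. A back-of-the-envelope sketch, assuming each PC executes one sequential run at a time; the `availability` parameter (fraction of the day a PC is actually usable) is a hypothetical knob that does not come from the talk:

```python
def runs_per_day(pcs: int, hours_per_run: float, availability: float = 1.0) -> float:
    """Upper bound on completed runs per day, one run at a time per PC."""
    return pcs * availability * 24.0 / hours_per_run


PCS = 5000  # estimated number of PCs from the slide

slow = runs_per_day(PCS, 10.0)  # every run takes 10 hours
fast = runs_per_day(PCS, 5.0)   # every run takes 5 hours
half = runs_per_day(PCS, 10.0, availability=0.5)  # PCs usable half the time

print(slow, fast, half)
```

The estimate ignores scheduling overhead, data transfer, and runs lost when a volunteer PC leaves mid-computation, so actual throughput would be lower.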
Auger-XW (AIRES): High Energy Physics
• Application: AIRES
• Deployment:
  • Coordinator at LRI
  • Madison: 700 workers, Pentium III, Linux (500 MHz + 933 MHz), Condor pool
  • Grenoble Icluster: 146 workers (733 MHz), PBS
  • LRI: 100 workers, Pentium III, Athlon, Linux (500 MHz, 733 MHz, 1.5 GHz), Condor pool
[Diagram labels: Icluster Grenoble (PBS); Madison, Wisconsin (Condor); U-psud network; Internet; LRI Condor pool; other labs; lri.fr; XW Coordinator; XW Client]
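The deployment listed above can be tallied per site and per batch system. A small sketch using only the worker counts stated on the slide; the tuple layout and function name are illustrative:

```python
from collections import defaultdict

# (site, batch system, worker count) as listed on the deployment slide
POOLS = [
    ("Madison", "Condor", 700),
    ("Grenoble Icluster", "PBS", 146),
    ("LRI", "Condor", 100),
]


def workers_by_batch_system(pools):
    """Sum worker counts per local batch system."""
    totals = defaultdict(int)
    for _site, batch_system, workers in pools:
        totals[batch_system] += workers
    return dict(totals)


total_workers = sum(workers for _site, _system, workers in POOLS)
by_system = workers_by_batch_system(POOLS)

print(total_workers, by_system)
```

Most of the workers come from existing Condor pools, with XtremWeb layered on top as the common infrastructure across sites.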
Conclusion
XtremWeb: a "desktop Grid" platform
• Fault tolerance: "connectionless" + "restartable"
• Security: certificates + crypto + sandbox + …
What we have learned so far with XtremWeb:
• Deployment is critical
• Once users understand the potential computational power, they rapidly ask for more resources!
XtremWeb Auger:
• International Desktop GRID
• Condor pools with XW as global infrastructure
• Good performance (ratio 1:60 with several small hosts than the reference)
=> Scheduling is a weakness of XtremWeb <=
=> Strong need for results-browsing tools <=
Software
• XtremWeb: www.XtremWeb.net
• Since 2001
• Current version: 1.2.rc0