Lessons Learned from SETI@home
David P. Anderson, Director, SETI@home
Space Sciences Laboratory, U.C. Berkeley
April 2, 2002
SETI@home Operations (system diagram): data recorder, DLT tapes, splitters, WU storage, data server, screensavers; result queue, redundancy checking, RFI elimination, repeat detection, garbage collector; science DB, master DB, user DB, acct. queue; CGI program, web page generator, web site; tape backup, tape archive/delete
History and statistics • Conceived 1995, launched April 1999 • Funding: The Planetary Society, UC's Digital Media Innovation program (DiMI), numerous companies • 3.6M users (0.5M active), 226 countries • 40 TB of data recorded and processed • 25 TeraFLOPS average over the last year • Almost 1 million years of CPU time • No ET signals yet, but other results
Public-resource computing • Original: GIMPS, distributed.net • Commercial: United Devices, Entropia, Porivo, Popular Power • Academic, open-source • Cosm, folding@home, SETI@home II • The peer-to-peer paradigm
Characterizing SETI@home • Fixed-rate data processing task • Low bandwidth/computation ratio • Independent parallelism • Error tolerance
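To make the bandwidth/computation ratio concrete, a rough calculation follows; the ~350 KB work-unit size and ~10 hours of CPU time per work unit are assumed, order-of-magnitude figures, not official project numbers.

```python
# Rough illustration of the low bandwidth/computation ratio. The work-unit
# size and CPU time below are assumed, order-of-magnitude figures only.

wu_bytes = 350 * 1024            # ~350 KB of data per work unit (assumed)
cpu_seconds_per_wu = 10 * 3600   # ~10 hours of client CPU time (assumed)

print(f"{wu_bytes / cpu_seconds_per_wu:.1f} bytes moved per CPU-second")  # ~10
```

A few bytes per CPU-second is what makes the independent, error-tolerant parallelism above feasible over ordinary Internet links.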
Millions and millions of computers • Server scalability • Dealing with excess CPU time • Redundant computing • Deals with cheating, malfunctions • Control by changing computation • Moore’s Law is true (causes same problems)
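A minimal sketch of the redundancy-checking idea above (not the project's actual server code): each work unit goes out to several clients, and a result is accepted only once a quorum of the returned results agrees.

```python
# Minimal sketch of redundant computing: send each work unit to several
# clients and accept a canonical result only once enough of them agree.
# Replication factor, quorum size, and the result summaries are illustrative.
from collections import Counter

REPLICATION = 3   # copies of each work unit handed out (assumed)
QUORUM = 2        # matching results required for acceptance (assumed)

def validate(returned_results):
    """returned_results: hashable summaries of the results clients sent back."""
    if not returned_results:
        return None
    value, count = Counter(returned_results).most_common(1)[0]
    if count >= QUORUM:
        return value      # canonical result; credit the clients that agreed
    return None           # no quorum yet: wait for more results or reissue
```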
Network bandwidth costs money • SSL (Space Sciences Lab) to campus: 100 Mbps, free, unloaded • Campus to ISP: 70 Mbps, not free • First: load limiting at 25 Mbps • Now: no limit, zero priority • How to adapt load to capacity? • What's the break-even point? (~1 GB per CPU-day)
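A back-of-the-envelope check of that break-even figure, assuming only the 25 Mbps cap quoted on the slide; the rest is arithmetic.

```python
# Back-of-the-envelope check: what a 25 Mbps cap delivers per day, set
# against the ~1 GB per CPU-day break-even figure quoted on the slide.

link_mbps = 25
gb_per_day = link_mbps / 8 * 86400 / 1000   # Mbit/s -> MB/s -> MB/day -> GB/day
print(f"{gb_per_day:.0f} GB/day")           # ~270 GB/day

# At the ~1 GB per CPU-day break-even ratio, that link feeds only ~270
# CPU-days of work per day; SETI@home's far lower data-per-CPU-day ratio
# is what lets one modest link serve hundreds of thousands of clients.
```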
How to get and retain users • Graphics are important • But monitors do burn in • Teams: users recruit other users • Keep users informed • Science news • System management news • Periodic project emails
Reward users • PDF certificates • Milestone pages and emails • Leader boards (overall, country, …) • Class pages • Personal signal page
Let users express themselves • User profiles • Message boards • Newsgroup (sci.astro.seti) • Learn about users • Online poll
Users are competitive • Patched clients, benchmark wars • Results with no computation • Intentionally bad results • Team recruitment by spam • Sale of accounts on eBay • Accounting is tricky
Anything can be reverse engineered • Patched version of client • efforts at self-checksumming • Replacement of FFT routine • Bad results • Digital signing: doesn’t work • Techniques for verifying work
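Since signing the client binary can't be trusted, verification has to look at the returned work itself. One possible approach, sketched here with assumed field names and tolerances (this is not SETI@home's actual validator), is to compare redundant results numerically with a tolerance, because the same work unit computed on different platforms rarely matches bit-for-bit.

```python
# Sketch of verifying work without trusting the client binary: compare
# redundant results numerically with a tolerance. The "freq" and "power"
# fields and the 0.1% tolerance are illustrative assumptions.

TOL = 1e-3

def signals_match(a, b):
    return (abs(a["freq"] - b["freq"]) <= TOL * abs(b["freq"]) and
            abs(a["power"] - b["power"]) <= TOL * abs(b["power"]))

def results_equivalent(res_a, res_b):
    if len(res_a) != len(res_b):
        return False
    key = lambda s: s["freq"]
    return all(signals_match(a, b)
               for a, b in zip(sorted(res_a, key=key), sorted(res_b, key=key)))
```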
Users will help if you let them • Web-site translations • Add-ons • Server proxies • Statistics DB and display • Beta testers • Porting • Open-source development • (will use in SETI@home II)
Client: mechanism, not policy • Error handling, versioning • Load regulation • Let server decide • Reasonable default if no server • Put in a level of indirection • Separate control and data
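A sketch of what "mechanism, not policy" can look like in code, assuming a hypothetical fetch_server_config() call: the client's behavior is parameterized, the server supplies the parameters (control separate from data), and a conservative default applies when no server is reachable.

```python
# Sketch of "mechanism in the client, policy on the server": the client asks
# the server how to behave and falls back to conservative defaults if no
# server answers. Parameter names and fetch_server_config() are hypothetical.

DEFAULTS = {"work_buffer_days": 1.0, "retry_seconds": 3600}

def get_policy(fetch_server_config):
    policy = dict(DEFAULTS)                   # reasonable default if no server
    try:
        policy.update(fetch_server_config())  # server-supplied control data
    except OSError:
        pass
    return policy
```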
Cross-platform is manageable • Windows, Mac are harder • GNU tools and POSIX rule
Server reliability/performance • Hardware • Air conditioning, RAID controller • Software • Database server • Architect for failure • Develop diagnostic tools
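Architecting for failure implies watching the pipeline continuously; here is a minimal sketch of the kind of diagnostic tool this implies, with assumed probe functions and thresholds: check that the database answers and that the result backlog isn't growing without bound.

```python
# Minimal sketch of a diagnostic tool: probe the pieces that fail in
# practice (database, result backlog). db_ping, queue_depth, and the
# threshold are assumed stand-ins, not real SETI@home interfaces.

MAX_RESULT_BACKLOG = 100_000

def health_report(db_ping, queue_depth):
    report = {"database": "ok" if db_ping() else "DOWN"}
    backlog = queue_depth("result")
    report["result queue"] = f"{backlog} pending" + \
        ("" if backlog < MAX_RESULT_BACKLOG else " (BACKLOG)")
    return report
```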
What’s next for public computing? • Better handling of large data • Network scheduling • Reliable multicast • Expand computation model • Multi-application, multi-project platform • BOINC (Berkeley Open Infrastructure for Network Computing)