Lessons learned from SETI@home

Lessons learned from SETI@home • David P. Anderson • January 31, 2002

[Diagram: SETI@home Operations. Component labels: data recorder, DLT tapes, splitters, WU storage, data server, screensavers, result queue, acct. queue, CGI program, redundancy checking, repeat detection, RFI elimination, science DB, user DB, master DB, garbage collector, tape archive/delete, tape backup, web page generator, web site]

Radio SETI projects

History and statistics • Conceived 1995, launched April 1999 • Funding: TPS, DiMI, numerous companies • 3.5M users (0.5M active), 226 countries • 40 TB of data recorded and processed • 25 TeraFLOPS average over the last year • No ET signals yet, but other results

Public-resource computing • Original: GIMPS, distributed.net • Commercial: United Devices, Entropia, Porivo, jxtp, Popular Power • Academic, open-source • Cosm, folding@home, SETI@home II • The peer-to-peer paradigm

Characterizing SETI@home • Fixed-rate data processing task • Low bandwidth/computation ratio • Independent parallelism • Error tolerance

Be prepared for crowds • Server scalability • Dealing with excess CPU time • Redundant computing • Deals with cheating, malfunctions • Control by changing computation • Moore’s Law is true (causes same problems)
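A minimal sketch of the redundant-computing idea on this slide: the same work unit goes to several clients, and a result is accepted only when enough of them agree. The function name `canonical_result`, the `quorum` parameter, and the use of plain strings as result summaries are all assumptions for illustration; the slides do not describe SETI@home's actual validation logic.

```python
from collections import Counter


def canonical_result(results, quorum=2):
    """Return the result reported by at least `quorum` clients, else None.

    `results` holds one hashable result summary per client that processed
    the same work unit (a hypothetical representation).
    """
    if not results:
        return None
    # most_common(1) yields the single most frequent value and its count.
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None


# Three clients processed one work unit; one returned a bad result.
print(canonical_result(["peak-A", "peak-A", "garbage"]))
# Two clients disagree: no quorum, so the work unit would be sent out again.
print(canonical_result(["peak-A", "peak-B"]))
```

Exact equality is the simplest comparison. Results computed in floating point on different platforms would likely need a tolerance-based comparison instead, which this sketch leaves out.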

Network bandwidth costs money • SSL to campus: 100 Mbps, free, unloaded • Campus to ISP: 70 Mbps, not free • First: load limiting at 25 Mbps • Now: no limit, zero priority • How to adapt load to capacity? • What’s the break-even point? (1 GB per CPU day)
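To put the link rates above in the same units as the break-even figure, the arithmetic can be sketched as follows. The helper names `gb_per_day` and `cpu_days_supported` are hypothetical, gigabytes are taken as decimal (10^9 bytes), and reading "1 GB per CPU day" as the data volume at which bandwidth cost matches the value of the donated CPU time is an assumption about what the slide means.

```python
def gb_per_day(mbps):
    """Decimal gigabytes moved in one day at a sustained rate in Mbps."""
    bytes_per_second = mbps * 1_000_000 / 8
    return bytes_per_second * 86_400 / 1e9


def cpu_days_supported(mbps, gb_per_cpu_day=1.0):
    """CPU-days of work per day that a link can feed at a given data ratio."""
    return gb_per_day(mbps) / gb_per_cpu_day


for rate in (25, 70):
    print(rate, "Mbps ->", round(gb_per_day(rate)), "GB/day")
```

Under these assumptions, the 25 Mbps load limit corresponds to about 270 GB per day and the 70 Mbps campus link to about 756 GB per day.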

How to get and retain users • Graphics are important • But monitors do burn in • Teams: users recruit other users • Keep users informed • Science news • System management news • Periodic project emails

Reward users • PDF certificates • Milestone pages and emails • Leader boards (overall, country, …) • Class pages • Personal signal page

Let users express themselves • User profiles • Online poll • Newsgroup (sci.astro.seti) • Message boards • Learn about users

Users are competitive • Patched clients, benchmark wars • Results with no computation • Intentionally bad results • Team recruitment by spam • Sale of accounts on eBay • Accounting is tricky

Anything can be reverse engineered • Patched version of client • Efforts at self-checksumming • Replacement of FFT routine • Bad results • Digital signing: doesn’t work • Techniques for verifying work

Users will help if you let them • Web-site translations • Add-ons • Server proxies • Statistics DB and display • Beta testers • Porting • Open-source development • (will use in SETI@home II)

Client: mechanism, not policy • Error handling, versioning • Load regulation • Let server decide • Reasonable default if no server • Put in a level of indirection • Separate control and data
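One way to read "let server decide" together with "reasonable default if no server" is sketched below: the client ships only with fallback values and overrides them with whatever the server sends. The setting names (`retry_delay_s`, `max_work_units`) and the dictionary-shaped server reply are hypothetical and are not taken from the SETI@home client.

```python
# Fallback values used when the server is unreachable (hypothetical settings).
DEFAULT_POLICY = {
    "retry_delay_s": 3600,   # how long to wait before contacting the server again
    "max_work_units": 1,     # how much work to request at once
}


def effective_policy(server_reply):
    """Merge server-supplied settings over the built-in defaults.

    `server_reply` is a dict of settings, or None if no server answered.
    Keys the client does not know are ignored, so a newer server can send
    extra settings to an older client without breaking it.
    """
    policy = dict(DEFAULT_POLICY)
    if server_reply:
        for key, value in server_reply.items():
            if key in policy:
                policy[key] = value
    return policy


print(effective_policy(None))
print(effective_policy({"retry_delay_s": 60, "new_setting": True}))
```

Keeping the policy on the server means load regulation can be changed without shipping a new client, which fits the slide's theme.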

Cross-platform is manageable • Windows, Mac are hard • GNU tools and POSIX rule

Server reliability/performance • Hardware • Air conditioning, RAID controller • Software • Database server • Architect for failure • Develop diagnostic tools

What’s next for public computing? • Better handling of large data • Network scheduling • Reliable multicast • Expand computation model • Multi-application platform • Economic model
