AstroGrid’s trial Data Grids Experiences with GT3 Guy Rixon July 2003
Themes • Grid metaphor; approaches • OGSI cone-search services • OGSA-DAI • OGSI image-servers • “MySpace” • The quality problem
What does the “data grid” metaphor mean? • Simple • Uniform • Accessible Or • Complex • Heterogeneous • Hidden
What’s the commodity? “Grid” implies and requires some commodity! • Compute cycles: • S/w installations: • Archived data: • Views of data: • Storage: • Transport:
Types of service [Diagram; labels: On-line services, Data-selection services, Trad. data centres, SOAP, OGSI, OGSA-DAI, A/G 2002 prototypes, A/G 2003 prototypes]
Cone search services • Data-selection grid services • For tabular data • Initially three data-sets • One service per data set • Metadata only to browser • Bulk results via file servers • Only HTTP. • No data consumers!
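A cone search selects the rows of a table whose sky position lies within a given angular radius of a target position. The sketch below illustrates that selection step only, on an in-memory table; the function names, the `ra`/`dec` column names, and the sample rows are assumptions made for illustration, not the AstroGrid services or their interface.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine form: numerically stable for the small angles typical of cone searches.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def cone_search(rows, ra, dec, radius_deg):
    """Return the rows lying within radius_deg of the position (ra, dec)."""
    return [row for row in rows
            if angular_separation_deg(ra, dec, row["ra"], row["dec"]) <= radius_deg]

# Hypothetical three-row catalogue, for illustration only.
CATALOGUE = [
    {"id": "a", "ra": 10.0, "dec": 20.0},
    {"id": "b", "ra": 10.05, "dec": 20.0},
    {"id": "c", "ra": 190.0, "dec": -20.0},
]

matches = cone_search(CATALOGUE, ra=10.0, dec=20.0, radius_deg=0.1)
```

A real service would push this filter down into the database rather than scanning every row; the linear scan here is only meant to make the selection rule explicit.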
Cone-search services (2) • Semi-OGSI construction: • Built with Globus Toolkit 3 alpha 2. • Java grid-services in Tomcat call Perl scripts. • RPC-style operations. • Data management built into services • Done by local scripts, not by grid services • Conflates data and service grids • Easy coding! • Services: trivial • Clients: harder
OGSA-DAI • Brief flirtation, ran out of time. • Successfully wrapped FIRST object-catalogue (~80MB, 750,000 rows) • Tried to make cone-search client: too hard! • Data grid: • Not possible with OGSA-DAI 1.5 • Try again with v2.x • Document-based service: hard to write clients for.
Anglo-Australian demo • Drives a volumetric renderer with service and data grids. • In collaboration with AusVO (Barnes et al.) • To be shown at IAU 2003.
Anglo Australian Demo (2) • Uses Globus Toolkit 3 beta. • Uses GridFTP. • RPC services throughout. • Stateful (tidies up locks and temp. files). • Some security. • Infrastructure grief limits development. • See it at: • IAU 2003 • Super Computing 2003 • All-hands 2003
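The "stateful (tidies up locks and temp. files)" point can be illustrated with a small sketch of the general pattern: acquire a lock and a scratch area for a job, and release both even if the job fails. This is an assumed, simplified stand-in written with the Python standard library, not the demo's actual GT3 code; `job_workspace` and the `job.lock` file name are hypothetical.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def job_workspace(lock_dir):
    """Hold an exclusive lock file and a scratch directory for one job."""
    lock_path = os.path.join(lock_dir, "job.lock")  # hypothetical lock name
    # O_EXCL makes creation fail if another job already holds the lock.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    os.close(fd)
    try:
        # The scratch directory is deleted when this inner block exits.
        with tempfile.TemporaryDirectory() as scratch:
            yield scratch
    finally:
        # Runs on success and on failure, so the lock is never left behind.
        os.remove(lock_path)
```

A caller would write `with job_workspace("/some/dir") as scratch:` and place intermediate files under `scratch`; on exit, whether normal or through an exception, the scratch directory and lock file are both removed.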
“MySpace” (2) • Pure data-grid. • Handles files and relational tables. • Tree of logical files spans VO. • Permanent “home” space + leased “scratch” space. • Relational (OGSA-DAI) nodes embedded in tree. • One MySpace agent at each service site. • Central registry of MySpace contents. • MySpace Explorer; cf. Windows Explorer.
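The combination of a logical file tree, permanent "home" space, and leased "scratch" space can be sketched as a small data structure. Everything below is a hypothetical illustration of the idea: the class names, the `site` field, and the lease handling are assumptions, not the MySpace design or its API.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Node:
    name: str
    site: Optional[str] = None          # assumed: site holding the physical copy
    expires_at: Optional[float] = None  # None = permanent; timestamp = leased
    children: Dict[str, "Node"] = field(default_factory=dict)

class LogicalTree:
    """Toy registry mapping logical paths to the site that stores the data."""

    def __init__(self):
        self.root = Node("")

    def put(self, path, site, lease_seconds=None, now=None):
        """Register a logical file; a lease makes the entry temporary."""
        now = time.time() if now is None else now
        node = self.root
        for part in path.strip("/").split("/"):
            node = node.children.setdefault(part, Node(part))
        node.site = site
        node.expires_at = None if lease_seconds is None else now + lease_seconds
        return node

    def lookup(self, path, now=None):
        """Return the storing site, or None if unknown or the lease has expired."""
        now = time.time() if now is None else now
        node = self.root
        for part in path.strip("/").split("/"):
            node = node.children.get(part)
            if node is None:
                return None
        if node.expires_at is not None and node.expires_at <= now:
            return None
        return node.site
```

For example, an entry stored with `put("/home/user/catalogue.vot", "site-a")` never expires, while `put("/scratch/user/tmp.fits", "site-b", lease_seconds=60)` stops resolving once the lease runs out. The `now` parameter exists only so that expiry can be exercised deterministically.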
Lessons • Services are easy; clients are hard. • Registry matters, even at trivial scale. • Web portal is very hard. • Quality threshold is very high. • Small errors get magnified in big grid. • State holding makes it hard to recover from errors. • System is only ever as good as its error reports. • Infrastructure failures hard to trace and fix. • Results limited by infrastructure quality. • GT3: arghhhhhhhhhh! Noooooo!
The quality crisis Stateful, distributed, semi-autonomous systems demand extreme quality. Current Grid infrastructure does not meet this standard. [Diagram; panels: “Old style…”, “New style…”; markers: “You are here”, “You are here (old tech.)”, “You are here (new tech.)”]
Responses to quality crisis • Wait for better infrastructure? • But: “Good software takes 10 years” — Spolsky • Avoid grid? Or just GT3? • Give up? • Simplify? • Take control: • Make own infrastructure to sufficient standard • Get APIs standardized, then done properly. • (Delegate to industrial partners?) • Only return to application programming when infrastructure is sound.