Realizing LIGO Virtual Data
Caltech: Roy Williams, Albert Lazzarini, Phil Ehrens, Kent Blackburn
ISI: Laura Pearlman, Leila Meshkat, Gaurang Mehta, Carl Kesselman, Ewa Deelman
UWM: Scott Koranda, Bruce Allen
GriPhyN/LIGO prototype 10/2001

Outline
• LIGO experiment and LDAS (LIGO Data Analysis System)
• General and LIGO-specific Virtual Data scenarios
• Prototype overview
• User interface
• Request interpreter
• Request planner (Replica Catalog, G-DAG)
• Security model
• Request execution (Condor-G/DAGMan)
• Future plans

The Virtual Data Grid (VDG) Model
• Data suppliers publish data to the Grid
• Users request raw or derived data from the Grid, without needing to know
  • Where data is located
  • Whether data is stored or computed
• Users can easily determine
  • What it will cost to obtain data
  • Quality of derived data
• VDG serves requests efficiently, subject to global and local policy constraints

LIGO Experiment (Laser Interferometer Gravitational-Wave Observatory)
• Aims to detect gravitational waves predicted by Einstein’s general theory of relativity
• Can be used to detect
  • binary pulsars
  • mergers of black holes
  • “starquakes” in neutron stars
• Two installations: Livingston (Louisiana) and Hanford (Washington State)
• Other projects: Virgo (Italy), GEO (Germany), TAMA (Japan)
• Besides the gravitational-wave sensors, many other types of sensors
• Data collected during experiments is a collection of multi-channel time series
• Analysis is performed in the time and Fourier domains

LIGO Experiment (Laser Interferometer Gravitational-Wave Observatory)
[Figure: data-flow diagram. Labels: Interferometer; raw channels; archive; short time frames; long time frames; clean; transpose; single frame; time-frequency image (axes: Hz, Time); find candidate; store; event DB]

LDAS: LIGO Data Analysis System
[Figure: LDAS architecture. Labels: managerAPI with assistant managers (Asst Mgr); frameAPI; dataConditionAPI; processing steps (statistical summary, down-select channels, line removal and bandwidth reduction, concatenate series, regression analysis); LIGO_LW files (light-weight, XML-based); e-mail with job status and location of results; delivery by anonymous FTP, web server, or e-mail]

Virtual Data Scenario
• (LIGO) “Conduct a pulsar search on the data collected from Oct 16 2000 to Jan 1 2001”
[Figure: GriPhyN box. Input: LIGO Data Specification (XML). Output: LIGO Data Product (XML)]

The “GriPhyN box” functionality
• For each requested data value, need to
  • Understand the request
  • Determine if it is instantiated; if so, where; if not, how to compute it
  • Plan data movements and computations required to obtain all results
  • Execute this plan
  • Monitor progress
  • Make requested value available

LIGO’s virtual data (in prototype)
• Virtual Data Products
  • Full frame for a specific time interval
  • Individual channel for a specific time interval
  • Decimated channel for a specific time interval
• Transformations
  • Extract(channelname, frame)
  • Concatenate(frame1, frame2)
  • Decimate(frame)

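The three transformations can be pictured on a toy frame model. This is only a sketch under stated assumptions: a frame is modeled as a start time, a sample rate, and per-channel sample lists, which is not the real LDAS frame format; the `factor` argument of `decimate`, the `Frame` class, and the `AUX` channel are hypothetical; and decimation here simply keeps every n-th sample with no anti-alias filtering.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Frame:
    start: int                       # GPS start time in seconds
    rate: int                        # samples per second, shared by all channels
    channels: Dict[str, List[float]]


def extract(channel_name: str, frame: Frame) -> Frame:
    """Keep a single named channel from a frame."""
    samples = list(frame.channels[channel_name])
    return Frame(frame.start, frame.rate, {channel_name: samples})


def concatenate(frame1: Frame, frame2: Frame) -> Frame:
    """Join two frames that are adjacent in time and have the same layout."""
    duration1 = len(next(iter(frame1.channels.values()))) // frame1.rate
    if frame1.rate != frame2.rate or frame1.start + duration1 != frame2.start:
        raise ValueError("frames must be contiguous and share a sample rate")
    if frame1.channels.keys() != frame2.channels.keys():
        raise ValueError("frames must carry the same channels")
    merged = {name: frame1.channels[name] + frame2.channels[name]
              for name in frame1.channels}
    return Frame(frame1.start, frame1.rate, merged)


def decimate(frame: Frame, factor: int = 2) -> Frame:
    """Naive decimation: keep every `factor`-th sample of each channel."""
    if frame.rate % factor:
        raise ValueError("factor must divide the sample rate")
    reduced = {name: samples[::factor]
               for name, samples in frame.channels.items()}
    return Frame(frame.start, frame.rate // factor, reduced)


# Two one-second frames at 4 samples per second (toy values).
f1 = Frame(65800000, 4, {"H2:LSC-AS_Q": [0.0, 1.0, 2.0, 3.0], "AUX": [9.0] * 4})
f2 = Frame(65800001, 4, {"H2:LSC-AS_Q": [4.0, 5.0, 6.0, 7.0], "AUX": [9.0] * 4})

# A "decimated channel for a specific time interval" as a chain of transformations.
product = decimate(extract("H2:LSC-AS_Q", concatenate(f1, f2)))
print(product)
```

Composing the functions this way mirrors the idea that a virtual data product is defined by the transformations that derive it, so it can be recomputed when no stored copy exists.
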
GriPhyN/LIGO box functionality
• Understand an XML-specified request
• Acquire user’s proxy credentials
• Consult replica catalog for available data
• Construct a plan to produce data not available
• Execute the plan
• Return requested data in Frame or XML format
[Figure: GriPhyN/LIGO box. Input: LIGO-specific Data Specification (XML). Output: LIGO Data Product (XML)]

[Figure: prototype architecture. Labels: HTTP frontend; CGI interface (xml); MyProxy server; Planner; Transformation Catalog; Replica Catalog; Monitoring; MDS; G-DAG (DAGMan); Executor (Condor-G/DAGMan); Logs; GridCVS; Storage Resource (GridFTP, GRAM); Compute Resource (GridFTP, GRAM/LDAS); LDAS]

Request Interpreter
    <?xml version="1.0"?>
    <!DOCTYPE LIGO_LW SYSTEM "LIGO-LW.dtd">
    <LIGO_LW Name="banana" Type="VirtualDataRequest">
      <Time Name="StartTime" Type="GPS">65800000</Time>
      <Time Name="EndTime" Type="GPS">65800010</Time>
      <LIGO_LW Type="ChannelSpecification">
        <Param Name="Detector">LHO,LLO</Param>
        <Param Name="RegEx">H:LSC-AS_Q</Param>
      </LIGO_LW>
      <Param Name="ResponseFormat">LIGO-LW</Param>
      <Param Name="ResponseDomain">dc-user.isi.edu</Param>
      <Param Name="ResponseLocation">//dc-n1.isi.edu/scratch/myfile.xml</Param>
    </LIGO_LW>

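A request like this one could be read with Python's standard `xml.etree.ElementTree`. The sketch below is not the prototype's interpreter; `parse_request` and the keys of the returned dictionary are hypothetical names, the XML declaration and DOCTYPE are left out for brevity, and the root element is closed as `</LIGO_LW>` so that the document is well-formed.

```python
import xml.etree.ElementTree as ET

REQUEST = """
<LIGO_LW Name="banana" Type="VirtualDataRequest">
  <Time Name="StartTime" Type="GPS">65800000</Time>
  <Time Name="EndTime" Type="GPS">65800010</Time>
  <LIGO_LW Type="ChannelSpecification">
    <Param Name="Detector">LHO,LLO</Param>
    <Param Name="RegEx">H:LSC-AS_Q</Param>
  </LIGO_LW>
  <Param Name="ResponseFormat">LIGO-LW</Param>
  <Param Name="ResponseDomain">dc-user.isi.edu</Param>
  <Param Name="ResponseLocation">//dc-n1.isi.edu/scratch/myfile.xml</Param>
</LIGO_LW>
"""


def parse_request(xml_text):
    """Turn a VirtualDataRequest document into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    # Time elements are direct children of the root.
    times = {t.get("Name"): int(t.text) for t in root.findall("Time")}
    # iter() also visits the Param elements nested in the channel specification.
    params = {p.get("Name"): p.text.strip() for p in root.iter("Param")}
    return {
        "name": root.get("Name"),
        "start": times["StartTime"],
        "end": times["EndTime"],
        "detectors": params["Detector"].split(","),
        "channel_regex": params["RegEx"],
        "response_format": params["ResponseFormat"],
        "response_location": params["ResponseLocation"],
    }


request = parse_request(REQUEST)
print(request)
```

The values are stripped of surrounding whitespace because the slide shows some parameters padded with spaces.
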
Request Planning (C_A_100, C_A_101)
• Optimize planning decisions with respect to the final data destination (final_dest = UWM)
• Consult the Replica Catalog for data existence
• Determine which data needs to be produced (C_A_100 at ISI; C_A_101 not present)
• Map the request into a DAG of grid operations that need to be performed
• Template instantiation

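The planning decision for this example can be sketched in a few lines. Everything here is an illustrative assumption rather than the prototype's planner: the replica catalog is reduced to a dictionary from logical name to the sites holding a copy, `plan_request` and the action tuples are hypothetical, and missing data is simply produced at the final destination.

```python
def plan_request(requested, replica_catalog, final_dest):
    """Return an ordered list of abstract actions for the requested products."""
    actions = []
    for name in requested:
        sites = replica_catalog.get(name, [])
        if final_dest in sites:
            continue  # already at the destination, nothing to do
        if sites:
            # A replica exists elsewhere: move it to the destination.
            actions.append(("transfer", name, sites[0], final_dest))
        else:
            # No replica anywhere: the product must be computed.
            actions.append(("produce", name, final_dest))
        # Either way, record the new copy so later requests can reuse it.
        actions.append(("register", name, final_dest))
    return actions


catalog = {"C_A_100": ["ISI"]}          # C_A_101 is not present anywhere
plan = plan_request(["C_A_100", "C_A_101"], catalog, final_dest="UWM")
for step in plan:
    print(step)
```

Each abstract action would then be expanded into concrete grid operations by template instantiation.
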
Components
• Replica Catalog
• Template Instantiation
• Execution Environment
• Security

LIGO Replica Catalog Structure
• Replica Catalog
  • Logical Collection for times 638834000-638834500
    • Filename: H-638834071.T
    • Filename: H-638834271.T
    • …
    • Location dataserver.uwm.edu
      • Filename: H-638834071.T
      • Filename: H-638834271.T
      • Filename: …
      • Protocol: gridftp
      • UrlConstructor: gsiftp://dataserver.phys.uwm.edu/griphyn_test
    • Location dc-n1.isi.edu
      • Filename: H-638834071.T
      • …
      • Protocol: gridftp
      • UrlConstructor: gsiftp://dc-n1.isi.edu/pub/ligo2
    • Logical File H-638834071.T
    • Logical File H-638847271.T
    • Logical file attribute shown: Size: 506214400
  • Logical Collection for times 638834500-638835000

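The catalog structure suggests how a logical file name is resolved to physical URLs. The sketch below is an in-memory stand-in, not the Globus Replica Catalog API: the class and function names are hypothetical, a physical URL is assumed to be the location's UrlConstructor followed by "/" and the file name, and each collection is assumed to cover a half-open GPS interval.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Location:
    host: str
    protocol: str
    url_constructor: str
    filenames: List[str]


@dataclass
class LogicalCollection:
    gps_start: int
    gps_end: int
    locations: List[Location] = field(default_factory=list)


def find_collection(catalog, gps_time) -> Optional[LogicalCollection]:
    """Pick the collection whose [gps_start, gps_end) interval holds gps_time."""
    for collection in catalog:
        if collection.gps_start <= gps_time < collection.gps_end:
            return collection
    return None


def physical_urls(collection, filename):
    """List one URL per location that holds the logical file."""
    return [f"{loc.url_constructor.rstrip('/')}/{filename}"
            for loc in collection.locations if filename in loc.filenames]


catalog = [
    LogicalCollection(638834000, 638834500, [
        Location("dataserver.uwm.edu", "gridftp",
                 "gsiftp://dataserver.phys.uwm.edu/griphyn_test",
                 ["H-638834071.T", "H-638834271.T"]),
        Location("dc-n1.isi.edu", "gridftp",
                 "gsiftp://dc-n1.isi.edu/pub/ligo2",
                 ["H-638834071.T"]),
    ]),
    LogicalCollection(638834500, 638835000),
]

collection = find_collection(catalog, 638834071)
urls = physical_urls(collection, "H-638834071.T")
print(urls)
```

Returning every matching replica, rather than the first one, leaves the choice of source site to the planner.
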
Template instantiation
• Input: C_A_100 in dc.isi.edu/frames
• Output location: host.uwm.edu/myframes
• Abstract G-DAG
  • globus_url_copy X from a to b
  • Register X in RC with location b
• Concrete G-DAG (DAGMan)
  • globus_url_copy C_A_100 from dc.isi.edu/frames to host.uwm.edu/myframes
  • Register C_A_100 in RC with location host.uwm.edu/myframes

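Instantiating the transfer template amounts to binding X, a, and b to concrete values. A minimal sketch with Python's `string.Template` follows; `TRANSFER_TEMPLATE` and `instantiate` are hypothetical names, and the resulting strings are node descriptions in the slide's wording, not literal command lines.

```python
from string import Template

# Abstract G-DAG for a plain transfer: copy the file, then register the new replica.
TRANSFER_TEMPLATE = [
    Template("globus_url_copy $X from $a to $b"),
    Template("Register $X in RC with location $b"),
]


def instantiate(template, **bindings):
    """Bind every placeholder; substitute() raises KeyError if one is missing."""
    return [step.substitute(bindings) for step in template]


concrete = instantiate(
    TRANSFER_TEMPLATE,
    X="C_A_100",
    a="dc.isi.edu/frames",
    b="host.uwm.edu/myframes",
)
for node in concrete:
    print(node)
```

Using `substitute` rather than `safe_substitute` means an unbound variable fails loudly instead of leaving a stray placeholder in the concrete DAG.
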
Templates
[Figure: G-DAG templates. Every template starts with "globus_url_copy X from a to b"; the remaining step labels in the figure are listed below]
• globus_url_copy Y from c to b
• Execute LDAS_concat X,Y at b
• globus_url_copy Z from b to d
• Register Z in RC with location d
• Execute LDAS_extract C_Y from X at b
• globus_url_copy C_Y from b to c
• Register C_Y in RC with location c
• Execute decimate on X at b

Execution Environment
[Figure: execution environment. Labels: DAGMan; Condor-G; GRAM and GridFTP endpoints at Caltech, UWM, and ISI; LDAS; compute resources]

Security Model
• User needs to register a proxy certificate with the MyProxy server
• User to Prototype: user’s username and password (SSL connection)
• Authentication between Prototype and MyProxy server: Prototype’s userid and certificate, plus the user’s username and password
• MyProxy server to Prototype: user’s proxy credential
• GSI authentication to GridFTP and Condor-G, with $X509_USER_PROXY = user’s proxy credential

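The last step, pointing grid tools at the user's proxy credential, can be sketched as building a per-request environment. This is an illustration only: `gsi_environment` and the proxy path are hypothetical, and the sketch does not contact a MyProxy server or launch any grid command.

```python
import os


def gsi_environment(proxy_path, base_env=None):
    """Copy an environment and point X509_USER_PROXY at the user's proxy file."""
    env = dict(os.environ if base_env is None else base_env)
    env["X509_USER_PROXY"] = proxy_path
    return env


# Hypothetical proxy location; the copy could be passed as env= to subprocess.run().
base = {"PATH": "/usr/bin"}
env = gsi_environment("/tmp/proxy_for_request_11397", base)
print(env)
```

Working on a copy keeps one user's credential from leaking into the process-wide environment, which matters when a single front end serves requests from several users.
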
Secure, GSI-enabled interface to LDAS
[Figure: Labels: Condor-G; RSL-specified job; GRAM; Globus; LDAS/Globus interface; Tcl; userAPIs; LDAS commands; job ID; managerAPI with assistant managers (Asst Mgr); e-mail with job status and location of results; LIGO_LW files (light-weight, XML-based); data in LDAS space; gsiftp:// (GridFTP); anonymous FTP, web server, e-mail]

The JOB ID = 11397
the Desired channel name is H2:LSC-AS_Q with timestamp 65800000
writing sub file transfer_a2b_1.11397.sub
WILL TRANSFER H-65800000.F from gsiftp://dc-n1.isi.edu/ligodata/frames/H-65800000.F to gsiftp://dataserver.phys.uwm.edu/grid_incoming/H-65800000.F
Writing Sub File transform.11397.sub
my infile is H-65800000.F, my outfile is H-H2:LSC-AS_Q-65800000.xml11397
Will apply Transformation
WILL GET RESULT H-H2:LSC-AS_Q-65800000.xml from gsiftp://dataserver.phys.uwm.edu/grid_outgoing/H-H2:LSC-AS_Q-65800000.xml11397 to gsiftp://dc-n1.isi.edu/OUTPUT/H-H2:LSC-AS_Q-65800000.xml
the data will be available at http://dc-n1.isi.edu/OUTPUT/H-H2:LSC-AS_Q-65800000.xml
**********JOB SUBMITTED************

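The log shows one submit file per step of a transfer, transform, transfer chain. A sketch of how such a chain could be written as a DAGMan input file, using DAGMan's `JOB` and `PARENT ... CHILD` statements, is given below. The node names and the third submit file name (`transfer_b2a_1.<id>.sub`) are hypothetical, since the log does not show them, and `dag_text` is not the prototype's code.

```python
def dag_text(job_id):
    """Build a linear DAGMan description: stage in, transform, stage out."""
    jobs = [
        ("stage_in", f"transfer_a2b_1.{job_id}.sub"),    # name seen in the log
        ("transform", f"transform.{job_id}.sub"),        # name seen in the log
        ("stage_out", f"transfer_b2a_1.{job_id}.sub"),   # hypothetical name
    ]
    lines = [f"JOB {name} {submit_file}" for name, submit_file in jobs]
    # Chain each node to its successor so the steps run strictly in order.
    lines += [f"PARENT {parent[0]} CHILD {child[0]}"
              for parent, child in zip(jobs, jobs[1:])]
    return "\n".join(lines) + "\n"


dag = dag_text(11397)
print(dag)
```

With the dependencies declared, DAGMan would only start the transformation after the input frame has arrived, and only fetch the result after the transformation has finished.
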
Outlook
Short term (SC’2001)
• Integrate with RC
• Integrate with MyProxy
• Better job monitoring
Longer term
• Support a richer set of Virtual Data products
• Implement and incorporate the Transformation Catalog
• Investigate Derived Data Catalog (abstract DAGs)
• Investigate more complex planning techniques
• Query estimation
• Execution
• Error handling and recovery
• Alternative strategies

Prototype to Tool
• Enable a close collaboration between LIGO participants and other gravitational-wave communities such as Virgo and GEO
• Replica Catalogs to provide information about data existence (register newly selected data in the catalog to make it available)
  • Including materialized data (no need to recompute)
• Provide access to data analysis software and systems (Transformation Catalog)
• Use the prototype as the basis for a data exchange and replication system (security)
• Provide access to world-wide computing resources

DEMO TONIGHT