
Alexei Klimentov Brookhaven National Laboratory

<Transient> Datasets Deletion and Tasks Obsoletion Procedures. ADC weekly meeting, 18 March 2014.


Presentation Transcript


  1. <Transient> Datasets Deletion and Tasks Obsoletion Procedures. ADC weekly meeting, 18 March 2014. Alexei Klimentov, Brookhaven National Laboratory

  2. Introduction
     • General procedure for tasks obsoletion
       • Task is obsoleted by the task submitter
       • Obsoleted tasks are checked twice per week and the corresponding datasets are marked for deletion
       • Initial scenario
       • The 1 week grace period before a dataset is deleted isn't respected anymore
     • HLT tasks
       • 3 months lifetime. Weekly tasks obsoletion with 1 week grace period
     • Group production
       • Transient datasets deletion. Every 2 weeks, no grace period (with exception list from GPC)
       • Sporadically. Semi-automatic. Dataset patterns are defined by Group Production Coordination and kept in an SVN repository. 1 week grace period
     • Reprocessing
       • After the Reprocessing Coordinator confirms that the period (or campaign) has finished
       • Final datasets produced and validated
       • Dataset pattern(s) is provided
     • MC transient datasets deletion (unmerged HITS and unmerged AOD)
       • 2009 procedure (BPK scripts to update the postproduction task field, followed by tasks obsoletion and/or datasets deletion)
       • Fall 2013: automatic transient datasets deletion (replacement of BPK scripts)
       • Use cases: mc%AOD, mc%HITS, AF
       • MC Production team defines project name(s)
       • Several "dry runs" in Jun, Oct, Nov to validate the new procedure
       • In production since Dec 2013. Monthly check.
       • Steady concern and uncertainty: the task request table has a 'container like dataset name' as the INPUTDATASET task parameter, though in many cases (probably all) the TID dataset is used.
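To make the pattern-plus-grace-period idea on this slide concrete, here is a minimal sketch. It assumes the `%` in patterns such as mc%AOD is an SQL LIKE-style wildcard, and that each dataset carries the date it was marked for deletion. The function names, the record layout, and the dataset names in the test data are hypothetical and not taken from the production scripts.

```python
import re
from datetime import date, timedelta


def like_to_regex(pattern):
    """Translate a LIKE-style pattern ('%' = any run of characters) into a regex."""
    parts = [re.escape(piece) for piece in pattern.split("%")]
    return re.compile("^" + ".*".join(parts) + "$")


def deletion_candidates(datasets, patterns, today, grace_days):
    """Return names that match any pattern and were marked at least grace_days ago.

    datasets: iterable of (name, marked_on) pairs (hypothetical layout).
    """
    regexes = [like_to_regex(p) for p in patterns]
    cutoff = today - timedelta(days=grace_days)
    return [
        name
        for name, marked_on in datasets
        if marked_on <= cutoff and any(r.match(name) for r in regexes)
    ]


# Hypothetical records: (dataset name, date marked for deletion)
EXAMPLE = [
    ("mc12_example.HITS", date(2014, 3, 1)),
    ("mc12_example.AOD", date(2014, 3, 17)),
    ("data12_example.AOD", date(2014, 3, 1)),
]
```

Calling `deletion_candidates(EXAMPLE, ["mc%AOD", "mc%HITS"], date(2014, 3, 18), grace_days=7)` keeps only datasets marked on or before 11 March, while `grace_days=0` corresponds to the "no grace period" case.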

  3. March 4-8 MC Tasks Obsoletion
     • Task ID range: 1.350.000 – 1.435.000
     • 12708 tasks
     • 15% of tasks with project mc12_14TeV
       • New project since the Feb check
     • 2.7 PB deleted (according to TK)
     • "Too much deletion": 48 tasks (identified by RW)
       • The earliest input dataset task on Dec 22
       • The latest input dataset task on Jan 3
     • Actions: revisit the procedure
       • Add extra check for AF tasks
       • Add extra check for ALL tasks
       • Parent task ID check
       • Thanks to Rod, Sasha, Wolfgang for suggestions/discussion
     • Recovering:
       • Rerun fast simulation tasks
       • Interactive cloning of tasks from the above list
       • Script to reinsert database records
     • April deletion: dry run first
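The slide does not say how the parent task ID check works. One possible reading, sketched below, is that a task's output should not be deleted while any non-obsolete task still names it as its parent. The `Task` record, the status labels, and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Task:
    task_id: int
    status: str                # hypothetical labels, e.g. "obsolete", "finished"
    parent_id: Optional[int]   # task whose output this task reads, if any


def safe_to_delete(task_id: int, tasks: List[Task]) -> bool:
    """True if every task naming task_id as its parent is itself obsolete."""
    return all(t.status == "obsolete" for t in tasks if t.parent_id == task_id)


def split_candidates(candidates: List[int], tasks: List[Task]) -> Tuple[List[int], List[int]]:
    """Dry-run style report: (task IDs safe to delete, task IDs held back)."""
    safe = [c for c in candidates if safe_to_delete(c, tasks)]
    held = [c for c in candidates if not safe_to_delete(c, tasks)]
    return safe, held
```

Returning both lists instead of deleting anything fits the "dry run first" approach: the held-back list can be reviewed by hand before any data is removed.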
