
DM/SMD section meeting, 18/1/08




  1. DM/SMD section meeting, 18/1/08 • German Cancio

  2. Outline • Your new SL • The section: Mandate, people, activities, MARS • Assignments of people • Planning and tracking • Discussion • Note: regular section Meetings: • Bi-weekly, Fridays @ 14:00 in 513-1-027, starting next week

  3. $ finger gcancio • Spanish/Swiss, time_t gcancio.tcreate() == 16934400, grew up in Spain->Switzerland->Spain • Computer Science studies in Madrid, 1 semester in Chambery • Focus on Artificial Intelligence, UNIX • 1995-6: Undergraduate fellowship @ UPM Madrid • Work on intelligent client/server information delivery middleware and object-oriented rule-based expert systems • 1997-9: tech student, then fellow @ CERN/IT • MSc on automated software build + distribution/installation system development (ASIS), service management (~4K nodes) • Linux kernel module programming; AIMS installation server • 2000-present: staff member @ CERN/IT • 2000-2003: EU DataGrid project: Architect, then deputy WP Manager for Fabric Management; participation in the overall DataGrid architecture. Tools: LCFG, Quattor, Lemon • 2004-2005: Deployment of Quattor/Lemon at CERN-CC, replacing legacy solutions (SUE/ASIS). Deployment of Quattor outside CERN (now ~25 sites) • 2006-2007: Section Leader IT/FIO/FD, looking after CASTOR + Fabric Management developments (ELFms: Quattor+Lemon+LEAF), Remedy, SLS, SDB
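The `time_t` value on the slide above reads as a Unix timestamp, i.e. seconds since 1970-01-01 00:00:00. A minimal sketch of decoding it, assuming the value is meant in UTC (the slide does not say), and noting that `tcreate()` is the slide's own joke notation rather than a real call:

```python
from datetime import datetime, timezone

# Value quoted on the slide; interpreting it as UTC is an assumption.
TCREATE = 16934400


def epoch_to_utc(seconds: int) -> datetime:
    """Convert seconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


created = epoch_to_utc(TCREATE)
print(created.date().isoformat())
```

Since 16934400 = 196 × 86400, the value falls exactly on a day boundary, 196 days after the epoch.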

  4. The section: Mandate • No formal mandate yet, but cf. slides from last Group Meeting: • “Enable synergies between the merged development teams” • “Strong focus on ongoing activities: maintenance of Grid software (DPM, LFC, FTS and lcg_utils) and Castor2 as the maintained production environment.” • “Data Management project [led by Jean-Philippe] is attached to the SMD section and will focus in research and development activities in strategic directions”

  5. The section: People • 10 LD staff, 1 IC • 4 fellows (-1 in 03/08) • 1 technical student, 1 visitor • Manpower retention and replacement (contract/people) will be a challenge

  6. The section: Activities 4 main areas: • CASTOR(2) • first LHC production environment! • Address user + operational problems (at CERN and T1’s using CASTOR) with top priority • Prioritise RFE’s and complete agreed developments • Grid Data Management tools • DPM/LFC, FTS, lcg_utils/GFAL • Address user/operational problems and RFE’s • Fulfill EGEE commitments (e.g. biomed requirements, management/accounting overhead) • Build/test environments • For all DM group developments; investigate and maintain • Data Management R&D project • Preparing for the medium/long term future • 2008 investigation areas: Tape interface, remote access protocols, file systems (local/shared) • Activities and plans to be defined by DM PL (Jean-Philippe)

  7. Activities: CASTOR2 • Coordination with operations team at CERN and T1’s • monthly deployment meeting; daily morning meeting; bi-weekly external operations phonecon; bi-yearly f2f workshop • Coordination with users • Stabilize new 2.1.6 version, increase maturity for CCRC08 phase II (2.1.7?) • Synergies with DPM • Complete common RFIO project (Q1 08); RFIO extensions (e.g. complete DIRECT_IO) • Patch exchange with LFC / DPMns (Q1 08) • Pending developments • Security: complete strong authentication; VOMS authorization incl. ‘legacy’ mode (Q2 08); study CUPV replacement • Repack2: evaluation beginning of March. Understand what problems we have and where they originate • VDQM2 re-engineering and extensions: evaluation beginning of April • Complete xrootd plugin developments following ALICE requirements • Closely follow SRMv2 developments at RAL • Take over unmaintained components and apply agreed bugfixes/RFE’s • VMGR, rtcpd/rtcpclientd (Q3 08) • Other (possible) developments: • Rewrite tape daemon (TBD, priority to be understood) • Oracle Advanced Queueing (message queuing) • 3rd-level support + rota • Client porting to other platforms • MacOSX, priority unclear

  8. Activities: Grid DM tools • DPM • Common RFIO • Developments: SRMCopy; complete DPM->DICOM stager IF (Q1 08); (Pool-based) quotas (Q2 08); bandwidth/stream or user limitations • Improved admin tools (e.g. dpmns/disk server consistency) • DPM performance analysis • LFC • Data encryption (Q1 08) • Patch exchange with Castor ns (Q1 08) • FTS • Developments: Adding space tokens (Q2 08); GridFTP/SRM separation (Q1 08); static clouds+non-shared star channels (Q1 08) • Maintenance/support of web services part (was Gavin) • Cross-site deployment (e.g. schema upgrades) • Documentation/FAQ’s for site admins and users (and for 2nd level support) • Lcg_utils + GFAL • Thread-safe library • Misc developments as requested by LCG/GSSD • All: Support • 1st/2nd level support (was Sophie/Gavin - to be outsourced to GS?) • 3rd level support (via

  9. Activities: Build/test environments • Build/test • Maintain current ETICS based build system (for Grid DM tools) • Maintain current build/test system for CASTOR • Investigate common solutions for the whole group (not only the section)

  10. The section: MARS • MARS timelines: • Interviews to be completed by March 14th • Written appraisals to be signed by GL’s by April 15th • All staff must be interviewed, including those with expiring contracts (“abbreviated MARS”) • Interviews: • Staff coming from FIO: 2007 results and 2008 objectives with me • Staff coming from GD: 2007 results with your previous supervisor, 2008 objectives with me • Proposal for dates to be circulated next week • Fellows: • Appraisal interview after 6 and 18 months of contract • Personally, prefer every 12 months, just after completing the staff MARS cycle

  11. The section: Allocations of people • Disclaimer: Allocations are provisional and can be discussed over the next few days. They may later require adjustments and/or changes, in particular after Q2’08 • Attempting to spread task assignments between ex-FIO and ex-GD team members, share CASTOR / Grid DM tools / DM R&D activities • sharing will gradually increase over time • Matrix structure: SL to work closely with DM project leader to ensure right allocations to activities with appropriate priority

  12. People: staff • Akos: • Castor2 security (finishing/deploying strong authentication, VOMS integration), study CUPV replacement options • Build/test: ETICS; strategy in DM group – task coordinator • EGEE-II JRA1 representative • FTS + LFC web services support • DM R&D: filesystems, remote access protocols • Rosa: • Castor2 security (with Akos) • Castor2 3rd-level support • Lcg_utils + GFAL (TBD) • DM R&D (TBD) • Robert: • LFC/DPM testing – SAM interfacing

  13. People: staff (II) • Giuseppe (staff from 3/08 on) • Castor2 support/maintenance and core framework • SRMv2 maintenance – RAL backup • DM R&D: remote access protocols, Oracle adv queueing (TBD) • Steven: • VDQM2 re-engineering, productization, prioritisation • “intermediate layer”: VMGR, rtcpd/rtcpclientd maintenance • DM R&D: TBD • Krzysztof: • DPM->DICOM stager • CSEC extensions (with Lana) • FTS maintenance / extensions

  14. People: Staff (III) • Andreas: • XROOTD (both DPM and CASTOR) • DPM bugfixes/support (http(s)) • DPM (and CASTOR?) performance testing • DM R&D: Client access protocols, remote fs (e.g. LUSTRE) – task coordinator • Sebastien: • CASTOR2 support/maintenance/development coordination • Grid Data Management responsibilities, starting with FTS (~Q3 08) • David: • Common RFIO merge, RFIO maintenance (both DPM/CASTOR – e.g. DIRECT_IO for XFS) – task coordinator • Patch exchange with CastorNS • DPM development + support • DM R&D: Client access protocols

  15. People: Staff (IV) • Giulia: • Repack2 • RFIO common (with David) • Castor2 support/maintenance • Build/test frameworks (with Akos) • Dennis: • Castor2 support/maintenance • Patch exchange with LFC / DPMns • DM R&D: Tape (taped, TBC), tape/robotic infrastructure

  16. People: Fellows, tech student • Lana: • LFC (support, data encryption) • DPM support, performance testing (with Andreas) • DM R&D test/prototyping e.g. filesystems • Chang: • DM testing • Marisa: • CASTOR2 – until end 02/08 • Remi: • GFAL + lcg_util • CSEC maintenance • DPM client tools maintenance • VOMS service manager (EGEE-II) • Grid security (TBC) • DM R&D • Paolo: • FTS development • FTS support • Build/test frameworks (TBD)

  17. Plans, support, tracking • Bugs/RFE’s, TODO’s, work plans and priorities: • Currently spread across different activities, following different formats, using different tools, discussed in different forums. Room for streamlining? • Twiki, Savannah/CASTOR, Savannah/DM, EGEE… • Sharing support activities • Each developer should be on at least two support lists • Activity tracking: • Light-weight but regular activity tracking, making it possible to understand who has been working on what (development, deployment/support, meetings, etc.) and compare with planning • cf. next slide: data collected during the Castor review 2006

  18. R 16: Credible WBS (III) • Examples: • Senior developer (Giuseppe)

  19. R 16: Credible WBS (III) • Examples: • Senior developer (Giuseppe) • Allocation: 100% Castor means in reality ~95% • Planned tasks: only 23% of Castor-dev time (21% of total time) • Not in plan: 73% (68%)

  20. R 16: Credible WBS (III) • Examples: • Senior developer (Giuseppe) • Not planned tasks: most of them of an ongoing nature (e.g. support, releases) • Development + deployment vs. support: 45% of Castor-dev time spent on support activities

  21. R 16: Credible WBS (IV) • Examples: • Junior developer (Giulia) • Allocation: 88% (training) • Planned tasks: 43% of Castor-dev time • Not planned: 53%, of which 45% development activities (mostly unforeseen) and 8% support • Development + deployment vs. support: only 15% goes to support
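The light-weight activity tracking proposed under "Plans, support, tracking" amounts to logging time per category and comparing the planned share against the rest. A rough sketch of such a summary follows; the entry format, the names `LOG` and `summarize`, and all numbers are hypothetical and are not the data from the Castor review:

```python
from collections import defaultdict

# Hypothetical log entries: (developer, category, planned?, hours).
# All names and numbers are invented for illustration only.
LOG = [
    ("dev_a", "development", True, 10.0),
    ("dev_a", "support", False, 18.0),
    ("dev_a", "meetings", False, 4.0),
    ("dev_a", "deployment", True, 8.0),
]


def summarize(entries):
    """Return the planned share and per-category shares, as percentages of total hours."""
    total = sum(hours for _, _, _, hours in entries)
    if total == 0:
        return {"planned": 0.0, "by_category": {}}
    by_category = defaultdict(float)
    planned_hours = 0.0
    for _, category, planned, hours in entries:
        by_category[category] += hours
        if planned:
            planned_hours += hours
    return {
        "planned": round(100 * planned_hours / total, 1),
        "by_category": {c: round(100 * h / total, 1) for c, h in by_category.items()},
    }


summary = summarize(LOG)
print(summary)
```

Keeping a "planned" flag on each entry, rather than a separate plan document, is one simple way to get the planned vs. not-planned split without extra tooling.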

  22. Questions || Discussion
