By: Solomon Mikael (UMBC) Advisors: Elena Vataga (UNM) & Pavel Murat (FNAL)

Development of Farm Monitoring & Remote Concatenation for CDFII Production Project


Presentation Transcript


  1. Development of Farm Monitoring & Remote Concatenation for CDFII Production Project By: Solomon Mikael (UMBC) Advisors: Elena Vataga (UNM) & Pavel Murat (FNAL)

  2. Outline • CDF Experiment • CDF Production Farm Goals & Structure • Issues with Concatenation • My Contributions 1 • Control & Monitoring • My Contributions 2 • Summary • Acknowledgments Solomon Mikael

  3. Goal of CDF Production Farm • The main goal of the Production Farm is to reconstruct available data for physics analysis as soon as possible, reprocess data when necessary, and generate Monte Carlo events

  4. CDF Experiment • CDF (the Collider Detector at Fermilab) is an international collaboration involving many universities and national laboratories • Two intense beams of protons and antiprotons meet head-on in the middle of the 100-ton solenoidal CDF detector • To observe the particles, the CDF detector is built from layers of subdetectors, each layer responsible for detecting different particle properties • Information from 1,000,000 electronic channels is recorded

  5. Production Farm - Hardware • The Production Farm consists of 150 dual-CPU PCs with a total computing power of 800 GHz • These high-throughput Linux clusters are used for event reconstruction and analysis • The Production Farm PCs have a total of 25 TB of disk space

  6. Production Farm - Software • The CDF Production Farm performs computing- and network-intensive tasks in a cost-effective manner • SAM (Sequential data Access via Metadata) is a data-handling system organized as a set of servers working together to store and retrieve files. SAM mitigates the problem of one person hogging the tape drives and/or flooding the tape system, and it provides tools for database bookkeeping • CAF (CDF Analysis Farm) is the software and control system for batch job submission, built on top of the Condor batch system

  7. Structure of CDF Farm • [Figure: diagram of the farm structure, with the concatenation stage labeled]

  8. Issues with Concatenation • In the present scheme, the concatenator and the tape uploader run at the same time, which limits I/O from the stager • The disk access rate depends on the number of simultaneous I/O operations on the RAID 5 disk array • My project at CDF entailed moving the concatenation load from the stagers to CAF to achieve higher data flow rates • [Figure: two plots of entries versus tape transfer rate (MB/s)]

  9. Structure • Stager: mergeSubmit.py analyzes the input directory, creates the .tcl file, and sends the CAF job • Worker: a script copies the input files, runs the concatenator binary, and copies the output file to the stager • Monitoring

  10. Required Skills • Before I could make any changes to the CDF farm, I had to learn how the individual parts of the farm operate and how they are interrelated: • Effective use of bash scripts and the awk text-processing language • Learning Python to modify the concatenation script mergeSubmit.py • Modifying Tikiwiki pages using the online Tikiwiki editor & web pages

  11. What’s BASH & AWK • A shell is a program that interprets commands, either typed in directly by the user or contained in a file called a shell script; bash is one such shell • Awk, named after its developers (Aho, Weinberger, and Kernighan), is a pattern-scanning and processing language that permits manipulation of structured data and generation of formatted reports

  12. Bash script that copies the .tcl file and the input files it lists, followed by the awk script temp.awk

      #!/bin/bash
      name=`basename $0`

      # Load shared helpers and parse the command-line parameters
      . ./cdfopr/scripts/common_procedures
      . ./cdfopr/scripts/parse_parameters $* > temp_parse_log

      echo $TCL_FILE
      echo $PARAM_USER
      echo $PARAM_HOST
      echo $PARAM_PATH

      # Copy the .tcl file from the remote host
      cmd="fcp -c ${RCP} ${FCP_USER}@${FCP_HOST}:${PARAM_PATH}/${TCL_FILE} ."
      $cmd
      STATUS=$?
      if [ "$STATUS" -ne 0 ]; then
          echo "$TCL_FILE was not able to be copied"
          exit 1
      fi

      # Copy every input file named on an "include file" line of the .tcl file
      for file_loc in `grep "include file /" ${TCL_FILE} | awk '{print $3}'`; do
          cmd="fcp -c ${RCP} ${FCP_USER}@${FCP_HOST}:${file_loc} ."
          $cmd
          STATUS=$?
          if [ "$STATUS" -ne 0 ]; then
              echo "$file_loc was not able to be copied"
          fi
      done

      the_num=2
      export JOB_NUM=3
      echo $SEGMENT_NUMBER
      echo $JOB_NUM
      temp.awk -v seg_num=$SEGMENT_NUMBER

      ---------------------------------------------------------------

      #!/bin/awk -f
      # Print the "include" lines that belong to the segment given by seg_num
      BEGIN {
        flag = 0;
      }
      # Segment header: if { $env(SEGMENT_NUMBER) == N } {
      /SEGMENT_NUMBER/ {
        if ($5 == seg_num) {
          flag = 1
        }
      }
      /include/ {
        if (flag == 1) {
          print $0
        }
      }
      {
        # A closing brace ends the current segment block
        if ($1 == "}") {
          flag = 0
        }
      #  if (($4 == "==") && (ENVIRON["SEGMENT_NUMBER"] == $5)) {print $0}
      }
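The filtering done by temp.awk can also be sketched in Python, the language used for mergeSubmit.py. This is only an illustrative sketch, not code from the project: the function name `includes_for_segment` and the shortened sample paths are assumptions, and it expects the block layout shown on the "Example .tcl File" slide (a header line containing SEGMENT_NUMBER, then `include file` lines, then a closing brace).

```python
def includes_for_segment(tcl_text, segment):
    """Return the 'include file' lines inside the block for one segment.

    Hypothetical helper mirroring the flag logic of temp.awk.
    """
    selected = []
    in_segment = False
    for line in tcl_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Header line: if { $env(SEGMENT_NUMBER) == N } {
        # The segment number is the fifth whitespace-separated field.
        if "SEGMENT_NUMBER" in line and len(fields) >= 5:
            in_segment = fields[4] == str(segment)
        elif fields[0] == "}":
            # A closing brace ends the current block.
            in_segment = False
        elif in_segment and fields[0] == "include":
            selected.append(line.strip())
    return selected


# Shortened, made-up paths; real files use the full stager paths.
SAMPLE_TCL = """\
if { $env(SEGMENT_NUMBER) == 1 } {
    set DATASET xbck0y
    include file /data/xbck0y/reco.a
    include file /data/xbck0y/reco.b
  }
if { $env(SEGMENT_NUMBER) == 2 } {
    set DATASET xbhd0y
    include file /data/xbhd0y/reco.c
  }
"""

if __name__ == "__main__":
    for entry in includes_for_segment(SAMPLE_TCL, 2):
        print(entry)
```

Unlike the awk version, this sketch only accepts lines whose first field is `include`, so a comment that merely mentions the word is not picked up.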

  13. Example .tcl File

      if { $env(SEGMENT_NUMBER) == 1 } {
      #------------------------------------------------------
      #    OutputDir = "/export/data1/cdfmc/concatTest/Monte_Carlo_Test1/mergeLogs/hphysr_0y_01/tmp" ;
      #    total size:     27780
      #------------------------------------------------------
          set DATASET xbck0y
          include file /export/data1/cdfmc/concatTest/Monte_Carlo_Test1/xbck0y/reco.xy0339c5.0284bck0
          include file /export/data1/cdfmc/concatTest/Monte_Carlo_Test1/xbck0y/reco.xy0339c5.028ebck0
          include file /export/data1/cdfmc/concatTest/Monte_Carlo_Test1/xbck0y/reco.xy0339c5.0298bck0
          include file /export/data1/cdfmc/concatTest/Monte_Carlo_Test1/xbck0y/reco.xy0339c5.02a2bck0
          include file /export/data1/cdfmc/concatTest/Monte_Carlo_Test1/xbck0y/reco.xy0339c5.02acbck0
        }
      if { $env(SEGMENT_NUMBER) == 2 } {
      #------------------------------------------------------
      #    OutputDir = "/export/data1/cdfmc/concatTest/Monte_Carlo_Test1/mergeLogs/hphysr_0y_01/tmp" ;
      #    total size:     1380315
      #------------------------------------------------------
          set DATASET xbhd0y
          include file /export/data1/cdfmc/concatTest/Monte_Carlo_Test1/xbhd0y/reco.xy0339c5.027abhd0
          include file /export/data1/cdfmc/concatTest/Monte_Carlo_Test1/xbhd0y/reco.xy0339c5.0284bhd0
          include file /export/data1/cdfmc/concatTest/Monte_Carlo_Test1/xbhd0y/reco.xy0339c5.028ebhd0
          include file /export/data1/cdfmc/concatTest/Monte_Carlo_Test1/xbhd0y/reco.xy0339c5.0298bhd0
        }
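A file with this layout could be produced by a short Python routine. The sketch below is an assumption about how such a generator might look, not the actual mergeSubmit.py: the function name `build_tcl` and its input format are hypothetical, and the commented header (OutputDir, total size) is left out for brevity.

```python
def build_tcl(segments):
    """Build .tcl text with one SEGMENT_NUMBER block per dataset.

    segments: list of (dataset_name, [input_file_paths]) pairs.
    Segment numbers start at 1, as in the example file.
    Hypothetical helper, not the project's real generator.
    """
    lines = []
    for number, (dataset, paths) in enumerate(segments, start=1):
        lines.append("if { $env(SEGMENT_NUMBER) == %d } {" % number)
        lines.append("    set DATASET %s" % dataset)
        for path in paths:
            lines.append("    include file %s" % path)
        lines.append("  }")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up short paths for illustration only.
    print(build_tcl([
        ("xbck0y", ["/tmp/reco.a", "/tmp/reco.b"]),
        ("xbhd0y", ["/tmp/reco.c"]),
    ]))
```

Keeping one block per segment lets each CAF job select its own inputs through the SEGMENT_NUMBER environment variable alone.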

  14. Control & Monitoring • Tikiwiki software is used for web-based documentation • The tiki database keeps a history of all changes to the Farm Projects • Tiki pages enable users to: • Keep track of all existing projects • Start or stop a project • Change resource sharing between projects • Redirect output to another stager • Forward execution to CAF without having to connect to the main server • Python’s extensive support for XML, email, RSS feeds, and many other Internet protocols makes it effective for developing custom web solutions

  15. Monitoring – Web Page Interface • To ensure the CDF Production Farm runs smoothly, the hardware performance, including status reporting, must be monitored • This is done using the Production Farm Web Interface (PFWI) • PFWI parses, calculates, and displays all major characteristics of the farm, with results available online

  16. Tikiwiki Contributions • This page shows the disk space on the 32 partitions on the different servers • Edited Python script df_disk.py:

      # 'output' holds the df output and 'fp' the open page file (both set earlier in df_disk.py)
      p = output.find('/export/data4')
      # The 24 characters before the mount point hold the usage columns
      usage = output[p-24:p].strip()
      fp.write(""" fncdfsrv5 %20s /export/data4 """ % usage)
      fp.write("\n")
      # The use percentage sits just before the mount point
      percentage = output[p-5:p-2].strip()
      if int(percentage) > 90:
          IsFull = 1
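The fixed character offsets above depend on the exact column widths of the df output. A possible alternative, sketched here under the assumption that each df line ends with the use percentage followed by the mount point, splits the line into fields instead. The names `partition_usage` and `is_full` and the sample output are hypothetical; only the 90% threshold comes from the script.

```python
def partition_usage(df_output, mount_point):
    """Return the use percentage (as an int) for mount_point, or None."""
    for line in df_output.splitlines():
        fields = line.split()
        # Assumed layout: ... Use% Mounted-on, so the mount point is the
        # last field and the percentage is the one before it.
        if (len(fields) >= 2 and fields[-1] == mount_point
                and fields[-2].endswith("%")):
            return int(fields[-2].rstrip("%"))
    return None


def is_full(df_output, mount_point, threshold=90):
    """True when the partition is found and is above the threshold."""
    usage = partition_usage(df_output, mount_point)
    return usage is not None and usage > threshold


# Made-up df output for illustration.
SAMPLE_DF = """\
Filesystem     1K-blocks      Used Available Use% Mounted on
/dev/sda1      103212320  95123456   8088864  93% /export/data4
/dev/sdb1      103212320  40123456  63088864  39% /export/data5
"""

if __name__ == "__main__":
    for mount in ("/export/data4", "/export/data5"):
        print(mount, partition_usage(SAMPLE_DF, mount), is_full(SAMPLE_DF, mount))
```

Matching the mount point as a whole field also avoids confusing `/export/data4` with a longer name such as `/export/data40`.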

  17. Tiki Editor • Using the online tiki editor, modifications were made to improve the functionality of the ProjectConfiguration page

  18. Summary • In these weeks: • Implemented improvements to Production Farm monitoring • Participated in development of remote concatenation • Acquired new skills • Learned about the physics done at Fermilab

  19. Acknowledgements • SIST committee for giving me this opportunity • Elena Vataga & Pavel Murat • Ms. Engram & Dr. Elliott McCrory • Dr. Davenport & Jamieson Olsen

  20. BACKUP

  21. History of Production Farm • Fermilab has used clusters of processors to provide large computing power, with dedicated processors like the Motorola 68030 • CDF Run 2 data was processed using the first Farm Processing System (FPS), which used the FBSNG batch system (1) • The Farm Processing System was the software that managed, controlled, and monitored the CDF production farm from 1999 to 2005

  22. Monitoring • PARSING – this layer accesses MySQL or CAF output files; after processing text and performing calculations, it feeds the data to the cache layer • CACHE – this layer does statistical preprocessing and has an interface to easily visualize the data. The data is then stored. (1) • WEB – displays all the information collected by the parsers and gathers data that needs no preprocessing. Uses PHP4 to generate the web pages. • Python tiki
