Learn about the computing resources available for particle physics research, including connecting remotely, accessing storage, and recommended working strategies.
Oxford University Particle Physics Unix Overview
Pete Gronbech, Particle Physics Senior Systems Manager & GridPP Project Manager
Graduate Lectures
• Strategy
• Connecting (RDP) and accessing storage
• Local Cluster Overview
• Computer Rooms
• Grid Cluster
• Connecting with ssh
• Other resources
• How to get help
• More on the Grid & getting a Grid certificate
Particle Physics Strategy: The Server / Desktop Divide
[Diagram] Servers: virtual machine host, general purpose Unix server, Linux worker nodes, group DAQ systems, Linux file servers, web server, NIS server, torque server.
[Diagram] Clients: Win 7 PCs, Ubuntu PC, laptop, Linux desktop.
Distributed Model
• Files are not stored on the machine you are using, but on remote locations.
• You can run a different operating system by making a remote connection to it.
Recommended working strategy
• Computing work splits broadly into
    • Office work (local/remote)
    • Writing code (remote)
    • Computation (remote)
• Use your favourite desktop/laptop for office software.
• Make a remote desktop connection to pplxint8/9 to do computing work on the batch farm.
• It is better to write code on the remote Linux servers.
Physics Remote desktops
• We use RDP for remote desktop.
• This means that everyone in particle physics has access to multiple desktop environments from anywhere:
    • Windows (termserv.physics.ox.ac.uk)
    • Scientific Linux (pplxint8/9)
    • Ubuntu Linux (ubuntu-trusty-ts)
    • Mac OS X (osxts via vnc)
Physics storage
[Diagram] Storage systems: Windows server, central Linux file-server, PP file-server.
To store files on servers using your laptop
• Windows: map your H:\ drive by typing
    net use H: https://winfe.physics.ox.ac.uk/home/yourname /USER:yourname
    - Where yourname is your physics user name
• OSX: http://www2.physics.ox.ac.uk/it-services/connecting-to-physics-file-servers-from-os-x
• Linux: http://www2.physics.ox.ac.uk/it-services/access-windows-shares-from-linux
RDP and storage demo
• H:\ drive on Windows
• Connecting to Linux on pplxint8 from Windows
• /home and /data on Linux
Particle Physics Linux
• Unix Team (Room 661):
    • Vipul Davda – Grid and Local Support
    • Kashif Mohammad – Grid and Local Support
    • Pete Gronbech – Senior Systems Manager and GridPP Project Manager
• General purpose interactive Linux based systems for code development, short tests and access to Linux based office applications. These are accessed remotely.
• Batch queues are provided for longer and intensive jobs. Provisioned to meet peak demand and give a fast turnaround for final analysis.
• Our local systems run Scientific Linux (SL), which is a free Red Hat Enterprise Linux based distribution.
    • The same as the Grid and CERN
    • We will be able to offer you the most help running your code on SL6.
    • Will move to CentOS7 to match the Grid and collaborators as required.
Current Clusters
• Particle Physics Local Batch cluster
• Oxford’s Tier 2 Grid cluster
PP Linux Batch Farm (Scientific Linux 6)
• Users log in to the interactive nodes pplxint8 & 9. The home directories and all the data disks (/home area or /data/group) are shared across the cluster and visible on the interactive machines and all the batch system worker nodes.
• Approximately 600 cores (incl. 128 cores for JAI/LWFA), each with 4GB of RAM.
• The /home area is where you should keep your important text files such as source code, papers and thesis.
• The /data/ area is where you should put your big reproducible input and output data.
[Diagram] Nodes:
• Interactive login nodes: pplxint8, pplxint9
• Grid data transfer nodes: pplxdatatrans
• Worker nodes, 64 * AMD cores each: jailxwn01, jailxwn02
• Worker nodes, 16 * Intel cores each: pplxwn67, pplxwn68
• Worker nodes, 16 * Intel 2650 cores each: pplxwn41, pplxwnnn
• Worker nodes, 12 * Intel 5650 cores each: pplxwn31, pplxwn32
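A minimal sketch of preparing a batch job for the farm, assuming the torque server mentioned earlier accepts standard `qsub` submissions from the interactive nodes. The helper name `make_job_script`, the job name, the one-core resource request and the example command are illustrative assumptions, not site defaults; check the help pages for the real queue settings.

```shell
# Write a small torque/PBS job script to the given path.
# Hypothetical helper: the resource request below is an assumption.
make_job_script() {
    script_path="$1"    # where to write the job script
    job_name="$2"       # name shown in the queue listing
    command_line="$3"   # the command the worker node should run
    cat > "$script_path" <<EOF
#!/bin/bash
#PBS -N $job_name
#PBS -l nodes=1:ppn=1
#PBS -j oe
cd "\$PBS_O_WORKDIR"
$command_line
EOF
}

# Example: create the script only; submission is left commented out.
make_job_script "${TMPDIR:-/tmp}/myjob.sh" test_job "echo running on \$(hostname)"
# qsub "${TMPDIR:-/tmp}/myjob.sh"   # submit from pplxint8 or pplxint9
# qstat -u "$USER"                  # list your queued and running jobs
```

Because /home and /data are visible on every worker node, the job can read and write the same paths you use interactively.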
PP Linux Batch Farm - Development (Scientific Linux 7)
• Test interactive CentOS7 machine with new Condor batch system. A couple of test worker nodes are attached.
• If your code requires OS7 (i.e. Red Hat Enterprise Linux 7, Scientific Linux 7 or CentOS 7 ...), please let us know how you get on testing on this machine.
[Diagram] Nodes:
• Interactive login node: pplxint10
• Test worker nodes, 16 * Intel 2650 cores each: pplxwnnn, pplxwn
PP Linux Batch Farm Data Storage
NFS, Lustre & Gluster storage servers
• The /data areas are big and fast disks. They are too big to be backed up, but they still have some redundancy features (dual PSU + RAID 6) and are safer than laptop storage. This does not help if you delete files.
• The /home areas are backed up by two different systems nightly. The latest nightly backup of any lost or deleted files from your home directory is available at the read-only location /data/homebackup/{username}
• If you need older files, tell us.
• If you need more space on /home, tell us.
• Store your thesis on /home NOT /data.
[Diagram] Data areas and home areas on pplxfsn servers, with capacities of 40TB, 40TB, 30TB and 19TB.
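Recovering a deleted file from the nightly backup can be sketched as below. This assumes the backup under /data/homebackup/{username} mirrors the layout of your home directory, which the slides do not confirm; `restore_from_backup` is a hypothetical helper, not a site-provided tool.

```shell
# Copy one file back from the nightly backup into the home directory.
# Assumes the backup mirrors the home directory layout (not confirmed).
restore_from_backup() {
    relative_path="$1"                           # e.g. thesis/chapter1.tex
    backup_root="${2:-/data/homebackup/$USER}"   # read-only backup area
    home_root="${3:-$HOME}"                      # where to restore to

    if [ ! -e "$backup_root/$relative_path" ]; then
        echo "not in the latest backup: $relative_path" >&2
        return 1
    fi
    if [ -e "$home_root/$relative_path" ]; then
        echo "refusing to overwrite: $home_root/$relative_path" >&2
        return 1
    fi
    # Recreate the parent directory, then copy and keep the timestamps
    mkdir -p "$(dirname "$home_root/$relative_path")"
    cp -p "$backup_root/$relative_path" "$home_root/$relative_path"
}
```

For example, `restore_from_backup thesis/chapter1.tex` would bring back last night's copy of that file. Anything older than the latest nightly backup still needs a request to the Unix team.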
Particle Physics Computing
[Diagram] Lustre MDS; Lustre OSS01, OSS02, OSS03 and OSS04 (labelled 18TB, 44TB, 44TB, 18TB); SL6 client nodes.
• Distributed file systems are used to group multiple file servers together to provide extremely large continuous file spaces.
• This type of filesystem is used for the Atlas and LHCb groups.
• We started with Lustre but are now migrating to Gluster. Atlas completed, LHCb in progress.
    df -h /data/atlas
    Filesystem              Size  Used  Avail  Use%  Mounted on
    /gluster/atlasgl/atlas  364T  327T  38T    90%   /data/atlas
    df -h /data/lhcb
    Filesystem              Size  Used  Avail  Use%  Mounted on
    /lustre/lhcb25          198T  164T  25T    88%   /data/lhcb
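The df listings on this slide show both group areas close to full. A small sketch for checking an area before writing large outputs follows; the function names and the 90% default threshold are arbitrary illustrative choices.

```shell
# Extract the Use% figure (as a bare number) from df output on stdin.
usage_percent() {
    awk 'NR > 1 { gsub("%", "", $5); print $5; exit }'
}

# Print a warning when a data area is at or above a threshold.
# The default of 90 is an arbitrary choice for illustration.
check_area() {
    area="$1"
    limit="${2:-90}"
    # df -P keeps each filesystem on one line, so field 5 is the usage
    used="$(df -P "$area" | usage_percent)"
    if [ "$used" -ge "$limit" ]; then
        echo "$area is ${used}% full"
    fi
}
```

On the interactive nodes, `check_area /data/atlas` would print a warning for the figures shown on the slide.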
Local Oxford DWB Physics Infrastructure Computer Room
• A computer room with 100 kW cooling and >200 kW power has been built on Level 1 of DWB.
• Local Physics department infrastructure computer room, completed September 2007.
• This allowed local computer rooms to be refurbished as offices again and racks that were in unsuitable locations to be rehoused.
Begbroke Computer Room
• The computer room built at Begbroke Science Park jointly for the Oxford Super Computer (Advanced Research Computing) and the Physics department provides space for 55 (11 kW) computer racks, 22 of which will be for Physics. Up to a third of these can be used for the Tier 2 centre.
• This £1.5M project was funded by SRIF. The room was ready in December 2007.
• The Oxford Tier 2 Grid cluster was moved there during spring 2008.
• All new Physics High Performance Clusters will be installed here.
Recent Grid CPU Upgrades
• Lenovo NeXtScale
    • March 2016 - 25 nodes, each with dual E5-2640 v3 & 64GB RAM: 800 new cores
    • Feb 2017 - Extra 11 nodes, E5-2620 v4: 352 cores
• Oxford Tier-2 total ~2700 cores
Can be part of a much bigger cluster
[Plots] Grid Cluster – High Utilisation
[Plots] Local Cluster – Bursty Utilisation
Strong Passwords etc
• Use a strong password not open to dictionary attack!
    • fred123 – No good
    • Uaspnotda!09 – Much better
• More convenient to use ssh with a passphrased key stored on your desktop.
    • Once set up
Connecting with PuTTY to Linux
Demo
• Plain ssh terminal connection
    • From ‘outside of physics’
    • From Office (no password)
• ssh with X windows tunnelled to passive exceed. Single apps.
• Password-less access from ‘outside physics’
http://www2.physics.ox.ac.uk/it-services/ppunix/ppunix-cluster
http://www.howtoforge.com/ssh_key_based_logins_putty
Puttygen to create an ssh key on Windows (previous slide point #3)
• Enter a strong passphrase, then save the private part of the key to a subdirectory of your local drive.
• Paste the public part of the key into ~/.ssh/authorized_keys on pplxint.
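On the Linux side, pasting the key amounts to appending one line to ~/.ssh/authorized_keys and keeping the permissions tight. A minimal sketch follows; `install_public_key` is a hypothetical helper, and the key text in the usage line is a placeholder for the single-line OpenSSH public key that Puttygen displays.

```shell
# Append a public key to an authorized_keys file with safe permissions.
# Hypothetical helper; run it on pplxint with your own public key line.
install_public_key() {
    public_key_line="$1"         # one line: key type, key data, comment
    ssh_dir="${2:-$HOME/.ssh}"   # directory holding authorized_keys

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    # Skip the append if this exact line is already present
    if ! grep -qxF -- "$public_key_line" "$ssh_dir/authorized_keys"; then
        printf '%s\n' "$public_key_line" >> "$ssh_dir/authorized_keys"
    fi
    chmod 600 "$ssh_dir/authorized_keys"
}
```

Usage would look like `install_public_key "ssh-rsa AAAA...placeholder... me@laptop"`. The ssh daemon typically ignores an authorized_keys file that other users can write to, which is why the modes are set explicitly.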
Pageant
• Run Pageant once after login.
• Right-click on the Pageant symbol and “Add key” for your private (Windows ssh) key.
Other resources (for free)
• Oxford Advanced Research Computing
    • A shared cluster of CPU nodes, “just” like the local cluster here
    • GPU nodes
        • Faster for ‘fitting’, toy studies and MC generation
        • *IFF* code is written in a way that supports them
    • Moderate disk space allowance per experiment (<5TB)
    • http://www.arc.ox.ac.uk/content/getting-started
• The Grid
    • Massive globally connected computer farm
    • For big computing projects
    • Atlas, LHCb, t2k and SNO already make use of the Grid.
    • Come talk to us in Room 661
The end of the overview
• Now more details of use of the clusters
• Help Pages
    • http://www.physics.ox.ac.uk/it-services/unix-systems
    • http://www2.physics.ox.ac.uk/research/particle-physics/particle-physics-computer-support
• ARC
    • http://www.arc.ox.ac.uk/content/getting-started
• Email
    • pp_unix_admin@physics.ox.ac.uk
• GRID talk at the end
GRID certificates
Atlas, SNO, LHCb and t2k in particular
SouthGrid Member Institutions
• Oxford
• RAL PPD
• Cambridge
• Birmingham
• Bristol
• Sussex
Current capacity
• Compute Servers
    • Twin and twin squared nodes
    • 2700 CPU cores
• Storage
    • Total of ~1000TB
    • The servers have between 12 and 36 disks; the more recent ones are 4TB capacity each. These use hardware RAID and UPS to provide resilience.
Get a Grid Certificate
• Must remember to use the same PC to request and retrieve the Grid Certificate.
• The new UKCA page: http://www.ngs.ac.uk/ukca
• You will then need to contact central Oxford IT. They will need to see you, with your university card, to approve your request:
    To: help@it.ox.ac.uk
    Dear Stuart Robeson and Jackie Hewitt,
    Please let me know a good time to come over to the Banbury Road IT office for you to approve my grid certificate request.
    Thanks.
When you have your grid certificate…
• Save to a filename in your home directory on the Linux systems, e.g.:
    Y:\Linuxusers\particle\home\{username}\mycert.p12
• Log in to pplxint9 and run:
    mkdir .globus
    chmod 700 .globus
    cd .globus
    openssl pkcs12 -in ../mycert.p12 -clcerts -nokeys -out usercert.pem
    openssl pkcs12 -in ../mycert.p12 -nocerts -out userkey.pem
    chmod 400 userkey.pem
    chmod 444 usercert.pem
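After running the commands on this slide, a quick check that the files ended up with the intended modes can save a confusing failure later. The sketch below uses two hypothetical helpers; `openssl x509` with `-noout -subject -dates` prints the certificate's subject and validity period.

```shell
# Succeed only if the key and certificate have the modes set by the
# chmod commands on the slide (400 for the key, 444 for the certificate).
check_globus_modes() {
    globus_dir="${1:-$HOME/.globus}"
    key_mode="$(ls -l "$globus_dir/userkey.pem" | cut -c1-10)"
    cert_mode="$(ls -l "$globus_dir/usercert.pem" | cut -c1-10)"
    [ "$key_mode" = "-r--------" ] && [ "$cert_mode" = "-r--r--r--" ]
}

# Print the subject and validity dates of the extracted certificate.
show_certificate() {
    openssl x509 -in "${1:-$HOME/.globus}/usercert.pem" -noout -subject -dates
}
```

Running `check_globus_modes && show_certificate` on pplxint9 should print your identity and the expiry date if everything is in place.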
Now Join a VO
• This is the Virtual Organisation such as “Atlas”, so:
    • You are allowed to submit jobs using the infrastructure of the experiment
    • Access data for the experiment
• Speak to your colleagues on the experiment about this. It is a different process for every experiment!
Joining a VO
Your grid certificate identifies you to the grid as an individual user, but it's not enough on its own to allow you to run jobs; you also need to join a Virtual Organisation (VO).
These are essentially just user groups, typically one per experiment, and individual grid sites can choose to support (or not) work by users of a particular VO.
Most sites support the four LHC VOs; fewer support the smaller experiments.
The sign-up procedures vary from VO to VO: UK ones typically require a manual approval step, and LHC ones require an active CERN account.
For anyone who is interested in using the grid but is not working on an experiment with an existing VO, we have a local VO we can use to get you started.
When that’s done
• Test your grid certificate:
    > voms-proxy-init --voms lhcb.cern.ch
    Enter GRID pass phrase:
    Your identity: /C=UK/O=eScience/OU=Oxford/L=OeSC/CN=j bloggs
    Creating temporary proxy ..................................... Done
• Consult the documentation provided by your experiment for ‘their’ way to submit and manage grid jobs.
Questions?