Blue Gene Bring Up
Linux on Service Node • SuSE SLES 10 • A RAID array is recommended, typically either RAID1 or RAID5 depending on the number of disks available. • Either 1 or 2 volume groups depending on the disk configuration (rootvg and datavg).
Linux on Service Node • Partitions • / - 1 GB - rootlv • /usr - 3 GB - usrlv • /var - 2 GB - varlv • /opt - 10 GB - optlv • /tmp - 10 GB - tmplv • swap - 4 GB - swaplv • /dbhome - 20 GB - dbhomelv • /bgsys - 10 GB - bgsyslv
Linux on Service Node • RPMs • cpp, gcc, libgcc, gcc-c++, gcc-64bit, glibc-devel, libgcc-64bit, bison, texinfo, flex, termcap, termcap-64bit, gcc-fortran, gmp, gmp-64bit, gmp-devel, gmp-devel-64bit, ncurses-devel, ncurses-devel-64bit • vacpp.rte-8.0.1-2.ppc64.rpm xlsmp.rte-1.6.1-3.ppc64.rpm xlsmp.msg.rte-1.6.1-3.ppc64.rpm • bgp_os, bgp_base, bgptoolchain • Interfaces • Functional network • Service network • Public network
Linux on Service Node • Groups • db2rasdb • db2iadm1 • db2fadm1 • db2asgrp • bgpuser • bgpdeveloper • bgpadmin • bgpservice • Users • bgpsysdb • bgpdb2c • bgpadmin • mpirun • NFS • I/O nodes mount /bgsys to finish their boot process, so /bgsys is exported on the functional network via NFS
Front End Node • Groups • bgpadmin • bgpservice • bgpdeveloper • bgpuser • Users • mpirun • Profile • /etc/profile.d/bgp.sh
DB2 - Why use a Database? • Need a software representation of the hardware • A machine of such large scale requires a persistent means of storing errors (RAS events), job history, block definitions, environmental readings, etc. • Operational state of the machine can be obtained without touching the hardware
Other Benefits of a Database • Setting values in the database can trigger actions in other components • Can simplify the design by having policy stored in the database itself via procedures, triggers, and constraints instead of the code • Information can be obtained using existing tools or SQL
DB2 • Product Description • Restricted license • Enterprise Server Edition (ESE) • Client • Database Location • /dbhome/bgpsysdb • Instances • bgpsysdb (server) • bgpdb2c (client)
DB2 concepts • Schema: the collection of database objects such as tables, views, indexes, and triggers that define the database. • Table: a named data object that consists of a specific number of columns and some unordered rows. • View: a logical table that consists of data that a query generates.
DB2 Naming Guidelines for BG/P • Table names always start with TBGP, such as TBGPNodeCard or TBGPLinkCard • Names are NOT case sensitive in SQL • For each table, there is a view that exposes the more user-friendly columns, such as location, and omits VPD • These views are named without the T, such as BGPNodeCard • Where some information is omitted from the view, there is also an extra view for diags, such as BGPNodeCardAll • If the view needs no derived or omitted columns, an alias is created instead • e.g. BGPClockCard • The net effect is that, almost all the time, using the “BGP” name will show you what you want • If a history is kept, _history is added to the end of the name
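The naming rules above are mechanical enough to sketch in a few lines. The helper below is hypothetical (`related_names` is not part of any Blue Gene tooling); it only derives the candidate names that the convention implies. Whether a given view, diag view, or history table actually exists depends on the table, so treat the result as names to look for, not a guarantee.

```python
def related_names(table: str) -> dict:
    """Derive the names the BG/P convention implies for a TBGP table.

    Hypothetical helper for illustration; it does not query the database.
    """
    # SQL names are not case sensitive, so check the prefix case-insensitively.
    if not table.upper().startswith("TBGP"):
        raise ValueError(f"{table!r} does not follow the TBGP prefix convention")
    view = table[1:]  # drop the leading T: TBGPNodeCard -> BGPNodeCard
    return {
        "table": table,
        "view": view,                    # user-friendly view (or alias)
        "diag_view": view + "All",       # only present if the view omits columns
        "history": table + "_history",   # only present if a history is kept
    }


names = related_names("TBGPNodeCard")
print(names["view"], names["diag_view"], names["history"])
```

For TBGPNodeCard, all three derived names appear in the table and view lists in this deck.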
BG/P Tables • TBGPBlock • TBGPBPBlockMap • TBGPSmallBlock • TBGPLinkBlockMap • TBGPProductType • TBGPMachine • TBGPMachineSubnet • TBGPMidplane • TBGPNodeCard • TBGPNode • TBGPServiceCard • TBGPLinkCard • TBGPClockCard • TBGPBulkPowerSupply • TBGPSwitch • TBGPCable • TBGPClockCable • TBGPLinkChip • TBGPICON • TBGPFanModule • TBGPJob • TBGPEthGateway • TBGPEGWMachineMap • TBGPPortBlockMap • TBGPBlockUsers • TBGPMidplaneSubnet • TBGPNodeSubnet • TBGPServiceAction • TBGPUserPrefs • TBGPReplacement_history • TBGPMachine_history • TBGPMidplane_history • TBGPNodeCard_history • TBGPNode_history • TBGPServiceCard_history • TBGPLinkCard_history • TBGPClockCard_history • TBGPLinkChip_history • TBGPIcon_history • TBGPFanModule_history • TBGPJob_history • TBGPServiceCardEnvironment • TBGPFanEnvironment • TBGPClockCardEnvironment • TBGPBULKPOWEREnvironment • TBGPNodeCardPOWEREnvironment • TBGPLinkCardPOWEREnvironment • TBGPSrvcCardPOWEREnvironment • TBGPLinkChipEnvironment • TBGPLinkCardEnvironment • TBGPNodeEnvironment • TBGPNodeCardEnvironment • TBGPEventLog • TBGPERRCodes • TBGPDiagRuns • TBGPDiagBlocks • TBGPDiagResults • TBGPDiagTests
BG/P Views • BGPMidplane • BGPMidplaneAll • BGPNodeCard • BGPNodeCardAll • BGPNode • BGPNodeAll • BGPServiceCard • BGPServiceCardAll • BGPLinkCard • BGPLinkCardAll • BGPClockCardAll • BGPBulkPowerSupplyAll • BGPLinkChip • BGPLinkChipAll • BGPFanModule • BGPFanModuleAll • BGPLink • BGPClockCardEnvironment • BGPDiagTests • BGPNodeCardCount • BGPLinkCardCount • BGPServiceCardCount • BGPNodeCount • BGPBasePartition • BGPBPBlockStatus • BGPSwitchLinks • BGPLinkBlockStatus • BGPSwitchPort • BGPPortBlockStatus • BGPBlockSize
Database setup • Database Populate: a Perl script that populates the database with the expected configuration for the Blue Gene system. • InstallServiceAction: verifies that the predefined structure matches the actual configuration. • VerifyCables: confirms that the torus network cabling is correct. • VerifyIpAddresses: confirms that the IO card IP addresses are correct.
DB2/SQL examples • List all tables/views: list tables • Describe table/view: describe table TBGPmidplane • Extracting data: select * from TBGPmidplane • More complex: select a.position, count(isionode), a.status, a.seqid from tbgpnodecard a left outer join bgpnode b on b.midplanepos = a.midplanepos and b.nodecardpos = a.position and b.isionode = 'T' and b.status <> 'M' where a.midplanepos = 'R00-M0' group by a.position, a.status, a.seqid order by 1
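To see what the more complex query returns, it can be tried against a small stand-in database. The sketch below uses an in-memory SQLite database rather than DB2, models bgpnode as a plain table rather than a view, and guesses the column types. All rows are invented for illustration. Only the table names, column names, and the query itself come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not DB2
conn.executescript("""
    -- Column types are assumptions; only the names come from the query.
    CREATE TABLE tbgpnodecard (midplanepos TEXT, position TEXT, status TEXT, seqid INTEGER);
    CREATE TABLE bgpnode (midplanepos TEXT, nodecardpos TEXT, isionode TEXT, status TEXT);

    -- Invented sample rows, for illustration only.
    INSERT INTO tbgpnodecard VALUES
        ('R00-M0', 'N00', 'A', 1),
        ('R00-M0', 'N01', 'A', 2),
        ('R00-M1', 'N00', 'A', 3);
    INSERT INTO bgpnode VALUES
        ('R00-M0', 'N00', 'T', 'A'),
        ('R00-M0', 'N00', 'T', 'A'),
        ('R00-M0', 'N00', 'F', 'A'),
        ('R00-M0', 'N01', 'T', 'M'),
        ('R00-M0', 'N01', 'F', 'A');
""")

QUERY = """
    SELECT a.position, COUNT(b.isionode), a.status, a.seqid
    FROM tbgpnodecard a
    LEFT OUTER JOIN bgpnode b
      ON b.midplanepos = a.midplanepos
     AND b.nodecardpos = a.position
     AND b.isionode = 'T'
     AND b.status <> 'M'
    WHERE a.midplanepos = 'R00-M0'
    GROUP BY a.position, a.status, a.seqid
    ORDER BY 1
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Because the I/O-node and status filters sit in the join condition, not in the WHERE clause, a node card with no matching I/O nodes still appears in the result with a count of 0. In the sample data that is node card N01, whose only I/O node has status 'M'.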
Exercise • Log on to the service node as bgpadmin • db2 connect to bgdb0 user bgpsysdb • List tables in the database • List the serial numbers of the nodecards • List only the compute cards
BGP RPMs • RPMs • bgp_os • bgpbase • bgptoolchain • Directory tree • /bgsys • /bgsys/drivers/ppcfloor - symbolic link to current driver software • /bgsys/drivers/ppcfloor/bin - binaries • /bgsys/drivers/ppcfloor/bareMetal - service action scripts
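Since ppcfloor is a symbolic link, it is sometimes useful to check which driver directory it currently resolves to. A minimal sketch, assuming only the /bgsys/drivers layout listed above; the function name `current_driver` is hypothetical.

```python
import os


def current_driver(drivers_dir: str = "/bgsys/drivers") -> str:
    """Return the real directory that the ppcfloor symlink points to."""
    floor = os.path.join(drivers_dir, "ppcfloor")
    if not os.path.islink(floor):
        raise FileNotFoundError(f"{floor} is not a symbolic link")
    # realpath follows the link (and any nested links) to the final target.
    return os.path.realpath(floor)
```

On a service node, `print(current_driver())` would show the directory of the driver level currently in use.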
Site Specific Configuration • Templates are located in /bgsys/local/etc • rc scripts • UIDs and GIDs • profiles • /etc/profile.d/bgp.sh
Shutdown • Run a service action on the clock cards in each rack, in this order: tertiary, secondary, primary • ‘bgpmaster stop’ • Stop db2 • Power down rack(s) • Shut down FEN • Shut down service node
Startup • Service node • Front end node • Power up racks • ‘bgpmaster start’ • End service actions on clock cards (primary, secondary, tertiary) • Verify all hardware is seen
Unexpected Power Outage • Power off all systems • Power up and boot service node • Power up and boot FEN • Power up rack(s) • ‘bgpmaster start’ • Run install service action
Exercise • Shut down and start up the system • Verify all is well