Maintaining Large Vista Installations • Amy Edwards, Ezra Freelove, & George Hernandez • July 12, 2007
Agenda • Comparisons • Who is USG • Automation • Monitoring • Maintenance • More Tricks • Questions?
Informal Poll – Number of Nodes • All prod clusters, now: • 1-10 • 11-20 • 21-50 • 50-70 • 70+ • All prod clusters, by December: • 1-10 • 11-20 • 21-50 • 50-70 • 70+ • Ours in bold
Informal Poll – Number of DB Instances Including secondary and non-production • 1-2 • 3-6 • 7-10 • 10+ • Ours in bold
GeorgiaVIEW Project • University System of Georgia (USG) • Vista 3.0.7 • Hosts 32 institutions & multiple consortial programs • >150,000 active students • “Active” means 100+ actions • >11,000 active sections / term
Issues • Handling performance issues • Capacity planning • Upgrades • Replication • JMS sensitivity • Integration
Automation • Rolling Restarts • Managed nodes restarted weekly • except JMS • Log cleanup to preserve space • Error reporting • application, tracking, vulnerabilities • Thread dumps • Sync admin node with backup • LDIS batch integration
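The rolling-restart idea above can be sketched as follows. This is a minimal illustration, not the USG scripts: the `restart` and `is_healthy` callables, the node names, and the polling parameters are all hypothetical placeholders for whatever mechanism actually stops, starts, and probes a managed node.

```python
import time
from typing import Callable, Iterable, List


def rolling_restart(
    nodes: Iterable[str],
    restart: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    skip: Iterable[str] = (),
    poll_seconds: float = 5.0,
    max_polls: int = 60,
) -> List[str]:
    """Restart nodes one at a time, waiting for each to come back.

    Returns the nodes that were restarted, in order. Raises if a node
    does not become healthy, so the remaining nodes are left untouched.
    """
    skipped = set(skip)
    restarted = []
    for node in nodes:
        if node in skipped:
            continue  # e.g. the JMS node, which is excluded from restarts
        restart(node)  # hypothetical hook: stop and start the managed node
        for _ in range(max_polls):
            if is_healthy(node):  # hypothetical hook: e.g. a login probe
                break
            time.sleep(poll_seconds)
        else:
            raise RuntimeError(f"{node} did not come back after restart")
        restarted.append(node)
    return restarted
```

Restarting one node at a time and stopping on the first failure keeps the rest of the cluster serving users while a bad node is investigated.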
Monitoring • Nagios • http://www.nagios.org/ • Sends alerts • Stats • Custom AJAX web app • Watch changes over time • AWStats • http://www.awstats.org/
Nagios Monitors • OS / Hardware • Load • Temperature • Free space • Database • Tablespace free space • Listener • Oracle processes • Application • Direct-login • Weblogic processes • Java MBeans • Default/Primary Pending Requests Current Count • Java Heap Current • JDBC Waiting for Connection Current Count • Multicast Messages Lost • Primary count
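A free-space monitor of the kind listed above could look roughly like this. It is a sketch under assumptions: the thresholds and message text are made up for illustration, and the only Nagios convention relied on is the plugin exit status (0 = OK, 1 = WARNING, 2 = CRITICAL).

```python
import shutil
import sys

# Nagios plugin exit codes
OK, WARNING, CRITICAL = 0, 1, 2


def classify_free_space(free_pct, warn_pct=20.0, crit_pct=10.0):
    """Map a free-space percentage to a (status code, message) pair.

    The default thresholds are illustrative assumptions.
    """
    if free_pct < crit_pct:
        return CRITICAL, f"DISK CRITICAL - {free_pct:.1f}% free"
    if free_pct < warn_pct:
        return WARNING, f"DISK WARNING - {free_pct:.1f}% free"
    return OK, f"DISK OK - {free_pct:.1f}% free"


def check_path(path, warn_pct=20.0, crit_pct=10.0):
    """Measure free space on the filesystem holding `path` and classify it."""
    usage = shutil.disk_usage(path)
    free_pct = 100.0 * usage.free / usage.total
    return classify_free_space(free_pct, warn_pct, crit_pct)


def main(path="/"):
    """Plugin entry point: print one status line and exit with the code."""
    code, message = check_path(path)
    print(message)
    sys.exit(code)
```

Nagios reads the exit status to decide whether to alert and shows the printed line as the status text, so the same pattern would extend to the other monitors by swapping out the measurement.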
Stats • Short and long term analysis • 21 months of data • Graphs all Nagios data collected • Flexible creation of reports • Built with AJAX
AWStats • Records data from web server logs • Custom script grabs data from webserver.log files • Runs daily
Specialized Nodes • Admin • JMS • Institutional Admin • Integration • Chat
JMS Node • Provides special services • Mail, LC creation, chat • Failure or migration of JMS node hinders usage • Services do not migrate well • Allow targeted migration • OTHERS: Pin JMS to a specific node
Integration • Batched LDIS data files • Cron runs nightly • Files broken up by: • type • “reasonable” number of records • Done on Inst node • Issues with import can kill node
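Breaking the batch files up by type and by a “reasonable” number of records might be sketched like this. Records are treated as opaque strings; the batch naming scheme and the record limit are hypothetical, since the slides do not give the actual LDIS file layout or sizes.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple


def chunk_records(records: Iterable[str], max_records: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `max_records` records."""
    if max_records < 1:
        raise ValueError("max_records must be at least 1")
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, max_records))
        if not chunk:
            return
        yield chunk


def plan_batches(
    records_by_type: Dict[str, Iterable[str]], max_records: int
) -> List[Tuple[str, List[str]]]:
    """Split each record type into numbered batches.

    Returns (batch_name, records) pairs; the "<type>_<NNN>" naming is
    an assumption made for this sketch.
    """
    batches = []
    for record_type, records in records_by_type.items():
        for index, chunk in enumerate(chunk_records(records, max_records), start=1):
            batches.append((f"{record_type}_{index:03d}", chunk))
    return batches
```

Keeping each batch small limits how much work is lost, and how hard the node is hit, when a single import goes wrong.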
Touching Nodes • ssh & dsh • Touch groups of nodes at once • Useful for: • Installs • Gathering logs • Locating a session
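Touching a group of nodes at once can be approximated with plain `ssh` in a thread pool, as in the sketch below. The host names, the command, and the `runner` hook are hypothetical; `dsh` already provides this behavior, so this only illustrates the pattern.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_command(host, command):
    """Build the argv for running `command` on `host` without prompting."""
    return ["ssh", "-o", "BatchMode=yes", host, command]


def run_on_group(hosts, command, runner=None, max_workers=8):
    """Run `command` on every host in parallel.

    Returns {host: (return code, stdout)}. `runner` is an injectable
    hook (an assumption of this sketch) so the fan-out can be exercised
    without real SSH access.
    """
    def default_runner(argv):
        result = subprocess.run(argv, capture_output=True, text=True, timeout=60)
        return result.returncode, result.stdout

    run = runner or default_runner
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda host: run(ssh_command(host, command)), hosts)
        return dict(zip(hosts, results))
```

Locating a session, for instance, would amount to calling `run_on_group` with a `grep` for the session ID over each node's logs and keeping the hosts whose return code is 0.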
Maintenance Page • Hosted on opposite f5 • Two versions • Scheduled maintenance • Unscheduled outage • In an f5 outage, move DNS to other f5 so message still appears
Installs and Upgrades • Silent install scripts • Test in both development environments • Create against a small database • Measure time to complete against a full-size copy of production • Install to production
Powerlinks and Custom Development • Test in development • Try to break • Pilot in production • Release to all
Want More? • To view my resources and references for this presentation, visit www.scholar.com • Simply click “Advanced Search” and search by ezrafreelove and tag: ‘bbworld07’
Contact Information • Ezra Freelove ezra.freelove@usg.edu • Amy Edwards amy.edwards@usg.edu • George Hernandez george.hernandez@usg.edu