Take Your Oracle WebLogic Applications to the Next Level with Oracle Enterprise Manager 12c
Mojahedul Hoque Abul Hasanat, CTO, Therap Services
Neelima Bawa, Consulting Tech. Lead, SCP, EM, Oracle

Therap Services - OOW 2013
Agenda
• Background of Therap Services
• The problem
• Application Performance Management
• Quick description of Oracle’s APM offering
• How OEM and RUEI helped us
• Actual scenarios
• The future
• Tips
• Q&A

Therap Services, LLC
• Documentation and Communication Software for MR/DD
• The closest description of us is an EHR for the DD industry
• Niche segment in the health sector
• Improve quality of life for people with DD by improving efficiency of delivery through communication
• SaaS business model
• 150K+ active users
• 1000+ providers in 48 states
• State customers
  • Extensive usage for DD in DHS ND and DHHS NE
• 150+ employees
• Based in CT, dev center in Bangladesh
• http://www.therapservices.net

The Application
• The application is our business
• 1M+ lines of code
• 60+ modules
• 1M+ sustained HTTP requests/hour
• 30K+ peak requests/minute
• 6000+ concurrent users
• Based on JEE and the Spring Framework
• Hibernate
• Seam
• Grails

Delivery Platform
• 2 identical sites in two states
• Primary hosts (per site):
  • 4 WebLogic application servers in a cluster
  • 1 memory-based data server (in-house, Java)
  • 1 Oracle database server
  • 1 NetApp storage (SAN)
  • 1 F5 load balancer
• Supporting hosts
• Use Dyn for site high availability
• Data replication with Oracle GoldenGate

What Matters
• Availability
  • Application is used 24x7
  • Application use is critical to the business of our customers
• Performance
  • A user needs to spend as little time as possible in our application
  • Most users use it daily, multiple times
• Data integrity
• Fast development turnaround

Evolution of Therap
• Improved testing
• Formal code review
• Improved processes
• Removed repeating problems
• Now all problems we face are new
• Acquired large customers
• Availability and performance have become critical factors

Before OEM & RUEI
• Heavy use of logging
• Nagios
• Cacti
• kill -3

The Problem
• Diagnosis of performance issues
  • Has become much harder with the growing system
• Application availability
  • With a larger customer base, uptime has become a major factor
  • Complexity of the system increases difficulty
• Limits of logging
  • Works for known unknowns
  • Need infrastructure to visualize and store historic data
• Limits of OS-based monitoring
  • Limited metrics
• Limits of simple JMX monitoring

Application Performance Management
• Deep insight into running application
• Profiling at runtime
  • Some bottlenecks are only visible at runtime
• Historic data
  • Invaluable for preventing performance regression

Which Vendor?
• We were moving from JBoss to WebLogic
• JVM Diagnostics
• Extensive WebLogic metrics
• Probably the best database diagnostics
• Our team was already familiar with OEM
• Deep integration with the database
• Integration of JVMD and app server metrics
• Expect better support from Oracle

Oracle’s APM
• Oracle Enterprise Manager 12c
  • WebLogic Metrics
  • Middleware Diagnostics Advisor
  • JVM Diagnostics
  • Configuration Management
  • Incident Management
  • Lifecycle Management
• Oracle Real User Experience Insight
• Oracle Business Transaction Management

WebLogic Metrics
• Proactive monitoring
• Helps us in avoiding downtime
• Correlation between various metrics
• Middleware Diagnostics Advisor

JVM Diagnostics
• Deep insight into the JVM
• Invaluable for understanding application performance issues
• Helped us in identifying the log4j bottleneck
• Early identification of performance problems

Oracle Real User Experience Insight
• Measure performance seen from the customer end
• Detect performance regression
• Enables shorter release cycles
• Quick and real feedback for performance tuning operations

Our Journey

Timeline
• Evaluation of various vendors – July 2012
• Purchase of WebLogic, OEM, RUEI – Nov 2012
• Start JBoss to WebLogic migration
• Start building expertise on OEM
• Start using OEM in test environment
• Fix problems found through OEM, JVMD
• Production deployment – Mar 2013

How OEM and RUEI helped us

The log4j bottleneck
• During load testing, we could not increase load beyond a certain point
• CPU load was low
• JVMD showed us something that we could hardly believe
• Many threads were contending for the lock needed to write to the log file
• The contention only shows up at high loads
• Used JVMD heavily to find the best logging backend and the best configuration

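The contention described on this slide can be illustrated with a small standard-library sketch. It is not log4j and not the backend or configuration the team ended up with (the slides do not say which one that was); `AsyncLineWriter` and `AsyncLogDemo` are hypothetical names. It shows one common way to take request threads off the log-file lock: they hand lines to a bounded queue, and a single background thread does all the writing.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical illustration only: not log4j, not the setup used in the talk. */
final class AsyncLineWriter implements AutoCloseable {
    // Unique sentinel instance, compared by identity, that tells the worker to stop.
    private static final String STOP = new String("STOP");
    private final BlockingQueue<String> queue;
    private final Thread worker;

    AsyncLineWriter(Writer sink, int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.worker = new Thread(() -> drain(sink), "async-log-writer");
        this.worker.start();
    }

    /** Called by application threads; blocks only when the queue is full. */
    void log(String line) throws InterruptedException {
        queue.put(line);
    }

    /** Runs on the single writer thread, so the sink itself is never contended. */
    private void drain(Writer sink) {
        try {
            while (true) {
                String line = queue.take();
                if (line == STOP) break;
                sink.write(line);
                sink.write('\n');
            }
            sink.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    @Override
    public void close() throws InterruptedException {
        queue.put(STOP);   // everything queued before this is written first
        worker.join();
    }
}

final class AsyncLogDemo {
    /** Starts several producer threads and returns how many lines reached the sink. */
    static int countLines(int threads, int perThread) {
        StringWriter sink = new StringWriter();
        try (AsyncLineWriter log = new AsyncLineWriter(sink, 1024)) {
            Thread[] producers = new Thread[threads];
            for (int t = 0; t < threads; t++) {
                final int id = t;
                producers[t] = new Thread(() -> {
                    try {
                        for (int i = 0; i < perThread; i++) {
                            log.log("thread " + id + " line " + i);
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                producers[t].start();
            }
            for (Thread p : producers) p.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return sink.toString().split("\n").length;
    }

    public static void main(String[] args) {
        System.out.println(countLines(8, 1000));
    }
}
```

The trade-off in this design is that a full queue still blocks callers, and lines still in the queue are lost if the process dies before `close()` runs, so the queue capacity and shutdown handling matter in practice.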
log4j… (continued, 3 slides)

EJB Transaction Optimization
• Noticed an abnormally high number of bean transaction commits
• We had forgotten to optimize some frequently used EJBs
• Read-only methods do not need to be transactional

EJB Transaction Optimization… (continued)

Unexpected Top Method
• Noticed a JMS listener in the top method list
  • In production!
• Did not show up during synthetic load testing
• We had forgotten to add a “message selector” on the listener

Top Method… (continued)

The MDA Catch
• MDA reported an unexpected “The EJB is taking too long to execute”
• The related method was showing in the top methods list
• There were extraneous calls to the EJB
• The method did not need to be in an EJB

The MDA Catch… (continued, 2 slides)

The Slow Library
• A library call for producing JSON showed on the top method list
• JSON is needed for AJAX
• It was totally unexpected
• The library was old and inefficient
• Replaced it with a newer and more efficient library

The Slow Library… (continued, 3 slides)

Inefficient Network Write
• Initially discovered in production through JVMD
• There were instances of high network waits
• Methods of a certain module in the application showed up in the top list during the high network wait periods
• Discovered a three-level loop that writes data
• Further inspection through JProfiler confirmed it

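The slides do not show the offending loop or how it was fixed, so the following is only a sketch of why a nested loop that writes small records can cause high network waits, and of one generic remedy (buffering). `CountingOutputStream` and `NestedWriteDemo` are hypothetical names, and a `ByteArrayOutputStream` stands in for the network socket. The counter records how many write calls actually reach the underlying stream.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical wrapper that counts write calls reaching the underlying stream. */
final class CountingOutputStream extends OutputStream {
    private final OutputStream delegate;
    int writeCalls;

    CountingOutputStream(OutputStream delegate) {
        this.delegate = delegate;
    }

    @Override
    public void write(int b) throws IOException {
        writeCalls++;
        delegate.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        writeCalls++;
        delegate.write(b, off, len);
    }

    @Override
    public void flush() throws IOException {
        delegate.flush();
    }
}

final class NestedWriteDemo {
    /** A three-level loop that emits one small record per innermost iteration. */
    static void writeCells(OutputStream out, int a, int b, int c) throws IOException {
        for (int i = 0; i < a; i++) {
            for (int j = 0; j < b; j++) {
                for (int k = 0; k < c; k++) {
                    String record = i + "," + j + "," + k + "\n";
                    out.write(record.getBytes(StandardCharsets.US_ASCII));
                }
            }
        }
    }

    /** Returns how many write calls hit the underlying stream, with or without buffering. */
    static int underlyingWrites(boolean buffered, int a, int b, int c) {
        CountingOutputStream sink = new CountingOutputStream(new ByteArrayOutputStream());
        try {
            OutputStream out = buffered ? new BufferedOutputStream(sink, 8192) : sink;
            writeCells(out, a, b, c);
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.writeCalls;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered writes: " + underlyingWrites(false, 10, 10, 10));
        System.out.println("buffered writes:   " + underlyingWrites(true, 10, 10, 10));
    }
}
```

On a real socket each unbuffered call can turn into a separate system call or packet, which is the kind of cost that would surface as network wait in a profiler.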
Inefficient Network Write… (continued, 2 slides)

Automatic Thread Snapshots
• Previously, relied on kill -3
  • Manual, missed dumps at crucial moments
• Now, JVMD takes thread snapshots when an abnormal thread state is reached on any WebLogic server
• Combined with auto-restart from WebLogic, eliminated unplanned downtime

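How JVMD decides that a thread state is abnormal is not described in the slides. As a rough illustration of the idea, the sketch below uses the standard `java.lang.management` API to take an in-process thread dump (similar in spirit to `kill -3`) only when the number of blocked threads reaches a threshold. `ThreadSnapshot` and the threshold rule are assumptions made for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper; JVMD's own trigger logic is not described in the slides. */
final class ThreadSnapshot {

    /** Counts threads that are blocked waiting to enter a monitor. */
    static int countBlocked(ThreadInfo[] infos) {
        int blocked = 0;
        for (ThreadInfo info : infos) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                blocked++;
            }
        }
        return blocked;
    }

    /**
     * Returns a textual dump of all threads when at least {@code threshold}
     * of them are BLOCKED, and an empty string otherwise.
     * Note that ThreadInfo.toString() prints only the top few stack frames.
     */
    static String snapshotIfBlockedAtLeast(int threshold) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // false, false: skip locked-monitor and synchronizer details,
        // which not every JVM supports.
        ThreadInfo[] infos = mx.dumpAllThreads(false, false);
        if (countBlocked(infos) < threshold) {
            return "";
        }
        StringBuilder dump = new StringBuilder();
        for (ThreadInfo info : infos) {
            if (info != null) {
                dump.append(info);
            }
        }
        return dump.toString();
    }

    public static void main(String[] args) {
        // A threshold of 0 always produces a dump; a scheduled task could
        // call this with a higher threshold and store any non-empty result.
        System.out.print(snapshotIfBlockedAtLeast(0));
    }
}
```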
Compliance
• Helps us identify patches needed for:
  • WebLogic
  • Oracle Database
  • OEM
• Downtime log
  • Useful for tracking operations improvement

The JDK Upgrade
• Upgraded JDK from 1.6.0_29 to 1.6.0_45
• At that time, we had not yet brought RUEI into our regular operations process
• Started getting slowness complaints after a few days
• A look into RUEI instantly revealed the performance regression
• Downgrading the JDK fixed the regression completely

The JDK Upgrade… (continued)

The JDK Upgrade…
• We could not reproduce the performance regression in the lab environment
• We now have a new procedure to upgrade the JDK
  • Do normal tests and load tests as before
  • In production, upgrade the JDK of one server only
  • Wait a few days
  • Compare performance
  • Decide whether to upgrade or roll back

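The "compare performance" and "decide" steps of this procedure could be reduced to a simple rule. The sketch below is one possible rule, not the team's actual criterion: it compares the median response time on the upgraded server with the median on the remaining servers and flags a rollback when the upgraded server is slower by more than a tolerance. `CanaryCheck`, the sample values in `main`, and the 10% tolerance are all hypothetical; the slides do not say how the RUEI data was compared.

```java
import java.util.Arrays;

/** Hypothetical decision rule for a one-server JDK upgrade trial. */
final class CanaryCheck {

    /** Median of the samples; the input array is left unmodified. */
    static double median(double[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /**
     * True when the upgraded server's median response time exceeds the
     * other servers' median by more than the given fraction (0.10 = 10%).
     */
    static boolean shouldRollBack(double[] othersMs, double[] upgradedMs, double tolerance) {
        return median(upgradedMs) > median(othersMs) * (1.0 + tolerance);
    }

    public static void main(String[] args) {
        // Made-up response times in milliseconds, for illustration only.
        double[] others = {100, 110, 120};
        double[] upgraded = {150, 160, 170};
        System.out.println("roll back: " + shouldRollBack(others, upgraded, 0.10));
    }
}
```

A median is used here because a handful of very slow requests would distort a mean; a percentile such as the 95th could be substituted if tail latency is the concern.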
The Results
• 1 unplanned downtime in the last 4 months!
• Improving DevOps culture
  • Foster collaboration between dev, ops and DBA
  • No more fighting between dev and DBA
  • With RUEI, even the business team joins the fun

Most Importantly
I can do this!

Future with OEM and RUEI

Future
• Involve more people
• Use KPIs in RUEI
• Integrate KPIs with OEM by defining a Business Application
• Leverage alerting and incident management in OEM

Challenges
• Steep learning curve
  • Took us time to understand the role of BTM and JRF
  • Better documentation available now
• Lack of full WebLogic 12c support when we started
  • Should be solved in the latest release of OEM
• Support has been fanatical!
• Thank you Oracle!

Tips
• Understand what you need
  • Any web app with performance requirements needs RUEI
  • If you face performance issues, you will need JVMD
• Engage as many people in your team as possible
  • Give wide access to many
• The learning curve
  • The problems these tools help you with are also complex

Contact
masum@therapservices.net
@Masum6
neelima.bawa@oracle.com
http://neelimabawa.blogspot.com