London Appdynamics User Group (LAUG) February 2013 Meetup Presented by IG. How AppDynamics is influencing our IT culture. Hamed Silatani Augusto Rodriguez. Contents. Why is performance important to us? How we measured performance historically How AppDynamics is influencing positively:
London AppDynamics User Group (LAUG) February 2013 Meetup, presented by IG
How AppDynamics is influencing our IT culture • Hamed Silatani • Augusto Rodriguez
Contents • Why is performance important to us? • How we measured performance historically • How AppDynamics is influencing positively: • Development • Architecture • QA • Operations and Support • Successes and challenges we found
IG’s context • We offer Spread Betting and CFD trading • Trading and price updates are time critical: every millisecond counts • Using AppDynamics in production for 4 months
Development process • Proactive rather than reactive approach to performance: • Identifying latency issues in production is too late • Mostly on the trading and charting platforms • We try to improve performance with each release • (Chart: response time)
Development Process • Thinking about monitoring from the start • Influences how we design • It was expensive and narrowly focused: • Ad-hoc classes to wrap specific transactions and log metrics • Creating custom metric collectors and graphing them • Not possible to have it everywhere
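The "ad-hoc classes to wrap specific transactions and log metrics" approach can be pictured roughly as below. This is only an illustrative sketch in Python, not IG's actual code (which the slides do not show); the names `timed_transaction`, `summary` and the in-memory `_timings` store are hypothetical.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory store: transaction name -> list of durations in ms.
_timings = {}


@contextmanager
def timed_transaction(name, clock=time.perf_counter):
    """Record how long the wrapped block takes, even if it raises."""
    start = clock()
    try:
        yield
    finally:
        elapsed_ms = (clock() - start) * 1000.0
        _timings.setdefault(name, []).append(elapsed_ms)


def summary(name):
    """Return (count, mean_ms, max_ms) for one transaction name."""
    samples = _timings.get(name, [])
    if not samples:
        return (0, 0.0, 0.0)
    return (len(samples), sum(samples) / len(samples), max(samples))


if __name__ == "__main__":
    with timed_transaction("place_order"):
        time.sleep(0.01)  # stand-in for the real transaction
    print(summary("place_order"))
```

Every transaction of interest has to be wrapped by hand like this, which is why such instrumentation tends to stay expensive and limited to a few code paths.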
Development Process • Leave metric collection to its experts and focus on our business domain and differentiators.
QA • Link Business Transactions to Services • Simpler option to sign off architectural changes • Provide snapshots on bug reports when a service returns an error.
QA • Helps with regression: the Business Transaction health view easily tells us: • Which transactions are fundamentally broken • It can’t tell us whether all transactions are OK
Operations & Support • Lowers the bar to find performance issues • Easier to collaborate with other teams to solve the problem: • Focus on the solution • With DBAs -> DB call times • Can pinpoint performance bottlenecks: • Bottlenecks caused by downstream components • Inefficient code in unfamiliar parts of the platform • Thread hogging when calling a SaaS provider • Troubleshooting integration with 3rd party software • Messaging broker
Operations & Support • BTs enable us to correlate exceptions across nodes
Operations & Support • Ability to correlate events (cluster, nodes, etc.) with metrics [GC, CPU]
Operations & Support • Reduce the number of false-positive alerts (correlation of metrics for alerting).
Operations & Support • Inventory of JVMs and config (run reports on JVMs)
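One way to picture "correlation of metrics for alerting" is a rule that fires only when two metrics breach their limits together for several consecutive samples. The sketch below is a generic illustration in Python, not AppDynamics' rule engine or configuration; the `Sample` type, the thresholds and the `min_breaches` window are all assumed values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    response_ms: float  # average response time in the sampling interval
    cpu_pct: float      # CPU utilisation in the same interval


def should_alert(samples, response_limit_ms=500.0, cpu_limit_pct=85.0,
                 min_breaches=3):
    """Alert only if BOTH metrics breach in each of the last min_breaches samples."""
    recent = samples[-min_breaches:]
    if len(recent) < min_breaches:
        return False  # not enough history to judge
    return all(
        s.response_ms > response_limit_ms and s.cpu_pct > cpu_limit_pct
        for s in recent
    )


if __name__ == "__main__":
    latency_spike_only = [Sample(900.0, 40.0)] * 3
    sustained_problem = [Sample(900.0, 95.0)] * 3
    print(should_alert(latency_spike_only))  # single-metric breach
    print(should_alert(sustained_problem))   # correlated breach
```

A latency spike on its own does not trigger the rule, which is the kind of false positive a single-metric threshold would raise.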
Operational successes • Datacenter failover • Already used to improve our throughput
Operational challenges • No way to promote config changes through environments • Changes to BT or metric names invalidate dashboards • Need to invest in training to get the most out of it • To get the best out of the tool, work with AppD engineers
Going forward • Doing end user monitoring (web and mobile) • Collect real-time business metrics and KPIs • Use events to mark application version changes • Diff flowmaps • Compare platform performance across all nodes between load tests
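The last item, comparing platform performance across nodes between load tests, could look something like the following sketch. It assumes the per-node mean response times have already been exported into plain dictionaries; the function name `compare_load_tests`, the node names and the 10% tolerance are hypothetical.

```python
def compare_load_tests(baseline, candidate, tolerance_pct=10.0):
    """Return {node: percent change} for nodes slower than baseline by more than tolerance_pct.

    baseline and candidate map node name -> mean response time in ms.
    Nodes missing from candidate, or with a non-positive baseline, are skipped.
    """
    regressions = {}
    for node, base_ms in baseline.items():
        if node not in candidate or base_ms <= 0:
            continue
        change_pct = (candidate[node] - base_ms) / base_ms * 100.0
        if change_pct > tolerance_pct:
            regressions[node] = round(change_pct, 1)
    return regressions


if __name__ == "__main__":
    before = {"node-1": 100.0, "node-2": 200.0}
    after = {"node-1": 105.0, "node-2": 260.0}
    print(compare_load_tests(before, after))
```

Reporting only the nodes beyond the tolerance keeps the comparison readable when the platform has many JVMs.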