
Go beyond debug: Wire Tap your App for knowledge

Go beyond debug: Wire Tap your App for knowledge with Hadoop. Tom McCuch, Solution Engineering @ Hortonworks (Twitter: tmccuch). Oleg Zhurakousky, Principal Architect @ Hortonworks (Twitter: z_oleg). The Application Development Dilemma.

Presentation Transcript


  1. Go beyond debug: Wire Tap your App for knowledge with Hadoop
     Tom McCuch, Solution Engineering @ Hortonworks, Twitter: tmccuch
     Oleg Zhurakousky, Principal Architect @ Hortonworks, Twitter: z_oleg

  2. The Application Development Dilemma
     • Today, application developers devote roughly 80% of their code to persisting roughly 20% of the total data flowing through their applications
     • The other 80% of the data flowing through our applications is at best lost in rolling log files and at worst never collected, without ever being analyzed or accounted for
     • For the 20% we do currently collect, application-level database programming, licensing, storage, administration, and ETL processing have maxed out IT operations budgets and have constrained app development teams from keeping pace with the rate of change in the business

  3. Example: Data Available During Ingest
     • Record count
     • Highest/Lowest record length
     • Average record length
     • Compression ratio
     But with a little more work...
     • Field parsing
     • Unique values
     • Unique values per field
     • Access to values of each field independently from the record
     • Relatively fast field-based searches, without indexing
     • Value encoding
     • Etc…
     These are cross-cutting concerns!
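
The first four metrics on this slide can be gathered in a single pass while records stream through. What follows is a minimal Python sketch, not the presenters' implementation: the class name `IngestStats` is hypothetical, records are assumed to arrive as `bytes`, and `zlib` stands in for whatever codec a real ingest would use (the slides do not name one).

```python
import zlib


class IngestStats:
    """Running statistics gathered in one pass over the records (hypothetical helper)."""

    def __init__(self):
        self.count = 0
        self.min_len = None
        self.max_len = None
        self.raw_bytes = 0
        self.compressed_bytes = 0
        # zlib is an assumption; the slides do not name a codec.
        self._compressor = zlib.compressobj()

    def observe(self, record: bytes) -> bytes:
        """Update the statistics and hand the record back unchanged."""
        n = len(record)
        self.count += 1
        self.min_len = n if self.min_len is None else min(self.min_len, n)
        self.max_len = n if self.max_len is None else max(self.max_len, n)
        self.raw_bytes += n
        self.compressed_bytes += len(self._compressor.compress(record))
        return record

    def finish(self) -> None:
        """Flush the compressor once, at end of input, so the compressed size is final."""
        self.compressed_bytes += len(self._compressor.flush())

    @property
    def average_len(self) -> float:
        return self.raw_bytes / self.count if self.count else 0.0

    @property
    def compression_ratio(self) -> float:
        """Raw size divided by compressed size; meaningful only after finish()."""
        if not self.compressed_bytes:
            return 0.0
        return self.raw_bytes / self.compressed_bytes
```

Because `observe` returns the record untouched, it can be dropped into an existing pipeline step without changing what downstream code receives.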

  4. How do we address cross-cutting concerns without disturbing the existing process flow?

  5. Wire Tap Defined

  6. Wire Tap is an Enterprise Integration Pattern
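
In the Enterprise Integration Patterns catalog, a Wire Tap publishes each message travelling over a channel to a secondary destination as well as the primary one, so the message can be inspected without changing the main flow. A minimal in-process sketch in Python, assuming synchronous handlers; `TappedChannel` and its methods are hypothetical names, not an API from the talk or from any integration framework:

```python
from typing import Any, Callable, List

Handler = Callable[[Any], Any]


class TappedChannel:
    """A point-to-point channel with optional wire taps (hypothetical class)."""

    def __init__(self, primary: Handler):
        self._primary = primary
        self._taps: List[Handler] = []

    def add_tap(self, tap: Handler) -> None:
        """Register an observer; the primary consumer is not aware of it."""
        self._taps.append(tap)

    def send(self, message: Any) -> Any:
        for tap in self._taps:
            try:
                tap(message)  # the tap only observes; its result is ignored
            except Exception:
                # A failing tap must not disturb the primary flow.
                pass
        return self._primary(message)


# Usage: the primary handler is unchanged whether or not a tap is attached.
seen: List[Any] = []
channel = TappedChannel(primary=lambda m: m.upper())
channel.add_tap(seen.append)
result = channel.send("credit request")
```

This sketch calls taps inline and passes them the same object the primary handler receives. A production tap would more likely hand a copy to a separate channel asynchronously, so slow or mutating observers cannot affect the primary consumer.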

  7. Other Enterprise Integration Patterns
     • Transformer: Convert payload or modify headers
     • Filter: Discard messages based on boolean evaluation
     • Router: Determine next channel based on content
     • Splitter: Generate multiple messages from one
     • Aggregator: Assemble a single message from multiple
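
Each of the five patterns listed on this slide reduces to a small function over messages. The Python sketch below is illustrative only: the message shape (a dict with `headers` and `payload`), the comma-separated payload, and the `priority` header are all assumptions made for the example.

```python
from typing import Dict, List, Optional


def transformer(message: dict) -> dict:
    """Transformer: convert the payload (here, upper-case it) and keep the headers."""
    return {"headers": dict(message["headers"]), "payload": message["payload"].upper()}


def message_filter(message: dict) -> Optional[dict]:
    """Filter: discard the message (return None) when a boolean test fails."""
    return message if message["payload"] else None


def router(message: dict, channels: Dict[str, List[dict]]) -> None:
    """Router: choose the next channel from the message content."""
    name = "priority" if message["headers"].get("priority") else "default"
    channels[name].append(message)


def splitter(message: dict) -> List[dict]:
    """Splitter: generate one message per comma-separated item."""
    return [
        {"headers": dict(message["headers"]), "payload": part}
        for part in message["payload"].split(",")
    ]


def aggregator(messages: List[dict]) -> dict:
    """Aggregator: assemble a single message from several (headers from the first)."""
    return {
        "headers": dict(messages[0]["headers"]),
        "payload": ",".join(m["payload"] for m in messages),
    }
```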

  8. The Business Case

  9. 6 Key Hadoop Data Types
     • Sentiment: Understand how your customers feel about your brand and products, right now
     • Clickstream: Capture and analyze website visitors’ data trails and optimize your website
     • Sensor/Machine: Discover patterns in data streaming automatically from remote sensors and machines
     • Geographic: Analyze location-based data to manage operations where they occur
     • Server Logs: Research logs to diagnose process failures and prevent security breaches
     • Text: Understand patterns in text across millions of web pages, emails, and documents

  10. 20 Apache Hadoop Enterprise Use Cases

  11. Financial Services: Fraud Prevention
     Data: Server Logs
     Business Problem
     • Financial institutions are always at risk of fraud
     • Fraudsters test bank systems for vulnerabilities
     • This testing leaves subtle patterns that often go undetected by bank employees or law enforcement
     • Fraud losses cost banks millions
     Solution
     • HDP reduces the cost to detect fraudulent activity
     • HDP stores more types of data for longer
     • Analysis of data in the “data lake” exposes fraudulent patterns that would have gone undetected

  12. Credit Request Process Flow - Before
     Credit Request Processing
     • Credit Request arrives on a Gateway
     • Credit Request is sent over a Channel
     • Credit Request Processor
       • Receives Request
       • Processes the Request
       • Issues a Response
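
The "before" flow on this slide (gateway, channel, processor) can be modelled with a queue between two components. This is a rough Python sketch under stated assumptions: `Gateway` and `credit_request_processor` are hypothetical names, the request fields (`id`, `amount`) are invented for illustration, and the approval rule is a placeholder, not the logic used in the demo.

```python
import queue


def credit_request_processor(requests: queue.Queue) -> None:
    """Receive each request, process it, and issue a response."""
    while True:
        item = requests.get()
        if item is None:  # sentinel: shut the processor down
            break
        request, reply = item
        # Placeholder decision rule, purely illustrative.
        approved = request["amount"] <= 10_000
        reply.put({"id": request["id"], "approved": approved})


class Gateway:
    """Entry point that hides the channel from the caller (hypothetical class)."""

    def __init__(self, channel: queue.Queue):
        self._channel = channel

    def submit(self, request: dict, timeout: float = 5.0) -> dict:
        reply: queue.Queue = queue.Queue(maxsize=1)
        self._channel.put((request, reply))  # send the request over the channel
        return reply.get(timeout=timeout)    # block until the response arrives
```

To run it, start `credit_request_processor` in a `threading.Thread` with a shared `queue.Queue()`, call `Gateway(channel).submit(...)`, and put `None` on the channel to stop the worker. Because the caller only sees the gateway, anything attached to the channel later stays invisible to it.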

  13. Cross-Cutting Concerns
     • Credit Scoring
     • Fraud Detection
     • Gathering Data Available during Credit Request Process Flow

  14. Demo

  15. Credit Request Processing Flow - After HDP

  16. Example: HTTP Header Collection
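
The transcript does not preserve how the demo collected HTTP headers. As one possible illustration, a Python web application can tap its request headers with a WSGI middleware, since WSGI exposes request headers as `HTTP_`-prefixed keys in `environ`. `HeaderTap` and the `sink` callable are hypothetical; in a Hadoop setting the sink would presumably forward records to the cluster, which is not shown here.

```python
from typing import Callable, Dict


class HeaderTap:
    """WSGI middleware that copies request headers to a sink (hypothetical class)."""

    def __init__(self, app: Callable, sink: Callable[[Dict[str, str]], None]):
        self._app = app
        self._sink = sink

    def __call__(self, environ, start_response):
        # WSGI stores request headers as HTTP_* keys, e.g. HTTP_USER_AGENT.
        headers = {
            key[5:].replace("_", "-").title(): value
            for key, value in environ.items()
            if key.startswith("HTTP_")
        }
        try:
            self._sink(headers)
        except Exception:
            pass  # collection problems must never fail the request
        # The wrapped application sees exactly the same request as before.
        return self._app(environ, start_response)
```

Wrapping an existing application is a one-line change (`app = HeaderTap(app, sink)`), which is the property the wire tap is meant to provide: the request handling code itself is not modified.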

  17. Example: Data Available During Ingest
     • Record count
     • Highest/Lowest record length
     • Average record length
     • Compression ratio
     But with a little more work...
     • Field parsing - unstructured data is not all that unstructured…
     • Unique values
     • Unique values per field
     • Access to values of each field independently from the record
     • Relatively fast field-based searches, without indexing
     • Value encoding
     • Etc…
     These are cross-cutting concerns!
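
The "little more work" items (field parsing, unique values per field, independent access to each field, index-free field searches, value encoding) fit together in a column-oriented structure built during ingest. The Python sketch below is one way to illustrate the idea, assuming simple delimiter-separated text records with no quoting; `FieldStore` and its methods are hypothetical names.

```python
from typing import Dict, List


class FieldStore:
    """Column-oriented view built while records are ingested (hypothetical class)."""

    def __init__(self, field_names: List[str], delimiter: str = ","):
        self.field_names = field_names
        self.delimiter = delimiter
        # One list of integer codes per field: access to a field without the record.
        self.columns: Dict[str, List[int]] = {name: [] for name in field_names}
        # Per field: value -> small integer code (value encoding).
        self.codes: Dict[str, Dict[str, int]] = {name: {} for name in field_names}

    def ingest(self, record: str) -> None:
        """Parse one record into fields and append each value's code to its column."""
        values = record.split(self.delimiter)
        if len(values) != len(self.field_names):
            raise ValueError(f"expected {len(self.field_names)} fields: {record!r}")
        for name, value in zip(self.field_names, values):
            code = self.codes[name].setdefault(value, len(self.codes[name]))
            self.columns[name].append(code)

    def unique_values(self, name: str) -> List[str]:
        """Unique values of one field, in order of first appearance."""
        return list(self.codes[name])

    def search(self, name: str, value: str) -> List[int]:
        """Positions of records whose field equals value, by scanning one column."""
        code = self.codes[name].get(value)
        if code is None:
            return []  # value never seen, so no scan is needed
        return [i for i, c in enumerate(self.columns[name]) if c == code]
```

The search compares small integers in a single column rather than re-parsing whole records, which is why it can be reasonably fast without a separate index.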

  18. Demo

  19. Thank You! Questions & Answers
     Follow: @tmccuch, @z_oleg, @hortonworks
