Leveraging BIG Data for Competitive Advantage, May 15, 2013
Big Data Lessons • The business landscape is being shaped by data as never before. • Sheer magnitude of data being produced is staggering. • Economist Intelligence Unit sought insight on this issue and more. • In a survey sponsored by SAS, 752 senior executives from a broad range of sectors and countries shared their thoughts on the world of data. • In parallel, interviews were conducted with 17 executives, consultants and specialists who are regarded as data pioneers. • Highlights of the research are as follows. • There is a strong link between financial performance and effective use of big data • Companies become successful at exploiting data by focusing on business priorities • Social media analytics and web-tracking technologies can transform the way businesses collect data about customers • Talent matters as much as technology
Data And The Bottom Line • Executives from a diverse range of sectors, including education and public services, say that their organization plans to or is already collecting many types of data. • High-performing companies are more in touch with data than their less-successful rivals. • Being able to do things in real time makes people think differently about the problem.
Putting Strategy First • It is very common to find data in disparate silos inside organizations, with a strong degree of territorial or technical boundaries around the data. • Even seasoned data professionals can find the world of big data overwhelming. • A company might be collecting market research interviews, a stream of information from social networks, supply chain data and sales figures from multiple sites. • Which source is the most important? • How can they be combined to maximum effect?
Talent, Not Just Technology • Companies need to ensure that data-driven thinking is not confined to the IT department. • It is about how you use data and add a layer of interpretation to it in order to get to the answer that you are looking for. • Skilled data scientists are in short supply and high demand. “This sophisticated kind of large data analytic work requires people who are not only capable, but desirous of this kind of work. There is a fairly limited category of professionals who get up in the morning wanting to go to work and do this kind of thing.” • The right person will possess knowledge of the sector in which that person’s company operates, as well as the skills required to work with large data sets. • Companies should also think hard about how to communicate the results they derive from their data work to employees with different needs and different levels of expertise.
When Do You Need A Big Data Solution? • Forrester says big data encompasses "techniques and technologies that make capturing value from data at an extreme scale economical". • Volume of data combined with multiple disparate sources • Speed of processing needed • Big Data solutions are being leveraged to break existing processing bottlenecks • Data mining multiple sources • Analyze relationships across multiple sources of data, including structured data from existing legacy systems combined with unstructured data from crowdsourced information • High speed data analytics
Online Retail • Business Opportunity: Increase online sales • Key Performance Indicators: Decrease in walkaways • Big Data Attribute: Volume • What was done • Through large scale analysis of its clickstream data, Etsy is automatically discovering product attributes (things like materials, prices, or text features) which signal that a search result is particularly relevant (or irrelevant) to a given query. This attribute-level approach makes it possible to appropriately rank products in search results, even if those products are brand new and one-of-a-kind. • Result • Percentage of site visitors that purchased increased
Brick And Mortar Retail • Business Opportunity: Reduce loss to local competition • Key Performance Indicators: Regional price inelasticity • Big Data Attributes: Volume and Speed • What was done • Macy's runs a daily price check analysis of its 10,000 articles across 800 stores nationwide in less than two hours. This means that whenever a neighboring competitor anywhere between New York and Los Angeles goes for aggressive price reductions, Macy's follows its example. If there is no market competitor, the prices remain unchanged. Thus there are around 270 million different prices across the entire range of goods and locations. Price analysis at this speed was unthinkable before Big Data analysis. • Result • Regional demand for goods better met with less effect from regional competition
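The price-following rule described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the slide does not give Macy's actual pricing logic, so the PricePoint record, the follow_competitor function and the sample prices are all hypothetical, and "follows its example" is read here as "match the competitor's lower price".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PricePoint:
    """One article in one store (hypothetical record layout)."""
    article: str
    store: str
    our_price: float
    competitor_price: Optional[float]  # None when there is no local competitor


def follow_competitor(p: PricePoint) -> float:
    """Match a cheaper local competitor; otherwise leave the price unchanged."""
    if p.competitor_price is None:
        return p.our_price
    return min(p.our_price, p.competitor_price)


# Made-up sample data, not taken from the slide.
prices = [
    PricePoint("kettle", "store-001", 40.0, 35.0),   # competitor undercuts us
    PricePoint("kettle", "store-002", 40.0, None),   # no competitor nearby
    PricePoint("toaster", "store-001", 25.0, 27.5),  # competitor is pricier
]

new_prices = {(p.article, p.store): follow_competitor(p) for p in prices}
for key, price in new_prices.items():
    print(key, price)
```

Because each article-and-store pair is priced independently, a rule of this shape can be spread across many machines, which is one plausible reason a nightly run over every pair becomes feasible.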
Banking • Business Opportunity: Reduce runtime for reports • Key Performance Indicators: Time to run reports • Big Data Attribute: Processing Speed • What was done • The company chose a system designed to provide a single view of business information so that decisions can be made based on up-to-date information. • Result • Head of IT at Deutsche Postbank, Clarel Sookun, said: "The system has really proved its worth by reducing the time it takes to produce our reports for CRE from hours to minutes and they are 'accurate to the penny'. The system allows us to centrally store actual and reconciled data and gives us the ability to produce further management reporting at a higher quality level." • The success of the system has led to plans to extend it to the bank's finance operations, analysing loan and interest data, and performing 'what-if' scenarios using graphical data. • Sookun added: "The bank is now able to analyse data in different formats and produce subsets of reports as and when required, which used to be a very labour-intensive task. Implementing the system also unexpectedly highlighted inaccuracies in our existing data and we are now confident in consistent, high quality reporting. The beauty of the new system is that it can also integrate with our existing systems."
Manufacturing • Business Opportunity • Increase sales • Rapid Racking is a UK manufacturer of shelving and racking products. Their pre-data analytics approach was to blast 160-page generic catalogs to millions, creating an unacceptably high acquisition cost. • Key Performance Indicators: Reduced acquisition costs • Big Data Attribute: High Speed Analytics • What was done • By using analytics software and current customer data, Rapid Racking created an algorithm, applied it to third-party data to predict the most promising prospects who then received smaller, customized catalogs. • Result • Rapid Racking decreased acquisition costs by 47% while increasing revenue by 8%.
Gambling Industry • Business Opportunity • Drive profits • Caesars Entertainment decided to forgo the Las Vegas glitz as the primary driver of profits and instead focus on customer behavior. • Key Performance Indicators: Customer experience • Big Data Attribute: Data Mining Multiple Sources • What was done • By implementing a Total Rewards program, Caesars tracks about 80% of customer spending. This information helped create the “luck ambassador” program. If a customer reaches a certain loss level at the casino, a luck ambassador delivers a special reward – such as show tickets or dinner – based on the customer’s known preferences. • Result • Caesars delivers positive customer experiences. Further, its data-driven approach revealed that 0.15% of customers generated 12% of its revenue.
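Two calculations sit behind this slide: a loss threshold that triggers a reward, and a revenue-concentration figure (a small fraction of customers producing a large share of revenue). The sketch below shows one way each could be computed. The function names, the threshold and the sample spend figures are hypothetical and are not drawn from the casino's actual systems.

```python
def needs_luck_ambassador(net_loss: float, threshold: float) -> bool:
    """True once a customer's loss reaches the (hypothetical) trigger level."""
    return net_loss >= threshold


def top_share(spend_by_customer: dict, top_fraction: float) -> float:
    """Share of total revenue that comes from the top `top_fraction` of customers."""
    spends = sorted(spend_by_customer.values(), reverse=True)
    k = max(1, round(len(spends) * top_fraction))  # always include at least one customer
    total = sum(spends)
    return sum(spends[:k]) / total if total else 0.0


# Made-up spend data for five customers.
spend = {"a": 600, "b": 100, "c": 100, "d": 100, "e": 100}

print(top_share(spend, 0.2))               # share of revenue from the top 20%
print(needs_luck_ambassador(1200, 1000))   # loss is past the trigger level
```

Run against real loyalty-program data with top_fraction=0.0015, the same function would produce a figure comparable to the 0.15% and 12% quoted on the slide.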
Auto Rentals • Business Opportunity: With over 8,300 locations worldwide in 146 countries, Hertz keeps its finger on the pulse of its customers with customer satisfaction surveys. The problem? How to collate the information and understand what customers were trying to tell them through these surveys? • Key Performance Indicators: Customer satisfaction • Big Data Attribute: High Speed Analytics • What was done • By applying advanced analytics solutions, the company was able to process the information much more quickly, in half the time it previously took, while at the same time providing a level of insight previously unavailable to the company. • Result • While evaluating the solution, Hertz was able to identify a potential area for improvement in Philadelphia: surveys and measurements indicated that delays were occurring for returns during specific times of the day. By investigating this anomaly, Hertz was able to quickly adjust its staffing levels at the Philadelphia office during those peak times, ensuring a manager was present to resolve any issues. This enhanced Hertz’s performance and increased customer satisfaction, all by parsing the volumes of data being generated from multiple sources.
Mobile Phones • Business Opportunity: Better predict customer defections • Key Performance Indicators: Reduce number of lost customers • Big Data Attribute: Data Mining Multiple Sources • What was done • Integrated Big Data across multiple IT systems to combine customer transaction and interactions data. • Leveraging social media data along with transaction data from CRM and Billing systems. • Result • T-Mobile USA has been able to cut customer defections in half in a single quarter.
Healthcare • Business Opportunity: Reduce the occurrence of high cost Congestive Heart Failure (CHF) readmissions. • Key Performance Indicators: Reduced cost • Big Data Attribute: High Speed Data Analytics • What was done • Seton applied predictive models to proactively identify patients likely to be readmitted on an emergent basis, and presented the analytics in a form that providers can intuitively navigate, interpret and act on. • Result • For Seton, a reduction in costs and risks associated with complying with Federal readmission targets. For Seton’s patients, fewer visits to the hospital and overall improved patient care. Seton is able to identify patients likely to be readmitted and introduce early interventions to reduce cost and mortality rates and improve patient quality of life.
Healthcare • Business Opportunity: The goal is to somehow create one complete electronic record for patients that includes all their data (images, pharmacy records, clinical notes, self-reported patient information), regardless of underlying system and to combine that information with relevant financial, genomic, and research data to provide a holistic view within the EMR, said Lisa Khorey, the VP of enterprise services and data management at UPMC. • Key Performance Indicators: Reduced cost • Big Data Attribute: Data Mining Multiple Sources • What was done • UPMC has aggregated data from those 31 provider-based systems. • Result: There have already been payoffs, Shrestha said. Because doctors can see not only what has been prescribed across all EMR systems but also the claims information, they know which prescriptions have been filled. “This is all about filtering the noise to get data that is actionable. In this case we won’t treat the patient with a pseudo condition of acute abdominal pain but get to the root of his problem, which is probably opiate addiction,” Shrestha said.
Freight • Business Opportunity • Lower costs by reducing fuel consumption. • With fuel being one of the biggest overheads for freight train companies (at Canadian Pacific, one user of GE’s system, it makes up nearly one-quarter of operating costs), a 10% reduction in fuel use represents a huge cost saving. • Key Performance Indicators: Reduced operating cost. • Big Data Attribute: Data Mining Multiple Sources • What was done • “Trip Optimizer” is a fuel-saving system that GE has developed for freight trains. It takes into account a wealth of data, including track conditions, weather, the speed of the train, GPS data and “train physics”, and makes decisions about how and when the train should brake. • Result • Trip Optimizer reduced fuel use by 4-14%.
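The cost claim on this slide is simple arithmetic: if fuel is a given share of operating costs, a percentage cut in fuel use lowers total operating cost by the product of the two. The sketch below works that through. The 0.25 fuel share rounds the slide's "nearly one-quarter" and is therefore an approximation, and the function name is hypothetical.

```python
def operating_cost_saving(fuel_share: float, fuel_reduction: float) -> float:
    """Fraction of total operating cost saved by a given cut in fuel use."""
    return fuel_share * fuel_reduction


FUEL_SHARE = 0.25  # assumption: "nearly one-quarter" of operating costs, rounded

# The slide's 10% example plus the 4-14% range reported for Trip Optimizer.
for reduction in (0.04, 0.10, 0.14):
    saving = operating_cost_saving(FUEL_SHARE, reduction)
    print(f"{reduction:.0%} less fuel -> {saving:.1%} lower operating cost")
```

Under that assumption, the reported 4-14% fuel reduction corresponds to roughly 1% to 3.5% of total operating cost.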
Utilities • Business Opportunity: Reduce household power consumption. • Key Performance Indicators: Household consumption decrease. • Big Data Attribute: Volume • What was done • Opower currently manages about 30 TB of information (and growing), which includes energy data from 50 million utility customers (across 60 utilities) as well as public and private data about weather and demographics, historical utility data, geographical data and much more. The data is stored and processed in a combination of over 20 MySQL databases and a production Hadoop cluster. • Most of Opower’s data is structured, with the exception of its systems-logs processing infrastructure. The data is processed in batch processes that access both MySQL and Hadoop, and the current production Hadoop cluster is 12 nodes; that is 80 TB of usable space, 72 cores, 0.5 TB of memory and 120 spindles. The Opower analytics team also uses Pentaho analytics and R in its regular business intelligence work. • Result • The result of all of these new tools is that Opower can help utility customers shave about 2 percent off their home energy consumption by showing customers how well (or poorly) they are doing compared to their peers and neighbors (tapping into shame or guilt) or suggesting other tips like adding energy-saving lightbulbs. • Saved 700 million kilowatt-hours to date, which is the equivalent of 1 billion pounds of greenhouse gas emissions and the annual output of 90,000 cars.
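The peer comparison described on this slide, showing a household how it is doing relative to its neighbors, can be illustrated with a small calculation. This is a sketch only: the slide does not describe Opower's actual method, so the use of a median baseline, the function name and the sample readings are all assumptions.

```python
from statistics import median


def peer_comparison(usage_kwh: dict, household: str) -> float:
    """Relative difference between one household and the median of its peers.

    Positive means the household uses more than its peers, negative means less.
    """
    peers = [kwh for name, kwh in usage_kwh.items() if name != household]
    peer_median = median(peers)
    return (usage_kwh[household] - peer_median) / peer_median


# Made-up monthly consumption in kWh for four neighboring households.
usage = {"h1": 900, "h2": 600, "h3": 500, "h4": 700}

diff = peer_comparison(usage, "h1")
print(f"h1 uses {diff:+.0%} relative to the median of its neighbors")
```

A median is used here rather than a mean so that one unusually large consumer does not distort the baseline shown to everyone else.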
IT • Business Opportunity: Enhancing security of data centers • Key Performance Indicators: Reduction in attacks • Big Data Attribute: Volume • What was done • RSA recently made available its RSA Security Analytics platform that's intended to be used to detect attacks, especially stealthy ones, by analyzing large amounts of content data that would be stored in a Hadoop database for threat-detection analysis in conjunction with RSA's security-event and information management product, enVision. • Result • RSA's Big Data Security push, says Steve Schlarman, eGRC solutions manager at RSA, is "adding business context" intelligence related to enterprise content that goes beyond traditional security alerts to help companies defend themselves against stealthy attacks in particular.
How Do You Get Started? • Develop List of Opportunities to Leverage Big Data • Identify a Data Analyst • Identify Data Sources • Assess Data Quality Gaps • Develop Data Aggregation Approach • Traditional Approach Leveraging a Relational Database • Hadoop Approach • High Performance Computing Cluster (HPCC) • Do you have the tools? • Create Data Improvement Plan • Simulate Data Aggregation Process on a few cases • Assess Results • Is quality sufficient to proceed? • Execute in Phases
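The "Assess Data Quality Gaps" and "Is quality sufficient to proceed?" steps can be made concrete with a simple completeness check on a few sample records. The sketch below is one possible starting point, not a prescribed method: the field names, the sample records and the 10% tolerance are hypothetical.

```python
def missing_rates(records: list, required_fields: list) -> dict:
    """Fraction of records in which each required field is absent or empty."""
    total = len(records)
    rates = {}
    for field in required_fields:
        missing = sum(1 for r in records if r.get(field) in (None, ""))
        rates[field] = missing / total if total else 0.0
    return rates


def sufficient_quality(rates: dict, max_missing: float = 0.10) -> bool:
    """Proceed only if no field exceeds the (assumed) tolerance for gaps."""
    return all(rate <= max_missing for rate in rates.values())


# Made-up sample of customer records pulled from two source systems.
records = [
    {"customer_id": 1, "email": "a@example.com", "region": "East"},
    {"customer_id": 2, "email": "", "region": "West"},
    {"customer_id": 3, "email": "c@example.com", "region": "East"},
    {"customer_id": 4, "email": "d@example.com", "region": None},
]

rates = missing_rates(records, ["customer_id", "email", "region"])
print(rates)
print("proceed" if sufficient_quality(rates) else "create data improvement plan")
```

Completeness is only one dimension of quality. Duplicate records, inconsistent formats and stale values would need their own checks before the aggregation approach is chosen.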
HPCC Comparison To Hadoop • While Hadoop has two scripting languages that allow for some abstraction (Pig and Hive), they don’t compare with the formal aspects, sophistication and maturity of the ECL language, which provides a number of benefits such as data and code encapsulation, the absence of side effects, flexibility and extensibility through macros, functional macros and functions, and libraries of production-ready high-level algorithms. • A restriction of the MapReduce model used by Hadoop is that internode communication is confined to the shuffle phase between map and reduce, which makes certain iterative algorithms that require frequent internode data exchange hard to code and slow to execute. In contrast, the HPCC Systems platform provides for direct inter-node communication at all times, which is leveraged by many of the high-level ECL primitives. • Another disadvantage for Hadoop is the use of Java as the programming language for the entire platform, including the HDFS distributed file system, which adds overhead from the JVM; in contrast, HPCC and ECL are compiled into C++, which executes natively on top of the operating system, leading to more predictable latencies and overall faster execution. • But above all, the HPCC Systems platform presents users with a homogeneous platform which is production ready and has been proven for many years in our own data services, from a company which has been in the Big Data analytics business since before Big Data was called Big Data.
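To see why the MapReduce model is awkward for iterative algorithms, it helps to look at its three phases. The sketch below imitates them in a single Python process with a word count. It is a teaching illustration only: it uses neither Hadoop nor HPCC, and the function names are hypothetical. In a real cluster the shuffle step is the point at which data moves between nodes.

```python
from collections import defaultdict


def map_phase(documents):
    """Map: each document is processed independently, emitting (key, value) pairs."""
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group values by key; on a cluster, this is where data crosses nodes."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: each key's values are combined independently of other keys."""
    return {key: sum(values) for key, values in groups.items()}


docs = ["big data", "Big Data lessons", "data"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts)
```

An algorithm that must exchange intermediate results many times has to run this whole map, shuffle and reduce cycle once per iteration, which is the cost the slide refers to.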
Three Primary Components Of HPCC Systems • THOR – performs data ETL • “Identify and catalog all the fish in the ocean.” • Massively parallel Extract, Transform and Load (ETL) engine • Enables data integration on a scale not previously available • Suitable for: • Massive joins/merges • Massive sorts & transformations • Any N² problem • ROXIE – performs data delivery • “Get me that fish, right there, right now!” • Massively parallel, low latency, high availability structured query response engine • Ultra fast due to its read-only nature • Suitable for: • Volumes of structured queries • Full text ranked Boolean search • Enterprise Control Language (ECL) – a common language to accomplish both • “ECL is to Big Data as SQL is to Relational Data” • An easy-to-use, data-centric programming language optimized for large-scale data management and query processing • Highly efficient; automatically distributes workload across all nodes • Industry analysts estimate 80% more efficient than C++, Java and SQL, and a 1/3 reduction in programmer time to maintain/enhance existing applications • Benchmark against SQL (5 times more efficient) for code generation
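Taking the N-squared item in the THOR list to mean work that grows quadratically with the number of records, a typical example is comparing every record against every other one, as in duplicate detection. The sketch below shows how fast the number of comparisons grows. It is plain Python, not ECL or THOR, and the matching rule and names are hypothetical.

```python
from itertools import combinations


def pair_count(n: int) -> int:
    """Number of distinct pairs among n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def pairwise_matches(records, is_match):
    """Compare every record with every other record (quadratic work)."""
    return [(a, b) for a, b in combinations(records, 2) if is_match(a, b)]


def same_last_name(a: str, b: str) -> bool:
    """Toy matching rule: two names match if their last word is identical."""
    return a.split()[-1].lower() == b.split()[-1].lower()


names = ["Jon Smith", "John Smith", "Ann Lee"]
matches = pairwise_matches(names, same_last_name)
print(matches)

for n in (1_000, 1_000_000):
    print(f"{n:,} records -> {pair_count(n):,} comparisons")
```

A thousandfold increase in records gives roughly a millionfold increase in comparisons, which is why this class of problem is usually handed to a parallel engine.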
HPCC – A New Solution? • The HPCC is not new. It has been in use for over a decade in critical production environments; the wall has been touched a good number of times and each time it has been pushed back. For many groups starting with the HPCC today it is unlikely the wall will be hit. Gain a good understanding of the types of things HPCC has been used for in our Introduction to HPCC whitepaper. • The HPCC platform is designed to be extended; we have the support options and custom coding services available to move the wall if needed.
Takeaways • The time when an executive could profess ignorance about the power of data has long passed. • Evidence of the advance of data is everywhere. A new breed of US graduates is leaving university with the aim of becoming data scientists, a job title that few had heard of just a decade ago. • The full impact of big data is still to come. Many data sets remain in organizational silos, cut off from the rest of the company. • Organizations that lock up their data may fail to realize the benefits that may flow from sharing them internally, or from selling them to other companies. • We can expect the flow of information, already a powerful force, to become even more transformative. • Will your company benefit? • There is no reason why it should not!
Who We Are • Comrise was established in 1984 and is a global consulting firm with headquarters in the U.S. and China. Our teams specialize in Managed IT, Big Data, and Workforce Solutions – Staff Augmentation, Recruiting, RPO, and Payrolling. With nearly 30 years of experience, Comrise provides local talent and resources on a global scale. • Paul Banks – Business Development – jpaulbanks@comrise.com – (610) 217-7763 mobile • Matt Matulewicz – Solution Architect – matt.matulewicz@comrise.com – (610) 703-4352 mobile