Data Warehousing Case Study Akamai Technologies, Inc.
Background • In 1997, Tom Leighton (MIT Professor of Applied Mathematics) and Danny Lewin (MIT graduate student), along with others, developed mathematical algorithms to handle the dynamic routing of web content. • In 1998, the group entered the annual MIT $50K Entrepreneurship Competition, where the company's business proposition was selected as one of 6 finalists among 100 entries. • In April of 1999, Akamai launched its commercial service, FreeFlow, for Yahoo!, Akamai’s first and charter customer.
Akamai Today • Today, Akamai has over 1,000 customers in countries all over the world. • Akamai's intelligent edge platform for content, streaming media, and application delivery comprises more than 11,600 servers in over 820 networks across 62 countries.
Reporting @ Akamai • Company Growth of over 50% per Quarter from 1999 to 2001. • Assets (Servers, Switches, etc.) in hundreds of Networks around the World. • Increased Product Lines from 1 Product (FreeFlow) to more than a dozen Products (FreeFlow Streaming, Edgesuite, FirstPoint, etc.). • Internal Growth from one hundred employees to thousands in one year. • Internet Growth (dot.com) explosive through 2000.
Reporting @ Akamai, cont. • Internet Bubble bursts in March 2001, causing a backlash against the companies that serviced dot.coms. • Customer churn (cancellation) increases rapidly. • Revenue collected from bankrupt customers declines. • Accurate and comprehensive data on which to base management decisions becomes CRITICAL. • Management Reporting Initiative (MRI) is born.
Where do you start?? • Prioritization Process • Identify pain • Determine readiness • Data maturity • Size • Complexity • In the end, who do you choose?
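One way to make the prioritization process above concrete is a simple scoring sheet. The sketch below is only an illustration: the 1-to-5 scales, the equal weighting, and the candidate names ("Area A" and so on) are hypothetical and do not come from the original project. It treats high pain and high data maturity as reasons to go first, and large size and high complexity as reasons to wait.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A business area competing to be the first data mart (hypothetical)."""
    name: str
    pain: int           # 1 (low) to 5 (high): how badly reporting hurts today
    data_maturity: int  # 1 to 5: higher means cleaner, better-understood data
    size: int           # 1 to 5: higher means more data and more users
    complexity: int     # 1 to 5: higher means more sources and business rules


def readiness(c: Candidate) -> float:
    # Mature data helps; size and complexity hurt, so invert them (6 - x).
    return (c.data_maturity + (6 - c.size) + (6 - c.complexity)) / 3


def priority(c: Candidate, pain_weight: float = 0.5) -> float:
    # Assumed weighting: pain and readiness count equally by default.
    return pain_weight * c.pain + (1 - pain_weight) * readiness(c)


def rank(candidates):
    # Highest priority first.
    return sorted(candidates, key=priority, reverse=True)


candidates = [
    Candidate("Area A", pain=3, data_maturity=3, size=5, complexity=5),
    Candidate("Area B", pain=5, data_maturity=4, size=2, complexity=3),
    Candidate("Area C", pain=4, data_maturity=2, size=4, complexity=4),
]

for c in rank(candidates):
    print(f"{c.name}: priority {priority(c):.2f}, readiness {readiness(c):.2f}")
```

The numbers matter less than the conversation they force: every stakeholder has to state, in the same units, how painful and how ready their area really is.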
Requirements Gathering • Requirements Gathering Team, composed of the Technical Leader (myself) and a Business Systems Analyst, began a 2-month process of gathering requirements. • Identified key verticals within company • Identified single points of contact (SPOC) within each vertical • Identified subject matter experts (SME) within organization • Identified key stakeholders within organization • Conducted interviews, JAD sessions and working sessions with individuals and groups as appropriate. • Compiled 100+ pages of Requirements from the Business Community.
Scope and Project Charter • Defined Scope based on Requirements (Scope Creep!!!) • Developed Project Charter defining: Project Scope; Project Organization; Critical Success Factors; Assumptions and Constraints; Risks; Issues • Sign-off from Executive Management and Project Sponsors
Technical Architecture • Vendor selection • ETL: Informatica PowerMart 4.7 • Front-end: Brio.Insight 6.3 • Middleware: Brio OnDemand Server 6.3 • Database: Oracle 8.1.7 • Database Design: ERwin 3.52 • Software/Hardware Procurement and Implementation • 3 SPARC boxes running Solaris 2.7 • 750 GB Storage Area Network (SAN)
Project Plan • Battle between the Technical Team and the Executive Sponsors • Executive Sponsors couldn’t understand why it would take so long to launch this new Enterprise Data Warehouse • Technical Team was not proficient in the new technology, nor was it staffed to accommodate the requested timeline (2 months from requirements to rollout) • Result = $$$ to hire Contractors • Contractors require detailed ETL documentation • Law of Diminishing Returns • Knowledge transfer from Contractors to DW Team Members
Project Plan, cont. • 2-1/2 months to complete from April 30th (begin requirements gathering) to July 16th (rollout) • Project Definition – 1 week • Requirements – 1 month • Technical Analysis – 0 days • Technical Design and Infrastructure Implementation – 2 months • Data Model – 2 weeks • Source to Target Mapping Document – 2 weeks • ETL Coding – 4 weeks • System and Unit Testing – 2 weeks • UAT – 3 weeks • Rollout – 1 week
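The arithmetic behind this plan can be checked with a short sketch. It rests on several assumptions that are not stated on the slide: the dates fall in 2001 (the year of the actual rollout), "1 month" is approximated as 4 weeks, and the 2-month "Technical Design and Infrastructure Implementation" item is left out of the sum because it appears to run alongside the other work.

```python
from datetime import date

# Planned window, assuming both dates are in 2001.
start = date(2001, 4, 30)            # begin requirements gathering
planned_rollout = date(2001, 7, 16)  # planned rollout

window_days = (planned_rollout - start).days
window_weeks = window_days / 7

# Listed durations in weeks. "1 month" is approximated as 4 weeks
# (assumption), and the 2-month design/infrastructure item is excluded
# because it is assumed to overlap the tasks below.
tasks = {
    "Project Definition": 1,
    "Requirements": 4,
    "Technical Analysis": 0,
    "Data Model": 2,
    "Source to Target Mapping Document": 2,
    "ETL Coding": 4,
    "System and Unit Testing": 2,
    "UAT": 3,
    "Rollout": 1,
}

sequential_weeks = sum(tasks.values())
overlap_needed = sequential_weeks - window_weeks

print(f"Planned window: {window_days} days ({window_weeks:.0f} weeks)")
print(f"Tasks run back to back: {sequential_weeks} weeks")
print(f"Overlap needed to fit the window: {overlap_needed:.0f} weeks")
```

Under these assumptions the listed tasks do not fit the window unless many of them overlap, which is consistent with the need to bring in contractors.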
Project Discrepancies • Support??? • Bug Fixes??? • Enhancement Requests??? • Security Review??? • Issue Resolution??? • Dirty Data??? • Broken and Undefined Business Processes….
Source to Target Mapping Document • 50+ Pages of “instructions” on HOW to code the Data Mart • Constantly changing
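A source-to-target mapping document is, at its core, a table of rules: which source field feeds which target column, and how the value is transformed on the way. The sketch below shows that idea as data rather than prose. All field names, column names, and transformations here are hypothetical and are not taken from the actual mapping document, and the real project implemented its mappings in Informatica, not Python.

```python
from dataclasses import dataclass
from typing import Any, Callable


def identity(value: Any) -> Any:
    """Default transform: pass the value through unchanged."""
    return value


@dataclass(frozen=True)
class MappingRule:
    source_field: str                            # field name in the source system
    target_column: str                           # column name in the data mart
    transform: Callable[[Any], Any] = identity   # how to convert the value
    note: str = ""                               # business rule, in plain words


def apply_mapping(rules, source_row):
    """Build one target row by applying every rule to a source row."""
    target = {}
    for rule in rules:
        target[rule.target_column] = rule.transform(source_row[rule.source_field])
    return target


# Hypothetical rules for a customer dimension.
RULES = [
    MappingRule("cust_id", "CUSTOMER_KEY", int),
    MappingRule("cust_name", "CUSTOMER_NAME", str.strip),
    MappingRule("status", "IS_ACTIVE", lambda s: s.upper() == "A",
                note="'A' means active"),
]

row = {"cust_id": "1042", "cust_name": "  Example Corp ", "status": "a"}
loaded = apply_mapping(RULES, row)
print(loaded)
```

Keeping the rules in one structured place makes a constantly changing document easier to diff and review, since each change touches one rule instead of a paragraph of instructions.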
Project Execution • 3 months for Development and Unit Testing • 2 months (and counting) for User Acceptance Testing • Rolled out to User Community August 6th, 2001 (nearly one month late) • Report Development is ongoing, with a dozen reports published and more coming in each day • Bug queue is manageable • Enhancement requests continue to pile up
Lessons Learned • Allow the majority of the Project Plan to be consumed by Requirements Analysis and QA • Maintain scope at all costs • Never assume the data is correct or clean • Understand that when users describe a “Process”, that “Process” was not always in place • Determine from the beginning how much historical data will be included in the data mart
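The lesson about never assuming the data is clean can be put into practice by profiling source data before any ETL is written. The following is a minimal sketch, assuming the source rows are available as dictionaries; the field names and sample rows are made up for illustration.

```python
from collections import Counter


def profile(rows, key, required):
    """Report basic quality problems: duplicate keys and missing values."""
    key_counts = Counter(r.get(key) for r in rows)
    duplicate_keys = sorted(
        k for k, n in key_counts.items() if k is not None and n > 1
    )
    # A value counts as missing when it is absent, None, or an empty string.
    missing = {
        field: sum(1 for r in rows if r.get(field) in (None, ""))
        for field in required
    }
    return {
        "row_count": len(rows),
        "duplicate_keys": duplicate_keys,
        "missing": missing,
    }


# Hypothetical extract from a billing source.
rows = [
    {"invoice_id": 1, "customer": "A", "amount": "100.00"},
    {"invoice_id": 2, "customer": "", "amount": "250.00"},
    {"invoice_id": 2, "customer": "B", "amount": None},
]

report = profile(rows, key="invoice_id", required=["customer", "amount"])
print(report)
```

Running a check like this against each source early gives the team evidence to bring back to the business owners, instead of discovering dirty data during user acceptance testing.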
Lessons Learned, cont. • Write down the goals of the Data Mart and pin them on the wall – look at them EVERY day • Write down EVERYTHING • Know your team • NEVER use a Data Warehouse to “smoke out broken or undefined Business Processes” • NEVER code for the Exception