Cloud Computing @Yahoo! Dekel Tankel Director, Product Management Yahoo! Cloud Computing dekel@yahoo-inc.com IGT, June 2009
What we’ll cover today… • Why Cloud? • Scale and Abstraction; Quality and Agility • Yahoo!’s unique footprint • Yahoo!’s Cloud Strategy • Overview of the Yahoo! Cloud vision and portfolio • Deep dive on Horizontal & Functional Cloud Services • The Yahoo! Open Strategy • Marrying Yahoo!’s “Open Strategy”, its platforms and ethic with external Cloud services
Why Cloud? Benefits for Yahoo! Higher Agility & Stability while maintaining Scale • Abstraction • Enable developers to focus on their applications, not infrastructure • Accelerating innovation • Adding new features and products at an ever faster rate • Increasing Scale & Availability • More robustly, more globally, more completely, for a given budget Cloud is pushing up the Operational Excellence Curve Agility & Innovation Quality & Stability
Yahoo!’s Unique Cloud: Unprecedented Scale • Massive user base and engagement • 500M+ unique users per month • Hundreds of petabytes of storage • Hundreds of billions of objects • Hundreds of thousands of requests/sec • Global • Tens of globally distributed data centers • Serving each region at low latencies • Challenging Users • Rapidly extracting value from voluminous data • Downtime is not an option (outages cost $millions) • Variable usage patterns
Yahoo! Cloud Services Users Applications Functional Cloud Services Horizontal Cloud Services Physical Layer ROI & Innovation Y!OS, BOSS, YQL, APT, Analytics, … Storage, Batch, Edge Serving,…
Yahoo! Cloud Services: Focus on PaaS offerings Users Applications Functional Cloud Services Horizontal Cloud Services Physical Layer ROI & Innovation SaaS PaaS IaaS
From Infrastructure to Shareholder Benefit • Horizontal Cloud • Focus on open source and collaborative R&D with industry, academia and government • Functional Cloud • Focus on developing "open strategy" frameworks, tools and services for developers (at Yahoo! and beyond) • Combined Together • Leverage our unique scale, assets and data to drive disruptive innovations in the market and expand Yahoo!’s competitive differentiation
Yahoo! Cloud Strategy in Action: The Front Page Case Study • Horizontal Cloud – Storage & Hadoop • Analyze extremely large content data sets • Functional Cloud – Content Optimization • Rate content items based on various parameters • Applications – Yahoo!’s Front Page • Display “high rating” items to the right users • Benefit consumers and advertisers and grow Yahoo!’s revenue
Yahoo! Cloud Strategy in Action: The Inquisitor Case Study • Horizontal Cloud – Hadoop • Analyze large search-index data sets • Functional Cloud – BOSS • Expose the data in a structured, open, flexible and “cloud like” way • Applications – iPhone™ Inquisitor • Leverage BOSS to provide innovative consumer experience • Benefit consumers and grow Yahoo!’s revenue
Horizontal Cloud Services Users Applications Functional Cloud Services Horizontal Cloud Services Physical Layer ROI & Innovation
Horizontal Cloud Services • Optimized for Yahoo!-scale • Yahoo!-internal focus • Data processing and serving environments • Drive faster innovation and agility • Shorter product development cycles • Reduce labor and costs for infrastructure • Multi-year effort • Strategic investment across the company
Horizontal Cloud Services: Conceptual View • Simple APIs • Operational Storage: structured, unstructured • Batch Storage & Processing: Hadoop, Pig • Edge Content Services: caching, proxies • Online Serving: web, data • Security and Authentication • Metering, Billing • Monitoring & QoS • ID & Account Management • Provisioning & Virtualization (Xen) • Shared Infrastructure • Common Approaches to QA, Production Engineering, Performance Engineering, Datacenter Management, and Optimization
Horizontal Cloud Services: Use Cases • Search Index • Content Optimization • Machine Learning (e.g. spam filters) • Ads Optimization • Attachment Storage • Image/Video Storage & Delivery
Yahoo! Distribution of Hadoop • Hadoop in a nutshell • Open source distributed file system & parallel execution environment to process massive amounts of data • Started in 2005, became top-level Apache project in 2008 • Simple design for horizontal scaling on commodity hardware • Yahoo! Distribution of Hadoop • Source distribution of Yahoo!’s implementation of Hadoop (based entirely on code found in Apache Hadoop) • Tested and deployed at Yahoo!’s massive scale • Benefit the larger ecosystem, increase pace of innovation • http://developer.yahoo.com/hadoop
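The "parallel execution environment" on this slide is the map-reduce model. The following is a minimal in-memory sketch of that model in plain Python, not Hadoop's actual API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. In a real cluster the map and reduce calls would run on many machines and the shuffle would move data across the network.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every input record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group all values that share a key (done by the framework between map and reduce)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Illustrative mapper: emit (word, 1) for each word in a line of text."""
    for word in line.lower().split():
        yield word, 1


def sum_reducer(key, values):
    """Illustrative reducer: total the counts emitted for one word."""
    return sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    return reduce_phase(shuffle(map_phase(lines, word_mapper)), sum_reducer)


if __name__ == "__main__":
    print(word_count(["Hadoop scales out", "Hadoop runs on commodity hardware"]))
```

Because each mapper call sees only one record and each reducer call sees only one key, the work can be split across commodity machines without the application code changing, which is the horizontal-scaling property the slide refers to.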
Yahoo! runs the largest Hadoop Clusters in the World • 25,000+ nodes • Clusters of up to 4,000 nodes • 4 Tiers of clusters • Development & Testing, POCs, Science & Research, Production • Terasort Benchmarks • 62 seconds to sort One Terabyte (run on 1,500 nodes) • 16.25 hours to sort One Petabyte (run on 3,700 nodes) • Webmap application • ~490 TB shuffling • ~280 TB output
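To put the two Terasort figures on a common scale, the arithmetic below converts them to aggregate and per-node throughput. It is a back-of-the-envelope sketch that assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and treats every node as contributing equally; neither assumption is stated on the slide, and `sort_throughput` is a hypothetical helper.

```python
TB = 10**12  # assumption: decimal terabyte
PB = 10**15  # assumption: decimal petabyte


def sort_throughput(bytes_sorted, seconds, nodes):
    """Return (aggregate bytes/sec, bytes/sec per node) for a sort run."""
    aggregate = bytes_sorted / seconds
    return aggregate, aggregate / nodes


# Figures taken from the slide: 62 s on 1,500 nodes; 16.25 h on 3,700 nodes.
tb_total, tb_per_node = sort_throughput(TB, 62, 1500)
pb_total, pb_per_node = sort_throughput(PB, 16.25 * 3600, 3700)

if __name__ == "__main__":
    print(f"1 TB run: {tb_total / 1e9:.1f} GB/s total, {tb_per_node / 1e6:.1f} MB/s per node")
    print(f"1 PB run: {pb_total / 1e9:.1f} GB/s total, {pb_per_node / 1e6:.1f} MB/s per node")
```

Under these assumptions both runs sustain roughly 16 to 17 GB/s in aggregate, while the per-node rate in the petabyte run is less than half that of the terabyte run.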
Case Study – Search Assist™ • Database for Search Assist™ is built using Hadoop • 3 years of log data, 20 steps of map-reduce • Leverage Hadoop’s scalability, load balancing and resiliency • Simplified access, flexibility for rapid innovation (from C++ to Python)
Functional Cloud Services Users Applications Functional Cloud Services Horizontal Cloud Services Physical Layer ROI & Innovation
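The slide does not describe the 20 map-reduce steps, so the following is only a hypothetical two-step illustration of how chained steps can turn raw query logs into a suggestion table. The log format (one query per line), the function names, and the prefix-based ranking are all assumptions made for the sketch, not the Search Assist™ design.

```python
from collections import Counter, defaultdict


def step1_count_queries(log_lines):
    """Step 1: map each log line to a normalized query, reduce by counting."""
    counts = Counter()
    for line in log_lines:
        query = line.strip().lower()
        if query:
            counts[query] += 1
    return counts


def step2_top_by_prefix(query_counts, prefix_len=2, k=3):
    """Step 2: map each query to its prefix, reduce to the k most frequent per prefix."""
    by_prefix = defaultdict(list)
    for query, count in query_counts.items():
        if len(query) >= prefix_len:
            by_prefix[query[:prefix_len]].append((count, query))
    return {
        # Sort by descending count, breaking ties alphabetically.
        prefix: [q for _, q in sorted(items, key=lambda cq: (-cq[0], cq[1]))[:k]]
        for prefix, items in by_prefix.items()
    }


def build_suggestions(log_lines, prefix_len=2, k=3):
    """Chain the two steps: the output of step 1 is the input of step 2."""
    return step2_top_by_prefix(step1_count_queries(log_lines), prefix_len, k)


if __name__ == "__main__":
    logs = ["hadoop", "hadoop pig", "hadoop", "hat", "yql"]
    print(build_suggestions(logs))
```

Each step consumes the previous step's output, which is the pattern a multi-step pipeline over years of log data would repeat at much larger scale.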
Functional Cloud Services • Provides functional capabilities for applications • Helps developers build integrated web experiences faster and more easily • Provides a common set of functional “building blocks” • “Powered by” the horizontal cloud services • Abstracts infrastructure services from the Application • E.g. Storage, Compute, Serving, Robustness and Scalability • Self-Served, Global, Managed, Elastic and Metered
Functional Cloud Services: YQL & BOSS • Yahoo! Query Language (YQL) • A single endpoint service that enables developers to query, filter and combine data across Yahoo! and beyond • http://developer.yahoo.com/yql/console/ • Build your Own Search Service (BOSS) • Providing Yahoo! Search infrastructure and technology to developers and companies to help them build their own search experiences • http://developer.yahoo.com/search/boss/
Build your Own Search Service (BOSS) • Yahoo!'s open search web services platform • Serving hundreds of millions of users across the Web. • Goal: foster innovation in the search industry • Build and launch web-scale search products that utilize the entire Yahoo! Search index. • Access to Yahoo!'s investments in crawling and indexing, ranking and relevancy algorithms
Yahoo! Query Language (YQL) • Single endpoint service to query, filter and combine data across Yahoo! and beyond • The “Internet API” • SQL-like SELECT syntax for getting the right data • Quickly discover available data sources and structure • Combined data from a single web browser • Easy-to-use Console • http://developer.yahoo.com/yql/console/
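To make the "SQL-like SELECT syntax" and "single endpoint" points concrete, the sketch below builds a request URL for one YQL statement. The endpoint URL, the `search.web` table, and the `title` and `url` fields are assumptions for illustration and do not appear on the slide; `build_yql_url` is a hypothetical helper. The snippet only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: public YQL endpoint; not given on the slide.
YQL_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def build_yql_url(statement, fmt="json"):
    """Encode a YQL statement as the q parameter of a single-endpoint request."""
    return YQL_ENDPOINT + "?" + urlencode({"q": statement, "format": fmt})


# Illustrative statement; the table and field names are assumptions.
statement = 'select title, url from search.web where query = "hadoop"'
url = build_yql_url(statement)

if __name__ == "__main__":
    print(url)
    # Round-trip check: the statement survives URL encoding unchanged.
    print(parse_qs(urlsplit(url).query)["q"][0] == statement)
```

The design point is that every data source is addressed through the same URL, and only the statement in the `q` parameter changes.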
Y!OS and Cloud Strategy Cloud Services
Open Collaborations around the globe • M45 - Yahoo!’s supercomputing cluster • 4,000 cores, 3 TB RAM, 1.5 PB disks, 27 teraflops! • Operational since November 2007, 4 major Universities • Focus on highly parallel computing • Open Cirrus™ with HP & Intel • A global, multi-data center, open source test bed • Target to advance cloud computing research & education • Simulates a real-life, Internet-scale environment • 9 Global sites, more than 50 research projects
Questions? Dekel Tankel Director, Product Management Yahoo! Cloud Computing dekel@yahoo-inc.com