Web 2.0 & Google. November 3, 2005 Jaesun Han (jshan@nclab.kaist.ac.kr) NCLAB, Dept. of EECS, KAIST. Contents. Web 2.0 What is Web 2.0? Seven Principles of Web 2.0 Google The Past and Current of Google Two Axes of Google Tech Googleplex Virtual Application Google and Competitors.
Contents • Web 2.0 • What is Web 2.0? • Seven Principles of Web 2.0 • Google • The Past and Current of Google • Two Axes of Google Tech • Googleplex • Virtual Application • Google and Competitors
Seven Principles of Web 2.0 • 1. The Web as Platform • 2. Harnessing Collective Intelligence • 3. Data is the Next Intel Inside • 4. End of the Software Release Cycle • 5. Lightweight Programming Models • 6. Software Above the Level of a Single Device • 7. Rich User Experiences
1. The Web as Platform • Netscape vs. Google • Software licensing and control over APIs vs. control over data • The value of the software is proportional to the scale and dynamism of the data it helps to manage. • DoubleClick vs. Google AdSense • The long tail: the collective power of the small sites that make up the bulk of the web’s content. • Leverage customer self-service and algorithmic data management to reach out to the entire web, to the edges and not just the center, to the long tail and not just the head. • Akamai vs. BitTorrent • The service automatically gets better the more people use it.
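A rough way to see the long-tail point on this slide: the sketch below assumes, purely for illustration, that traffic across sites follows a Zipf-like 1/rank curve. The distribution, the 100,000 hypothetical sites, and the head size of 100 are assumptions and do not come from the slides.

```python
def traffic_shares(num_sites: int, head_size: int) -> tuple[float, float]:
    """Return (head_share, tail_share) of total traffic under a 1/rank model.

    The 1/rank weighting is an illustrative assumption, not measured data.
    """
    weights = [1.0 / rank for rank in range(1, num_sites + 1)]
    total = sum(weights)
    head = sum(weights[:head_size])  # the few biggest sites
    return head / total, (total - head) / total


# Hypothetical web of 100,000 sites; the "head" is the top 100.
head_share, tail_share = traffic_shares(num_sites=100_000, head_size=100)
print(f"head: {head_share:.1%}, tail: {tail_share:.1%}")
```

Under these assumptions the many small sites together outweigh the top 100, which is why a self-service, algorithmic model that reaches the edges can matter more than one that courts only the largest sites.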
2. Harnessing Collective Intelligence • The Architecture of Participation • Users add value • Set inclusive defaults for aggregating user data and building value as a side effect of ordinary use of the application • Network effects from user contributions are the key to market dominance in the Web 2.0 era • Blogging and the Wisdom of Crowds • RSS, Trackback, Web Syndication, News Aggregators • Examples
3. Data is the Next Intel Inside • Data is indeed the Intel Inside of famous services • Google’s web crawl, Yahoo!’s directory, Amazon’s DB of products, MapQuest’s map DB, Napster’s distributed song DB … • Extending original data for real competency • Initial map DBs (MapQuest, NavTeq) just own their original data • Amazon enhances its original book DB from the ISBN registry • In the future • Battles between data suppliers and application vendors • The race is on to own certain classes of core data • Location, identity (PayPal, Amazon’s 1-click, Sxip), calendaring of public events (EVDB), product identifiers and namespaces • User concerns about privacy and owners’ rights to data • The rise of proprietary DBs will result in a Free Data movement • Wikipedia, the Creative Commons
4. End of the Software Release Cycle • Perpetual Beta • Like the open source dictum, “release early, release often” • Gmail, Google Maps, Flickr, del.icio.us, etc. have carried a “Beta” logo for years • Real-time monitoring of user behavior • Microsoft’s business model depends on everyone upgrading their computing environment every two to three years, while Google’s depends on everyone exploring what’s new in their computing environment every day • Operations must become a core competency • The software will cease to perform unless it is maintained on a daily basis • Google’s system administration, networking, and load balancing are even more closely guarded secrets than their search algorithms • Scripting languages such as Perl, Python, PHP, and now Ruby play a large role at Web 2.0 companies
5. Lightweight Programming Models • Simple pragmatism is substituted for ideal design • Amazon’s web services • REST (XML data over HTTP) (95% usage) > SOAP web services • Mapping-related web services • Google Maps (AJAX interface) > MapQuest, MS MapPoint, ESRI • Innovation in Assembly • Reuse existing services and data for creating value • Housingmaps.com (Interactive housing search) = Google Maps + Craigslist • Several significant lessons • Support lightweight programming models that allow for loosely coupled systems • Think syndication, not coordination • e.g., RSS and REST-based web services • Design for “hackability” and remixability • e.g., browser’s “View Source”, RSS, AJAX • “some rights reserved”
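To make "think syndication, not coordination" concrete, here is a minimal sketch of how little code a consumer needs to read an RSS 2.0 feed. The feed content, its URLs, and the helper name `parse_rss_items` are hypothetical; the feed is inlined so the sketch runs without network access, and only the Python standard library is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed, inlined so the sketch runs without network access.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Listings</title>
    <item>
      <title>Studio near campus</title>
      <link>http://example.com/listings/1</link>
    </item>
    <item>
      <title>Two-bedroom downtown</title>
      <link>http://example.com/listings/2</link>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text: str) -> list[dict[str, str]]:
    """Extract the title and link of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        })
    return items


for entry in parse_rss_items(SAMPLE_FEED):
    print(entry["title"], "->", entry["link"])
```

The publisher and the consumer never coordinate: the consumer only needs the feed format, which is what makes remixes in the style of Housingmaps.com cheap to build.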
6. Software Above the Level of a Single Device • Web 2.0 is no longer limited to the PC platform • iTunes • Seamlessly reach from the handheld device to a massive web back-end, with the PC acting as a local cache and control station • iTunes and TiVo also show the other core principles of Web 2.0 • Data management is the heart of their offering • They are services, not packaged applications • They show some budding use of collective intelligence • In the Future, we will • See many new services spanning multiple heterogeneous devices • Real time traffic monitoring with cars’ reporting data • Flash mobs and citizen journalism with phones’ reporting data
7. Rich User Experiences • Rich user interfaces with PC-equivalent interactivity • AJAX (Asynchronous JavaScript and XML) • The collection of technologies • standards-based presentation using XHTML and CSS; • dynamic display and interaction using the Document Object Model; • data interchange and manipulation using XML and XSLT; • asynchronous data retrieval using XMLHttpRequest; • and JavaScript binding everything together • Gmail, Google Maps, Orkut, Google Suggest, Flickr, Naver Suggest • In the Future, We will • See rich web reimplementations of PC applications • Integrated communications client combining email, IM, VoIP etc • Web 2.0-style address book (armed with social networking) • Web 2.0 word processor (with wiki-style collaborative editing) • Web 2.0 enterprise apps (like Salesforce.com providing CRM online) • The key is synergetic combination of rich interfaces and shared data
“The Web as Platform” Revisited • The meaning of “Platform” • Platform as the base on which services are developed and deployed • Platform as the playground in which users talk with one another • Platform as the point into which various devices are plugged • Platform battle • Previously, the clash was between a platform and an application • Lotus 1-2-3 vs. Excel, WordPerfect vs. Word, Netscape Navigator vs. Internet Explorer • Now, the battle is between two platforms • Windows platform: a massive installed base and a tightly integrated operating system and APIs give control over the programming paradigm • Web 2.0 platform: a system without an owner, tied together by a set of protocols, open standards and agreements for cooperation • Communication-oriented systems require interoperability • Unless a vendor can control both ends of every interaction, the possibilities of user lock-in via software APIs are limited
http://www.google.com/logos.html http://www.google.org.cn/all.php
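The Past of Google • 1996–1997: Sergey and Larry build BackRub, the precursor of the Google search engine. • First half of 1998: Larry’s dorm room becomes Google’s data center and Sergey’s dorm room becomes the office • Second half of 1998: Google is founded with investment from family, friends, and angels; 4 initial employees, in a Menlo Park garage; original name googol • First half of 1999: able to handle 500,000 queries per day; various investments • First half of 2000: able to handle 18 million queries per day; becomes the largest search engine by indexing 1 billion web pages • First half of 2002: opens Google Labs to develop new technologies • First half of 2003: unveils AdSense, a new advertising system, sending a megaton-class shock wave through the online advertising market • Second half of 2004: IPO on NASDAQ under the ticker GOOG at $85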
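The Current of Google • Grown into a giant company with a market capitalization of over $80 billion • Share price above $300 • Revenue of $2.64 billion in the first half of 2005 (97% growth over the previous year) • 99% of revenue from search advertising (53% from its own sites, 47% from network sites) • Google facts • Indexes about 8 billion web pages and 2 billion images • As of 2004, 2,000 PCs per cluster and 30 clusters in total; since the index size doubled in 2005, the number of clusters is expected to have doubled as well • Appealing Google software principle: Don’t be evil • Support for a variety of APIs (http://code.google.com/apis.html) • AdWords, Blogger, Desktop Search, Deskbar, Froogle, Gmail, Google Groups, Google Earth, Google Maps, News, Google Talk, Google Video, Web Search • Strong open source support policy (http://code.google.com/projects.html) • Continuous technology development (http://labs.google.com/)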
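The cluster estimate on this slide (2,000 PCs per cluster, 30 clusters in 2004, cluster count expected to double in 2005) is simple arithmetic, sketched below. The figures are the slide's own; the doubling is the slide's expectation, not a measured value, and the function name is hypothetical.

```python
PCS_PER_CLUSTER = 2_000   # slide figure, as of 2004
CLUSTERS_2004 = 30        # slide figure, as of 2004
GROWTH_FACTOR = 2         # slide's expectation: clusters double with the index


def estimated_pcs(clusters: int, pcs_per_cluster: int = PCS_PER_CLUSTER) -> int:
    """Estimate the total number of PCs for a given number of clusters."""
    return clusters * pcs_per_cluster


pcs_2004 = estimated_pcs(CLUSTERS_2004)
pcs_2005 = estimated_pcs(CLUSTERS_2004 * GROWTH_FACTOR)
print(f"2004: {pcs_2004:,} PCs; 2005 (estimated): {pcs_2005:,} PCs")
```

The 2005 estimate of 120,000 PCs falls inside the range of 100,000 to 165,000 or more servers quoted on the Googleplex slide.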
Two Axes of Google • [Diagram labels: Internet, Googleplex, Virtual Applications]
Googleplex • Massively distributed, highly parallelized computing • a. Google Linux • b. Distributed & Automated Data Center • c. Logical Architecture • d. Web-centric Architecture • From 100,000 to 165,000 or more servers • 40 or more pizza box servers per rack
Googleplex Principles • Cheap Hardware and Smart Software • Use cheap commodity hardware and accept its frequent failures • Develop smart software to reduce the cost of failure • Easy Management • High scalability through automatic discovery of new servers and racks • High redundancy against failure of servers, racks, even data centers • Speed and Then More Speed • High speed at low cost (580MB/s read rate at $1,000 vs. 58MB/s at $18,000 for IBM EXP) • Rapid development and deployment of new products • Use existing technologies • Use techniques from the leading edge of computer science • Use open source code as a starting point
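The "high speed with low cost" comparison can be reduced to dollars per MB/s of read rate. The sketch below uses only the two price and throughput pairs quoted on the slide; the function and variable names are hypothetical, and the calculation ignores everything else that goes into storage cost (capacity, power, administration).

```python
def cost_per_mb_s(price_usd: float, read_rate_mb_s: float) -> float:
    """Dollars paid per MB/s of sustained read rate."""
    return price_usd / read_rate_mb_s


commodity = cost_per_mb_s(price_usd=1_000, read_rate_mb_s=580)   # slide figure
ibm_exp = cost_per_mb_s(price_usd=18_000, read_rate_mb_s=58)     # slide figure
advantage = ibm_exp / commodity

print(f"commodity: ${commodity:.2f} per MB/s")
print(f"IBM EXP:   ${ibm_exp:.2f} per MB/s")
print(f"cost advantage: about {advantage:.0f}x")
```

On the slide's numbers the commodity setup is 10 times faster at 1/18 of the price, so it costs roughly 180 times less per unit of read throughput.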
Virtual Application • The data and some of the application run on servers • A kernel of software runs on the user’s computer • “No network, no application” is the rule • [Diagram labels: Internet, Google Maps, Googleplex]
Benefits of Virtual Application • The Benefits of Virtual Application • Eliminating or reducing the software installation process • Having “live data” in the application from a network source • Users no longer have to upgrade software • Allowing an organization to replace expensive desktop PCs with less expensive, low maintenance terminals • Virtual applications are the Future • MS’ .NET 2.x and higher framework is a proprietary implementation for virtual applications • IBM’s WebSphere supports virtual applications • Yahoo offers a number of virtual applications • Google is a virtual application company
“Two Axes of Google” Revisited • [Diagram labels: Internet, Googleplex, Virtual Applications, Web as Platform]
Google and Competitors • Yahoo! • Yahoo! has grown through acquisitions • 3721.com for Chinese language search • Inktomi to provide Web search • Stata Labs for Yahoo! Mail search • AllTheWeb.com, Overture, AltaVista, etc. • Balkan-states problem • Mosaic of operating systems, hardware and software • High management resources to keep the peace • A lack of data cohesiveness limits Yahoo!’s ability to know its customers • Neither a technology nor an information company. It is a media company.
Google and Competitors • Microsoft • The cost burden to support legacy applications • Windows 98 and 2000: more than 50% of organizational OS • For high performance, MS upgrades hardware instead of recoding the operating system itself • But there is Windows Live
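The Future of Google • Why might Google fail? • A growing number of lawsuits (MS, Click Defense, Affinity Engine) • A revenue model concentrated on search advertising (accounting for 99%) • The company’s growing size and pushback from competitors • Google’s steps toward the future • On August 18, sold 4.8% of its total shares, securing $4 billion in cash • Strengthening competitiveness in the advertising market • Developing a variety of advertising product options • Developing technology to add Google ads to blog RSS feeds • Investing in infrastructure for media delivery (the Google Net plan) • Providing content in combination with services such as Google Video • Entering the instant messenger and Internet telephony businesses • Speculation: a possible move into the desktop? (a Google desktop OS?)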
References • Tim O’Reilly, “What is Web 2.0” • Stephen E. Arnold, “The Google Legacy” • Microsoftware (마이크로소프트웨어), October 2005 issue, “All About Google” • 태우’s log (Taewoo’s log) – web 2.0 and beyond • Channy’s Web 2.0 Blog • Web 2.0 Conference
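Google SIG (Special Interest Group) • Goals: a group that surveys and analyzes Web 2.0, with Google as the focus, in order to understand its nature and predict the direction of its future evolution • Secure a leading position on the future web from both the academic and the business perspective • Multifaceted analysis and prediction by people from diverse interests and majors • Derive new ideas by connecting the topic with each member’s own field • Lay the groundwork for lasting relationships with people interested in the web • Operating principles • Discussion and brainstorming are central; the level should be understandable to undergraduates. • Keep the acquisition of knowledge about technologies and services to a minimum and focus on actual use. • Achieve the goal in a short period (4 weeks from the second week of November) and discuss the subsequent direction later. • In principle, hold one official meeting and one informal meeting (over a meal) per week. • Make maximum use of online discussion through Web 2.0 technologies such as blogs and wikis. • Encourage equal participation from all members while interfering with personal lives as little as possible.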