Cloud Computing, Class 7. Miguel Saez @masaez. Matias Woloski @woloski. Johnny Halife @johnnyhalife. Based on a slide deck by Steve Huffman presented in May 2010. Lessons Learned at Reddit. Site: Reddit.com
Cloud Computing, Class 7
Miguel Saez @masaez
Matias Woloski @woloski
Johnny Halife @johnnyhalife
Based on a slide deck by Steve Huffman presented in May 2010
Lessons Learned at Reddit
• Site: Reddit.com
• Goal: understand what it takes to build a web application that receives 270 million page views per month
• http://vimeo.com/10506751
• Key points
  • Open schema
  • Asynchronous processing
  • Stateless
  • Caching
A brief history of reddit
• Founded in June 2005
• Acquired by Condé Nast in October 2007
• 7.5 million users / month
• 270 million page views / month
• Many mistakes along the way
Lesson 1: Crash! …and restart.
• Daemontools (supervise)
• Single greatest improvement to uptime we ever made.
• When in doubt, let it die.
• Don’t forget to read the logs!
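The crash-and-restart idea can be sketched in a few lines of Python. This is not daemontools itself, only an illustration of what a supervisor does. The `supervise` function and its `max_runs` and `delay` parameters are hypothetical; the real `supervise` restarts a service indefinitely, while this sketch stops after a fixed number of runs so that it terminates.

```python
import subprocess
import sys
import time


def supervise(cmd, max_runs=3, delay=0.0):
    """Run cmd, starting it again each time it exits; return the exit codes.

    Hypothetical helper: a real supervisor loops forever, this one stops
    after max_runs so the example finishes.
    """
    exit_codes = []
    for _ in range(max_runs):
        proc = subprocess.run(cmd)           # blocks until the child exits
        exit_codes.append(proc.returncode)   # record it, so there is a log to read
        time.sleep(delay)                    # optional pause before restarting
    return exit_codes


if __name__ == "__main__":
    # A child process that crashes immediately with exit status 1.
    crashing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(supervise(crashing))
```

The service itself needs no recovery logic: it may simply die, and the supervisor brings it back.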
Lesson 2: Separation of services
• Often, going from one to two machines more than doubles performance.
• Group similar processes together.
• Group similar types of data together.
• Better caching.
• Less contention for CPU.
• Avoid threads. Processes are easier to separate later.
Lesson 3: Open Schema
In the early days:
• Too much time spent thinking about the database.
• Every feature required a schema update.
• Schema updates became more painful as we grew.
• Maintaining replication was difficult.
• Deployment was complex.
Lesson 3: Open Schema
[Diagram: two tables, Thing and Data]
Lesson 3: Open Schema
With an open schema:
• Faster development
• Easier deployment
• Maintainable database replication
• No joins = easy to distribute
• Must be careful to maintain consistency
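A minimal sketch of an open schema, using Python's built-in `sqlite3` so it runs anywhere. The slides only name the two tables, Thing and Data, so the column names and the helpers `create_thing` and `load_thing` are assumptions made for illustration. Each attribute is stored as a key/value row, so a new feature adds rows rather than columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column layout is an assumption; the slides only name the two tables.
    CREATE TABLE thing (id INTEGER PRIMARY KEY, kind TEXT,
                        ups INTEGER, downs INTEGER);
    CREATE TABLE data  (thing_id INTEGER, key TEXT, value TEXT,
                        PRIMARY KEY (thing_id, key));
""")


def create_thing(kind, **attrs):
    """Insert one thing row plus one data row per attribute."""
    cur = conn.execute(
        "INSERT INTO thing (kind, ups, downs) VALUES (?, 0, 0)", (kind,))
    thing_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO data (thing_id, key, value) VALUES (?, ?, ?)",
        [(thing_id, k, str(v)) for k, v in attrs.items()])
    return thing_id


def load_thing(thing_id):
    """Fetch the attributes with a separate query instead of a join."""
    rows = conn.execute(
        "SELECT key, value FROM data WHERE thing_id = ?", (thing_id,))
    return dict(rows.fetchall())
```

Calling `create_thing("link", title="hello", url="http://example.com")` and later `create_thing("comment", body="hi")` needs no schema migration in between. The trade-off named on the slide applies: nothing in the database stops a typo in a key, so consistency has to be enforced in application code.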
Lesson 4: Keep it stateless
• Goal: any app server can handle any request
• App server failure/restart is no big deal
• Scaling is straightforward
• Caching must be independent of any specific app server.
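One way to picture statelessness: the app server object keeps nothing about the user, and all per-session data lives in a store that every server can reach. In this sketch `SharedCache` is an in-process stand-in for a shared cache such as memcached, and `AppServer` is a hypothetical class; neither comes from the original slides.

```python
class SharedCache:
    """In-process stand-in for a cache shared by all app servers (assumption)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value


class AppServer:
    """Holds no per-user state; everything lives in the shared cache."""

    def __init__(self, name, cache):
        self.name = name
        self.cache = cache

    def handle(self, session_id):
        """Count visits for a session, whichever server takes the request."""
        key = "visits:" + session_id
        visits = (self.cache.get(key) or 0) + 1
        self.cache.set(key, visits)  # a real cache would use an atomic incr
        return visits
```

With `a = AppServer("a", cache)` and `b = AppServer("b", cache)` sharing one `cache`, a session can hit `a`, then `b`, then a freshly started replacement server, and the count keeps going up. Losing a server loses nothing.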
Lesson 5: Memcache everything
• Database data
• Session data
• Rendered pages
• Memoizing internal functions
• Rate-limiting (user actions, crawlers)
• Storing precomputed listings/pages
• Global locking
• Memcachedb for persistence
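Two of the uses listed above, memoizing internal functions and rate-limiting, can be sketched as follows. To stay runnable without a server, the example uses `FakeMemcache`, an in-memory stand-in that imitates the memcached commands `get`, `set`, `add` and `incr`; it is not a real client library. The `memoize` decorator and the `allow` function are hypothetical names for illustration.

```python
import functools
import time


class FakeMemcache:
    """In-memory stand-in with memcached-style get/set/add/incr (assumption)."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry timestamp or None)

    def get(self, key):
        value, expires = self._data.get(key, (None, None))
        if expires is not None and time.time() >= expires:
            del self._data[key]  # expired entries behave as missing
            return None
        return value

    def set(self, key, value, ttl=None):
        self._data[key] = (value, time.time() + ttl if ttl else None)

    def add(self, key, value, ttl=None):
        """Store only if the key is absent; return True when stored."""
        if self.get(key) is not None:
            return False
        self.set(key, value, ttl)
        return True

    def incr(self, key):
        value, expires = self._data[key]
        self._data[key] = (value + 1, expires)
        return value + 1


cache = FakeMemcache()


def memoize(ttl):
    """Cache a function's result under a key built from its name and args."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args):
            key = "memo:%s:%r" % (fn.__name__, args)
            hit = cache.get(key)
            if hit is None:
                hit = fn(*args)
                cache.set(key, hit, ttl)
            return hit
        return wrapper
    return decorator


def allow(user, action, limit, window):
    """Rate limit: at most `limit` actions per `window` seconds."""
    key = "rate:%s:%s" % (user, action)
    if cache.add(key, 1, window):   # first action in this window
        return True
    return cache.incr(key) <= limit
```

Because the counter expires with the window, no cleanup job is needed. One caveat of this sketch: a memoized function that legitimately returns `None` is recomputed on every call.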
Lesson 6: Store redundant data
• Recipe for slow: keep data normalized until you need it.
• If data has multiple presentations, store it multiple times in multiple formats.
• Disk and memory are less costly than making your users wait.
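As an illustration of storing one piece of data in several formats, the sketch below writes each link once as a normalized record and once more into every listing it can appear in. The names `links`, `listings`, `add_link` and `front_page` are hypothetical, and the in-memory lists stand in for whatever store holds the listings.

```python
import bisect

links = {}                          # normalized copy: id -> record
listings = {"top": [], "new": []}   # redundant copies, one per presentation


def add_link(link_id, score, created):
    """Write the record once, then once more into each sorted listing."""
    links[link_id] = {"score": score, "created": created}
    # Keys are negated so that ascending insertion yields descending order.
    bisect.insort(listings["top"], (-score, link_id))
    bisect.insort(listings["new"], (-created, link_id))


def front_page(sort, n=25):
    """Read path: a slice of a presorted list, no sorting per request."""
    return [link_id for _, link_id in listings[sort][:n]]
```

Writes cost a little more and the data takes more space, but every page view becomes a cheap slice.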
Lesson 7: Work offline
• Do the minimum amount of work to end the request.
• Everything else can be done offline.
• An architecture of queues is simple and easy to scale.
• AMQP/RabbitMQ.
Lesson 7: Work offline
• Precomputing listings
• Fetching thumbnails
• Detecting cheating
• Removing spam
• Computing awards
• Updating the “search” index
Lesson 7: Work offline
[Architecture diagram; components shown: Request, App Servers, Cache, Master Databases, Queue, Precomputer, Thumbnailer, Spam, Worker Databases]
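The request/worker split can be sketched with Python's standard library. Here `queue.Queue` and a background thread stand in for an AMQP broker such as RabbitMQ and a separate worker process; that substitution is an assumption made so the example runs on its own. The names `handle_submit`, `worker`, `thumbnails` and `demo` are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()   # stand-in for an AMQP queue (assumption)
thumbnails = {}        # results written by the worker


def handle_submit(link_id, url):
    """Request path: enqueue the slow work and return immediately."""
    jobs.put(("thumbnail", link_id, url))
    return "accepted"


def worker():
    """Offline path: consume jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        kind, link_id, url = job
        if kind == "thumbnail":
            thumbnails[link_id] = "thumb-of-" + url  # pretend to fetch it
        jobs.task_done()


def demo():
    """Submit one link, let the worker drain the queue, then stop it."""
    t = threading.Thread(target=worker)
    t.start()
    status = handle_submit(1, "http://example.com/a.png")
    jobs.join()        # demo only: wait for the worker to catch up
    jobs.put(None)     # tell the worker to stop
    t.join()
    return status, thumbnails[1]
```

The request handler never waits for the thumbnail. Scaling the offline side means starting more workers on the same queue, which matches the queue-centred architecture described in this lesson.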
Assignment
• Reimplement the ranking functionality of “el Prode” using what you learned from watching this presentation