Gary Flake, Founder gary@clipboard.com Building clipboard.com Architecture, Practices, and Lessons
Outline • Introduction • Architecture & Practices • Lessons • Q&A Introduction
Demo Introduction
Backstory • Founded by me (Gary Flake) • Took ~1.4M angel investment in April 2011 • 6+ full-time employees (almost all dev): • Mark Dawson, Greg Pascale, Ken Perkins, Tommy Montgomery, Steve Courtney • Investors include AH, Index, FRC, SVA, FCO, Betaworks, Crunchfund, … individual angels • 12+ month runway left • Looking to hire one more engineer
Scenarios (Near Term → Long Term) • Individual saving • Micro-blogging • Curating • Collaboration • Shared curating • Link aggregation • Application and service backup • Personal data visualization • Web search • Advertising • Clip platform Introduction
Overlapping Clip Spaces Introduction
Overlapping Clip Spaces • Your private stuff • Your public stuff • Your stuff, selectively shared • Other people’s public stuff • Other’s stuff, selectively shared with you • Your public stuff, explicitly shared • Other’s public stuff, explicitly shared with you Introduction
Why Clipboard? • Fidelity and functionality preserved • Heterogeneous objects • Simple overlapping spaces • Shareable in several ways: • 1→1 @mention, email, permalinks • 1→N @mentions, Facebook • 1→∞ publish, Twitter, embed • Tagging and search Introduction
Outline • Introduction • Architecture & Practices • Lessons • Q&A Architecture & Practices
Architectural Goals • Development Efficiency – Development speed and cost are critical for startups. • Scalability – We want to support millions of users without rewriting our whole backend. • Simplicity – Little, clear code. Few moving parts. Painless operations. • Together, these three also help with our other goals. Architecture & Practices
Architecture (diagram) • Web: web-01, web-02, web-03 (each Node.js + Nginx) • Riak: riak-01, riak-02, riak-03, riak-04, riak-05 • Cache: cache-01, cache-02, cache-03 • Redis: redis-01, redis-02 • Other: admin-01, job-01, thumb-01, thumb-02 Architecture & Practices
Other Infrastructure Parts • Rackspace API for spinning up/down VMs • AWS for thumbnail storage and CDN • A few 3rd party components: • Mixpanel and Google for analytics • Sendgrid for email • Papertrail for log aggregation • Scout for monitoring
Client – Single Page App • All clip views use same html page • Dependencies on jQuery and a few plugins • No fancy frameworks (sort of MVVMC) • Express, EJS, & Less on backend help • Almost no server-side composition • Backend code is essentially an API Architecture & Practices
Nginx • Lost faith in Apache long ago • Nginx is wicked fast • Handles static content (obviously) • Can act as a micro-cache for static and dynamic content (FTW!) Architecture & Practices
App Logic – Node.js • You’ve heard the arguments, but for us… • We like JavaScript • 1 dev can develop features end-to-end • JavaScript + JSON ≈ Buttah! • Easy to make stateless → easy to scale out • Well-suited for Riak Architecture & Practices
Redis • Lightning fast in-memory key-ADT store: • Atomic operations for mutations, so no locks and no write contention • Excellent complement to Riak • Uses: top lists, session tokens, notifications, batch queue, invite tokens, promises, mutex Architecture & Practices
Memcached • Simple cache invalidation for K/V reads. • We make no attempt to do proper cache invalidation on search cache. • Instead, we embrace eventual consistency as a way of life. • Translation: objects have type-specific TTLs that range from seconds to a few minutes. Architecture & Practices
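The type-specific TTL idea can be sketched in a few lines. This is a minimal in-memory illustration, not Clipboard's code and not a Memcached client: the type names, the TTL values, and the `TtlCache` class are all assumptions made for the example.

```javascript
// Hypothetical per-type TTLs in milliseconds. The talk only says they range
// from seconds to a few minutes; these numbers are made up.
const TTL_MS = { searchResults: 5000, clip: 60000, user: 180000 };

class TtlCache {
  // `now` is injectable so expiry can be exercised without real waiting.
  constructor(ttlByType, now = Date.now) {
    this.ttlByType = ttlByType;
    this.now = now;
    this.entries = new Map();
  }

  // Stores a value; its lifetime is decided by its type, not by the caller.
  set(type, key, value) {
    const ttl = this.ttlByType[type];
    if (ttl === undefined) throw new Error(`no TTL configured for type: ${type}`);
    this.entries.set(`${type}:${key}`, { value, expiresAt: this.now() + ttl });
  }

  // Returns undefined on a miss or once the entry has outlived its TTL.
  get(type, key) {
    const id = `${type}:${key}`;
    const entry = this.entries.get(id);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(id);
      return undefined;
    }
    return entry.value;
  }
}
```

Nothing here invalidates an entry when the underlying data changes; stale values simply age out, which is the trade-off the slide describes.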
Operations • Hosted on VMs at Rackspace • Staging and test clusters identical to production. Dev on Vagrant. • Puppet for managing configurations • Build and deployment done with home grown tools: • Devdo: handles stuff on dev box side • Manage: handles stuff on cloud side Architecture & Practices
Riak An awesome noSQL data store: • Super easy to scale up AND down • Fault tolerant – no SPoF • Flexible schema • Full-text search out of the box • Can be fixed and improved in Erlang (the Basho folks awesomely take our commits) Architecture & Practices
Riak – Basics • Data in Riak is grouped into buckets (effectively namespaces) • Basic operations are: • Get, save, delete, search, map, reduce • Eventual consistency managed through N, R, and W bucket parameters. • Everything we put in Riak is JSON • We talk to Riak through the excellent riak-js node library by Francisco Treacy Architecture & Practices
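As a rough illustration of how N, R, and W interact: with N replicas, a read that waits for R replies and a write that waits for W acknowledgements are guaranteed to share at least one replica when R + W > N. The helper names below are hypothetical, and the talk does not say which values Clipboard uses.

```javascript
// Returns true when every read quorum of size r must overlap every write
// quorum of size w among n replicas (the R + W > N rule).
function quorumsOverlap(n, r, w) {
  if (![n, r, w].every(Number.isInteger) || r < 1 || w < 1 || r > n || w > n) {
    throw new RangeError("expected integers with 1 <= r, w <= n");
  }
  return r + w > n;
}

// Smallest majority of n replicas.
function majority(n) {
  return Math.floor(n / 2) + 1;
}

// Example values only: three replicas, majority reads and writes.
const n = 3;
const overlaps = quorumsOverlap(n, majority(n), majority(n));
```

Lower R or W trades that overlap for latency, which is where the "eventual" in eventual consistency comes from.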
Data Model – Clips • Fields: ctime, title, domain, author, tags, annotation, mentions Architecture & Practices
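Since everything stored in Riak is JSON, a clip might look roughly like the object below. Only the field names come from the slide; the field types, the sample values, and the `makeClip` helper are assumptions for illustration.

```javascript
// Field names as listed on the slide.
const CLIP_FIELDS = ["ctime", "title", "domain", "author", "tags", "annotation", "mentions"];

// Hypothetical helper: copies the known fields and rejects incomplete input.
function makeClip(props) {
  const clip = {};
  for (const field of CLIP_FIELDS) {
    if (!(field in props)) throw new Error(`missing clip field: ${field}`);
    clip[field] = props[field];
  }
  return clip;
}

// All values below are invented; the types (arrays for tags and mentions,
// a number for ctime) are guesses.
const exampleClip = makeClip({
  ctime: 98685879630026, // same shape as the values in the range query; units not stated
  title: "Example clip",
  domain: "example.com",
  author: "greg",
  tags: ["riak"],
  annotation: "Worth a read",
  mentions: ["gary"],
});

// What would be sent to the data store.
const clipJson = JSON.stringify(exampleClip);
```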
Data Model - Clips (diagram) • Clip, key: abc • Blob, key: abc: <html> … </html> • Comment Cache, comments on clip ‘abc’: “F1rst”, “Nice clip yo!”, “Saw this on Reddit…” • Clips are the gateway to all of our data Architecture & Practices
Other Buckets • Users • Blobs • Comments • Templates • Counts • Search Caches • Transactions Architecture & Practices
Riak Search • Gets many things out of Riak by something other than the primary key. • You specify a schema (the types for the fields within a JSON object). • Works great but with one big gotcha: • Index uses term-based partitioning instead of document-based partitioning • Implication: joins + sort + pagination sucks • We know how to work around this Architecture & Practices
Riak Search – Querying • Query syntax based on Lucene • Basic Query: text:funny • Compound Query: login:greg OR (login:gary AND tags:riak) • Range Query: ctime:[98685879630026 TO 98686484430026] Architecture & Practices
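Query strings like these are easy to compose from small helpers. The functions below are hypothetical (they are not part of riak-js), do no escaping of special characters, and wrap every compound in parentheses for safety.

```javascript
// Builders for the three query shapes shown on the slide.
const term = (field, value) => `${field}:${value}`;
const and = (...parts) => `(${parts.join(" AND ")})`;
const or = (...parts) => `(${parts.join(" OR ")})`;
const range = (field, from, to) => `${field}:[${from} TO ${to}]`;

// Basic query
const basic = term("text", "funny");

// Compound query: clips by greg, or clips by gary tagged riak
const compound = or(term("login", "greg"), and(term("login", "gary"), term("tags", "riak")));

// Range query over creation time
const byTime = range("ctime", 98685879630026, 98686484430026);
```

Here `compound` comes out with one extra pair of outer parentheses compared to the slide, which does not change the meaning of the query.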
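Clipboard App Flow (diagram; participants: Client, node.js, Riak) • Client: go to clipboard.com/home • node.js → Riak: search clips bucket, query = login:greg • Riak → node.js: top 20 results • node.js → Client: top 20 results • Client: start rendering • For each clip: Client → node.js: API request for blob • node.js → Riak: GET from blobs bucket • node.js → Client: return blob to client • Client: render blob Architecture & Practices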
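The flow above can be sketched end to end with an in-memory stand-in for Riak. Everything named here is hypothetical (`fakeRiak`, `searchClips`, `getBlob`, `loadHome`, and the sample data), and the assumption that a blob is stored under its clip's key is ours, not something the talk states.

```javascript
// In-memory stand-in for the clips and blobs buckets; not a Riak client.
const fakeRiak = {
  clips: [
    { key: "abc", login: "greg", title: "First clip" },
    { key: "def", login: "gary", title: "Another clip" },
  ],
  blobs: new Map([
    ["abc", "<html>clip abc</html>"],
    ["def", "<html>clip def</html>"],
  ]),
};

// API layer: search the clips bucket and return at most `limit` results.
async function searchClips(login, limit = 20) {
  return fakeRiak.clips.filter((clip) => clip.login === login).slice(0, limit);
}

// API layer: GET one blob from the blobs bucket.
async function getBlob(key) {
  if (!fakeRiak.blobs.has(key)) throw new Error(`no blob for key: ${key}`);
  return fakeRiak.blobs.get(key);
}

// Client side: fetch the clip list first, then request each clip's blob.
async function loadHome(login) {
  const clips = await searchClips(login);
  const blobs = await Promise.all(clips.map((clip) => getBlob(clip.key)));
  return clips.map((clip, i) => ({ ...clip, blob: blobs[i] }));
}
```

In the real app the client can start rendering as soon as the clip list arrives and fill in blobs as they return; this sketch waits for all of them only to keep the example short.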
Outline • Introduction • Architecture & Practices • Lessons • Q&A Lessons
Web development doesn’t suck • We are all indebted to Google / Chrome for making web development better and more rewarding. • “Edit build test” is the new REPL • Good debugging within the client • Fast runtime makes new apps possible
Bet on modules, not frameworks • jQuery plugins are great working examples of modules that you can take a la carte. • Frameworks are trickier because they permeate your entire code base. • You can pick the wrong module and recover, but recovery from choosing the wrong framework is much harder. • My advice: just use good code hygiene.
Open source and SaaS are critical • Open source is like Lego for developers • Paid SaaS is great too – I’ll happily pay for services when: • They are better than what we could build, • They are not part of our core offering, • They free up a dev to do something that only we can do in house.
Browsers and jQuery have bugs • We spent a lot of time tracking down bugs in surprising places: • Chrome Google Apps break bookmarklets • Safari layout can be corrupted by reading computed CSS • jQuery mishandles position:relative on body • IE8 and IE9 – don’t even get me started
Node.js is ready for prime time • This wasn’t the case a year ago. • Callback style takes time to get used to. • Common coding patterns are still ugly. • The result is pretty phenomenal: a backend that is effectively non-blocking. • It’s really great to work with the same JSON / JS objects on all 3 tiers.
Redis & Riak are yin & yang • Redis: • Abstract data types • In RAM, single node • Fast and atomic operations • Can easily handle write contention • Riak: • Documents • On disk, many nodes • Slow and eventually consistent • Have to think about write contention
Think in terms of write contention • noSQL patterns will have you writing a lot of independent objects. • Simple contention can be managed with a mutex, keeping code simple. • Complex contention can be batched into a work queue.
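One way the mutex idea could look is sketched below. It uses an in-memory stand-in whose "set if absent, with expiry" operation is modeled on Redis's SET with the NX and PX options; it is not a Redis client, and the talk does not describe how Clipboard's mutex is actually built. `FakeStore`, `withMutex`, and the key names are hypothetical.

```javascript
// In-memory stand-in for a key-value store with atomic set-if-absent.
class FakeStore {
  constructor(now = Date.now) {
    this.now = now;
    this.data = new Map();
  }

  // Succeeds only if the key is absent or its previous holder has expired.
  setIfAbsent(key, value, ttlMs) {
    const current = this.data.get(key);
    if (current && current.expiresAt > this.now()) return false;
    this.data.set(key, { value, expiresAt: this.now() + ttlMs });
    return true;
  }

  // Deletes the key only if it still holds the given value.
  deleteIfValue(key, value) {
    const current = this.data.get(key);
    if (current && current.value === value) {
      this.data.delete(key);
      return true;
    }
    return false;
  }
}

let nextToken = 0;

// Runs `criticalSection` only if the lock `name` can be taken.
// Resolves to { acquired: false } or { acquired: true, result }.
async function withMutex(store, name, ttlMs, criticalSection) {
  const token = `owner-${++nextToken}`;
  if (!store.setIfAbsent(`mutex:${name}`, token, ttlMs)) return { acquired: false };
  try {
    return { acquired: true, result: await criticalSection() };
  } finally {
    // Release only our own lock, in case it expired and someone else took it.
    store.deleteIfValue(`mutex:${name}`, token);
  }
}
```

The expiry keeps a crashed holder from blocking everyone forever. Against a real Redis server the compare-and-delete in the release step would also need to be atomic, for example through a server-side script.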
Cache, cache, cache • There is more than one way to cache. • Don’t get too clever (embrace noSQL and don’t worry about cache invalidation). • Cache in multiple places and on multiple time scales.
Balance agility with process • Devs do testing and deploying • Code reviews are author-initiated • Many small features done on branches off of master. (No “dev” branch.) • Bug fixes done right on master.
Recap • We don’t have big data… yet. But we think we can handle it. • Our stack, architecture, and practices allow us to move fast while also designing for scalability. • It’s also a really fun stack to work on. Lessons
We’re hiring! www.clipboard.com/jobs Or talk to us right now! Thanks!