UNCLASSIFIED
We Made a Few Trillion New Pixels Yesterday. Here’s Why.
10JUN11
Terry Busch
DIA
This briefing is classified UNCLASSIFIED

DIA manages a large GIS enterprise
• Over the years it became a large IT dinosaur
• Complicated
• Heavy
• Expensive
• Proprietary
“I love deadlines. I like the whooshing sound they make as they fly by.” -- Douglas Adams

And then we discovered Open Source Technology
• And as a byproduct we became
• Lightweight
• Less Expensive
• Interoperable
• Faster (yeah, that’s right).
“Bill Gates is a very rich man today... and do you want to know why? The answer is one word: versions.” -- Dave Barry

One nice piece of technology we found was Tile Cache (Map Layers)
• Simple caching tool for map chips
• Very, very fast (at home speeds)
• The client actually makes its own cache?
• Wait, wait. What can we do on the server?
“Technology does not run an enterprise, relationships do.” -- Patricia Fripp
[Diagram labels: Intelligence Data, Parametric Data, Technical Data, Infrastructure Data, Operational Data, Doctrine, Strategy, Tactics]

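A minimal sketch of how a tile cache can address its map chips. The slides do not name a tiling scheme, so the XYZ Web Mercator scheme used by OpenStreetMap-style services is assumed here, and the layer name and path layout are hypothetical.

```python
import math


def latlon_to_tile(lat_deg, lon_deg, zoom):
    """Return the (x, y) index of the tile containing a point.

    Assumes the XYZ Web Mercator scheme: 2**zoom tiles per axis,
    x counted from the west edge, y counted from the north edge.
    """
    n = 2 ** zoom
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the east or south edge stay inside the grid.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


def tile_path(layer, zoom, x, y, ext="png"):
    """Build a cache path for one tile (hypothetical layout)."""
    return f"{layer}/{zoom}/{x}/{y}.{ext}"


if __name__ == "__main__":
    # A point near Port-au-Prince, Haiti, at an arbitrary zoom level.
    x, y = latlon_to_tile(18.54, -72.34, 10)
    print(tile_path("scanned-maps", 10, x, y))
```

Because every tile has a fixed address, a server can render it once and then answer later requests from disk, which is where the speed comes from.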
Tile Cache seemed a panacea
• We don’t have Google or Bing or MapQuest
• Map services had always been “slow to load” (my fault)
• Our map services never quite hit the tipping point.
• Free (Cost Neutral)
“Birds do it, bees do it, even educated fleas do it. Let's do it, let's fall in love.” -- Cole Porter

Light Bulb!
• So What if We?
• Cached the entire world?
• I mean everything we had:
• Scanned Maps
• Imagery Maps
• Elevation
• That sink over there?
• It’s just disk!
“An idea that is not dangerous is unworthy of being called an idea at all” -- Oscar Wilde

Stop the Presses
• A few problems:
• Scripting for tile caches is kinda slow.
• So, basically, for our datasets…
• It would take 2 years to process on our best servers!
• Ugh
“Your heart is my pinata” -- Chuck Palahniuk
“Love is a battlefield” -- Pat Benatar

The Holy Grail
• Aha! What if we went to the cloud:
• Amazon EC2
• Instead of scripting on a few servers…how about a few thousand servers?
• Cost = Peanuts
“Oh, wicked, bad, naughty Zoot! She has been setting alight to our beacon, which, I just remembered, is grail-shaped. It's not the first time we've had this problem” -- Dingo

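The move from a few servers to a few thousand works because each tile can be rendered independently of the others. A minimal sketch of that idea, assuming a quadtree pyramid with 2**zoom tiles per axis; the batch size and worker count are illustrative, not figures from the briefing.

```python
from itertools import islice


def tiles_for_level(zoom):
    """Yield every (zoom, x, y) address at one level of a quadtree pyramid."""
    n = 2 ** zoom
    for x in range(n):
        for y in range(n):
            yield (zoom, x, y)


def batches(iterable, size):
    """Group an iterable into lists of at most `size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def assign_round_robin(jobs, worker_count):
    """Deal jobs out to workers in turn; returns one job list per worker."""
    queues = [[] for _ in range(worker_count)]
    for i, job in enumerate(jobs):
        queues[i % worker_count].append(job)
    return queues


if __name__ == "__main__":
    jobs = list(batches(tiles_for_level(4), 16))  # 256 tiles in 16 batches
    queues = assign_round_robin(jobs, 5)
    print([len(q) for q in queues])  # prints [4, 3, 3, 3, 3]
```

No batch depends on any other, so adding workers shortens the wall-clock time until the per-batch overhead starts to dominate.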
You want the truth?
• 2-year core processing effort reduced to 24 hours.
• 3+ TB of raw data became a 30 TB tile cache across 18 predefined levels.
• Map Layers: Image Maps, Military Charts, Elevation Data, and much much more.
Above: A scanned digital map of Haiti. Using tile cache, we scanned every map and image map in our archives.

Handling the Truth
• Initial Effort: Created 1.75 trillion new pixels (2009).
• At one point in processing we harnessed about 5000 servers simultaneously.
• Can process tile cache layers at 2, 4, or 8 million pixels per second.
Above: OpenStreetMap (OSM) tile cache services are amongst our most popular.

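The figures on these slides can be put side by side with a little arithmetic. The sketch below uses only numbers stated in the briefing, plus two assumptions: a 256 x 256 pixel tile (no tile size is given) and a reading of each pixels-per-second figure as one sustained aggregate rate.

```python
PIXELS_CREATED = 1.75e12          # "1.75 trillion new pixels" (slide figure)
RATES_PX_PER_S = (2e6, 4e6, 8e6)  # "2, 4, or 8 million pixels per second"
HOURS_BEFORE = 2 * 365 * 24       # "2 years" on the original servers
HOURS_AFTER = 24                  # "24 hours" in the cloud
TILE_EDGE_PX = 256                # assumption: tile size is not in the slides


def hours_to_process(pixels, rate_px_per_s):
    """Hours needed to produce `pixels` at a sustained rate."""
    return pixels / rate_px_per_s / 3600.0


def tile_count(pixels, tile_edge_px):
    """Number of square tiles needed to hold `pixels`."""
    return pixels / (tile_edge_px * tile_edge_px)


def speedup(hours_before, hours_after):
    """Ratio of the old wall-clock time to the new one."""
    return hours_before / hours_after


if __name__ == "__main__":
    print(f"speedup: {speedup(HOURS_BEFORE, HOURS_AFTER):.0f}x")
    for rate in RATES_PX_PER_S:
        hours = hours_to_process(PIXELS_CREATED, rate)
        print(f"{rate / 1e6:.0f} Mpx/s -> {hours:.1f} h")
    millions = tile_count(PIXELS_CREATED, TILE_EDGE_PX) / 1e6
    print(f"tiles: {millions:.1f} million")
```

Under those assumptions the 2-year-to-24-hour reduction is a 730x speedup, and 1.75 trillion pixels corresponds to roughly 26.7 million tiles.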
So Now:
• Got rid of all the wires and databases and junk.
• 2-3 million service requests per month.
• Stopped spending my GIS money on servers and expensive hardware.
Above: Satellite imagery tile cache blended with some sort of vector layer.

Oh, and…
• Scalable (just add disk)
• Portable (if you have data bricks)
• Updatable (just re-cache)
• Version proof
Above: Imagery-elevation blends have become as popular in our world as they have in public.

Cloud Computing is not error free
• Had to write error handling
• Read and react in real time.
• Oh, and the bandwidth I/O problem…
Q. How does one move 30 TB off Amazon?
A. Data Bricks
“It’s hard enough to find an error in your code when you’re looking for it; It’s even harder when you’ve assumed your code is error-free.” -- Steve McConnell

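The briefing does not describe the error handling that was written. One common pattern for unreliable cloud jobs is retry with exponential backoff, sketched below; `TileJobError`, `make_flaky_job`, and all parameter values are hypothetical names chosen for illustration.

```python
import random
import time


class TileJobError(Exception):
    """Hypothetical error raised when one tile job fails transiently."""


def run_with_retries(job, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call `job()` until it succeeds or `max_attempts` is reached.

    The wait doubles after each failure, with random jitter added so that
    many workers retrying at once do not all hit the service together.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except TileJobError as err:
            last_error = err
            if attempt == max_attempts:
                break
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError(f"job failed after {max_attempts} attempts") from last_error


def make_flaky_job(failures):
    """Return a demo job that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def job():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TileJobError(f"simulated failure {state['calls']}")
        return "tile written"

    return job


if __name__ == "__main__":
    # Pass a no-op sleep so the demo finishes immediately.
    print(run_with_retries(make_flaky_job(2), sleep=lambda seconds: None))
```

Injecting `sleep` as a parameter keeps the retry logic testable without real waiting.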
But wait! There’s more…
• We had all this data sitting in the cloud….
• What if we solved some other problems?
• Saved some people some time?
• Created app-ready surfaces for analysis?
“Without promotion something terrible happens... Nothing!” -- PT Barnum

G++ Base Layers for GIS
With the data left hanging around we created:
• Slope
• Aspect
• Hillshade
• Terrain Ruggedness
• Relevation
For the whole world…
Above: Examples of surface area of the world cached, with a chip from the Slope and Aspect global layers.

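The slides do not say how the slope and aspect layers were derived. A minimal sketch of one widely used approach, Horn's 3 x 3 finite-difference method, is shown below for a single elevation window; the function name, the north-up row order, and the square cell size are assumptions.

```python
import math


def slope_aspect(window, cell_size=1.0):
    """Return (slope_degrees, aspect_degrees) for the center of a 3x3 window.

    `window` is three rows of three elevations, row 0 being the north edge.
    Aspect is a compass bearing of the downslope direction (0 = north,
    90 = east) and is None where the surface is flat.
    """
    (a, b, c), (d, _e, f), (g, h, i) = window
    # Horn's weighted differences: east minus west, south minus north.
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8.0 * cell_size)
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8.0 * cell_size)
    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0 and dz_dy == 0:
        return slope, None
    angle = math.degrees(math.atan2(dz_dy, -dz_dx))
    aspect = (90.0 - angle) % 360.0
    return slope, aspect


if __name__ == "__main__":
    # Elevation rises one unit per cell toward the east, so the
    # surface faces west.
    rising_east = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
    print(slope_aspect(rising_east))
```

Computing this once per cell for the whole world and caching the result is what lets a client fetch slope as a ready-made tile instead of running the calculation on demand.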
Can We?
• Eliminate the status bar wait (get coffee GIS)
• Get rid of the slope button
• Prepare ourselves for a web-driven GIS world
• Let our analysts….be analysts!!
Terrain Ruggedness Index: A key index for understanding how humans interact with topography for site selection and mobility.

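The briefing does not state which formulation of the Terrain Ruggedness Index was used. The sketch below assumes the common Riley-style definition: the square root of the summed squared elevation differences between a cell and its eight neighbors. Edge cells are left as None for simplicity.

```python
import math


def terrain_ruggedness_index(dem):
    """Return a grid of TRI values for a DEM given as a list of rows.

    Only interior cells are computed; cells on the border have fewer
    than eight neighbors and are returned as None.
    """
    rows, cols = len(dem), len(dem[0])
    tri = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = dem[r][c]
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr or dc:  # skip the center cell itself
                        total += (dem[r + dr][c + dc] - center) ** 2
            tri[r][c] = math.sqrt(total)
    return tri


if __name__ == "__main__":
    # A single one-unit peak surrounded by flat ground.
    peak = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    print(terrain_ruggedness_index(peak)[1][1])
```

Higher values mean the cell differs more from its surroundings, which is why the index is useful for mobility and site selection questions.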
That’s Levitation Holmes!
• So, now, if you want a travel cost model, we’ve got your base.
• Want a viewshed, line-of-sight? Done.
• Need site help for your weekend home or establishing your vineyard? Check.
• GP Ready.
Relevation: Foundational layer for identifying landform categories (e.g. ridges, valleys, side slopes…). Exceptionally useful for suitability modeling and routing.

Cloud GIS – MrGeo
• Map Reduce Geo
• Distributed or Cloud GIS
• Helping us with that crowd-source mass-data scaling problem we face in the future now.
• Gives us the tools we need for our crowd-sourced, very busy data future.
• Somewhat Open Source
MrGeo: Examples of cloud based processing as elevation data for the world is calculated into elevation derived layers.

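To illustrate the map-reduce idea behind the name, here is a minimal single-process sketch. It is not MrGeo's API: the function names, the region keys, and the tile structure are hypothetical. The map step summarizes each elevation tile on its own, and the reduce step merges summaries that share a key.

```python
from collections import defaultdict
from functools import reduce


def map_tile(tile):
    """Map step: emit (key, summary) pairs for one elevation tile.

    `tile` is a (region, heights) pair; the summary is (min, max, count).
    """
    region, heights = tile
    yield region, (min(heights), max(heights), len(heights))


def reduce_stats(a, b):
    """Reduce step: merge two (min, max, count) summaries."""
    return (min(a[0], b[0]), max(a[1], b[1]), a[2] + b[2])


def map_reduce(tiles):
    """Run the map step per tile, group by key, then reduce each group."""
    grouped = defaultdict(list)
    for tile in tiles:
        for key, value in map_tile(tile):
            grouped[key].append(value)
    return {key: reduce(reduce_stats, values) for key, values in grouped.items()}


if __name__ == "__main__":
    tiles = [
        ("region-a", [0, 10, 20]),
        ("region-a", [5, 2680]),
        ("region-b", [1]),
    ]
    print(map_reduce(tiles))
```

In a distributed setting the map calls run on different machines and only the small summaries travel over the network, which is what makes the pattern suit world-scale rasters.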
That was all so 2009
• More Hybrid Layers for the Cache (More OSM Please).
• Hydrology layer for the whole world
• Global Change Detection
• World Remoteness
• Give me some ideas here people!
“There's a fine line between fishing and just standing on the shore like an idiot.” -- Steven Wright

Buy these guys a beer for answers:
• Geo-Cache: Frank Porcelli
• Distributed GIS: Lou Paladino
• Geo for Amazon: Brian Levy
• MrGeo: Jason Surratt
• Scripting/Models: Eric Finnen
• OSM/VGI: Those Guys
• The Way the World Ought to Be: Me
“Hisssssss [Chaka must die].” -- Sleestak