Telegraph Status
Joe Hellerstein
Overview
• Telegraph Design Goals, Current Status
• First Application: FFF (Deep Web)
• Budding Application: Traffic Sensor Data
• Moving Forward
Telegraph: Adaptive Dataflow
• Dataflow
  • Siphon data from the “deep web”
  • Harness data streaming from sensors/traces
  • Flow through code
  • The API and Architecture for ubiquitous computing
• Why adaptive?
  • Sensor nets & wide area internet: volatile!
  • Like Telegraph Avenue, need to roll w/the changes
  • Adaptive techniques for routing data to machines & code
Demos Delivered!
• The big push: FFF Election 2000 demo 10/2000
  • Got Telegraph off the ground and live
  • Shows power of analysis & integration on web
  • It’s not just search any more!
  • Served thousands of live, long-running queries
• Initial Sensor Demo
  • UCB Institute for Transportation Studies data
  • Various web cams
  • Project for SIMS InfoVis class
  • A harness for more sensor-oriented work in Telegraph
Telegraph v1 (alpha) infrastructure
• Single-site (multi-source) dataflow engine
  • All Java: some lessons here (paper in preparation)
• Numerous dataflow operators built
  • TeSS (Telegraph Screen Scraper)
  • File reader
  • Relational ops (filters, joins, grouping, aggregation)
  • Some simple sequence analysis ops
  • Eddy: adaptive flow ordering operator
• Key architectural theme: gain adaptivity via new operators
  • Not changes to dataflow infrastructure!
  • This is our upgrade strategy to parallelism/distribution
• SQL-to-Dataflow parser
  • SQL is a fine dataflow language for many tasks
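To make "Eddy: adaptive flow ordering operator" concrete, here is a minimal sketch in Python (not Telegraph's Java code). It assumes the flow contains only filter operators and uses a simple greedy rule: send each tuple first to the operator that has dropped the largest fraction of the tuples it has seen. The names `FilterOp`, `Eddy`, and `drop_rate`, and the greedy rule itself, are illustrative assumptions, not Telegraph's actual API or routing policy.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass
class FilterOp:
    """Hypothetical selection operator plus the statistics the eddy keeps on it."""
    name: str
    predicate: Callable[[dict], bool]
    seen: int = 0      # tuples routed to this operator so far
    dropped: int = 0   # tuples this operator rejected

    def drop_rate(self) -> float:
        # Unseen operators get an optimistic 1.0 so each is tried at least once.
        return self.dropped / self.seen if self.seen else 1.0

    def apply(self, tup: dict) -> bool:
        self.seen += 1
        ok = self.predicate(tup)
        if not ok:
            self.dropped += 1
        return ok


class Eddy:
    """Routes each tuple through all operators, choosing the order per tuple."""

    def __init__(self, ops: List[FilterOp]):
        self.ops = ops

    def route(self, tup: dict) -> bool:
        pending = list(self.ops)
        while pending:
            # Assumed greedy policy: most selective operator (so far) goes first.
            op = max(pending, key=FilterOp.drop_rate)
            pending.remove(op)
            if not op.apply(tup):
                return False  # tuple dropped; remaining operators never see it
        return True

    def run(self, source: Iterable[dict]) -> Iterator[dict]:
        return (t for t in source if self.route(t))
```

Given a filter that passes everything and one that passes only one tuple in ten, the eddy quickly learns to route to the selective filter first, so the unselective one ends up seeing only a small fraction of the input. The ordering lives entirely inside the operator, which is the point of the "new operators, not infrastructure changes" theme on this slide.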
Upcoming Telegraph Operators
[Figure labels: static dataflow, eddy, eddy + stem]
• Goal: Further adaptivity through competition
• Multiple mirrored sources
  • Handle rate changes, failures, parallelism
• Multiple alternate operators
• STeM operator manages tradeoffs
  • STate Module, unifies caches, rendezvous buffers, join state
  • Competitive sources/operators share building/using STeMs
Vijayshankar Raman
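One way to picture a STeM as shared join state is the sketch below, in Python rather than Telegraph's Java. It assumes a STeM exposes just two calls, `build` (store a tuple) and `probe` (look up matches by key), and that a symmetric join can then be written as two STeMs that each arriving tuple builds into and probes across. The class name, method names, and the `symmetric_join` helper are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, Iterator, List, Tuple


class STeM:
    """Hypothetical state module: tuples from one source, indexed on one key."""

    def __init__(self, key: Callable[[dict], Hashable]):
        self.key = key
        self.index: Dict[Hashable, List[dict]] = defaultdict(list)

    def build(self, tup: dict) -> None:
        # Store the tuple so later probes from the other side can find it.
        self.index[self.key(tup)].append(tup)

    def probe(self, value: Hashable) -> List[dict]:
        # Return a copy of all stored tuples whose key equals `value`.
        return list(self.index.get(value, []))


def symmetric_join(
    arrivals: Iterable[Tuple[str, dict]], left: STeM, right: STeM
) -> Iterator[Tuple[dict, dict]]:
    """Join tuples tagged "L" or "R" in arrival order using two STeMs."""
    for side, tup in arrivals:
        if side == "L":
            left.build(tup)
            for match in right.probe(left.key(tup)):
                yield (tup, match)
        else:
            right.build(tup)
            for match in left.probe(right.key(tup)):
                yield (match, tup)
```

Because the stored state sits in the STeMs rather than inside a join operator, anything else that needs the same tuples (a cache lookup, a competing source, an alternate operator) could in principle build into or probe the same module.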
Parallelism & Fault Tolerance
• Continuous/long-running flows need fault-tolerance
• Big flows need parallelism
  • Adaptive Load-Balancing req’d
• FLUX operator: Exchange plus…
  • Adaptive flow partitioning (River)
  • Mobile operator state for full Load Balancing
  • Replicated flows & redundant state (RAID for operators)
  • Load rebalancing vs. vulnerability
Mehul Shah & Sirish Chandrasekaran
Telegraph Nuts and Bolts 2
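The "adaptive flow partitioning" and "mobile operator state" ideas can be sketched as follows, in Python and under strong simplifying assumptions: a single process, keys hashed into a fixed number of buckets, and a rebalancing rule that moves one bucket from the busiest worker to the idlest whenever that narrows the load gap. `AdaptivePartitioner` and all of its methods are hypothetical names. The sketch is not the FLUX design and leaves out replication and fault tolerance entirely.

```python
from collections import Counter
from typing import Dict, Hashable, List


class AdaptivePartitioner:
    """Hypothetical exchange-style partitioner whose buckets can change owner."""

    def __init__(self, workers: List[str], num_buckets: int = 8):
        self.workers = workers
        self.num_buckets = num_buckets
        # Static round-robin start, as a plain exchange operator might do.
        self.owner: Dict[int, str] = {
            b: workers[b % len(workers)] for b in range(num_buckets)
        }
        self.bucket_load: Counter = Counter()
        # Per-bucket operator state; it travels with the bucket on a move.
        self.state: Dict[int, List[dict]] = {b: [] for b in range(num_buckets)}

    def bucket_of(self, key: Hashable) -> int:
        # hash() is stable for ints; str hashes vary per run unless seeded.
        return hash(key) % self.num_buckets

    def route(self, key: Hashable, tup: dict) -> str:
        b = self.bucket_of(key)
        self.bucket_load[b] += 1
        self.state[b].append(tup)
        return self.owner[b]

    def worker_load(self) -> Dict[str, int]:
        load = {w: 0 for w in self.workers}
        for bucket, worker in self.owner.items():
            load[worker] += self.bucket_load[bucket]
        return load

    def rebalance(self) -> bool:
        """Move one bucket from the busiest to the idlest worker, if it helps."""
        load = self.worker_load()
        hot = max(self.workers, key=lambda w: load[w])
        cold = min(self.workers, key=lambda w: load[w])
        gap = load[hot] - load[cold]
        # Moving a bucket of load l narrows the gap only when 0 < l < gap.
        candidates = [
            b for b, w in self.owner.items()
            if w == hot and 0 < self.bucket_load[b] < gap
        ]
        if not candidates:
            return False
        moved = max(candidates, key=lambda b: self.bucket_load[b])
        self.owner[moved] = cold  # in a real system the state would ship too
        return True
```

In this toy, state is keyed by bucket, so "moving" it is just an ownership change. Across machines the move would mean shipping that state, which is where the slide's tradeoff between load rebalancing and vulnerability comes in.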
Further Directions & Goals
• Deep Web Trawling & Privacy Issues
  • We’re about to crawl web DBs (What? How much?)
  • Can do some fascinating/creepy things
  • Consider privacy & accuracy: countermeasures, incentives, etc.
  Mehul Shah (w/Varian, Papadimitriou, L. Hellerstein & T. Suel)
• Data Dissemination & Continuous Queries
  • Franklin’s XFILTER: XML pub/sub
  • New automata-based techniques from CS262
  • Extend/integrate for pub/sub on general Telegraph flows
  Yanlei Diao/Asha Tarachandani
• Sensor/Trace Data Apps
  • Bay Area traffic. Would like to do TinyOS (nobody on it yet)
  • Software traces? OceanStore?
  Sam Madden