DIFS: A Distributed Index for Features in Sensor Networks. Amorphous Computing. An Algorithm for Group Formation and Maximal Independent Set in an Amorphous Computer. Inside Sensor Nets. Presented by: Vartika Bhandari.
DIFS: A Distributed Index for Features in Sensor Networks • Amorphous Computing • An Algorithm for Group Formation and Maximal Independent Set in an Amorphous Computer Inside Sensor Nets Presented by: Vartika Bhandari
DIFS: A Distributed Index for Features in Sensor Networks B. Greenstein, D. Estrin, R. Govindan, S. Ratnasamy, S. Shenker
The Context • Sensor Network • Energy-constrained devices (motes) • Motes sense environment (obtain data) • View of sensor-net as distributed database • Query-Result semantics • Lesson from databases: • Efficient query-processing requires good indexing CS 598ig
The Problem • Support for range/distribution queries • Flux densities in range 47 to 68 • Temperature between 50 and 60 degrees • Flooding not desirable • Communication costs a lot of energy!
Index Creation • Distributed Index • Some nodes bear more burden • Unfairness => unequal lifetimes • Desirable: • An index structure that allows both efficient querying and load balancing
Event Indices: An Overview • Pre-defined event types of interest • Sensors obtain data and identify events • A data summary is sent to the corresponding index node • A query for an event needs to access only the index node(s) rather than flooding the network
A Search Architecture
Geographic Hash Tables (GHT) • Associate a logical rendezvous point with each event type e • Assign to e a name n_e • Compute a hash of the event name n_e to obtain a spatial location (x, y) • The node located nearest (x, y) stores information about e
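The hashing step on this slide can be sketched as follows. This is a minimal illustration, not the GHT implementation: the use of SHA-256, the function names `ght_location` and `home_node`, and the sample node positions are all assumptions, and the nearest node is found here by a global search, whereas a sensor network would have to locate it by routing.

```python
import hashlib
import math


def ght_location(event_name, width, height):
    """Hash an event name to a point (x, y) inside a width x height field."""
    digest = hashlib.sha256(event_name.encode("utf-8")).digest()
    # Assumption: the first 8 bytes pick x, the next 8 bytes pick y.
    x = int.from_bytes(digest[:8], "big") / 2**64 * width
    y = int.from_bytes(digest[8:16], "big") / 2**64 * height
    return x, y


def home_node(event_name, nodes, width, height):
    """Return the node position closest to the hashed location."""
    x, y = ght_location(event_name, width, height)
    return min(nodes, key=lambda p: math.hypot(p[0] - x, p[1] - y))


# Hypothetical node positions in a 1024 x 1024 field.
nodes = [(100.0, 100.0), (900.0, 100.0), (100.0, 900.0), (900.0, 900.0)]
print(home_node("temperature", nodes, 1024, 1024))
```

Because every node computes the same hash, a producer and a querier agree on the rendezvous point without exchanging any messages beforehand.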
GHT: Issues • Load-balances event types over nodes, but: • Multiple events with the same name can overload a node • Communication costs for storing an event may be large • Solution: Structured Replication
Structured Replication
Quad Tree Approach • Quaternary Tree: • Each node has 4 children • Each node keeps 4 histograms, one summarizing the data distribution in each child subtree • Queries propagate only in relevant parts of the tree (pruning)
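The pruning idea can be sketched with a small in-memory quad tree. This is an illustrative sketch under stated assumptions: the class name `QuadNode`, the bucket count, and the value ceiling `vmax` are hypothetical, and updates are applied top-down in one process, whereas in the network they would be propagated up the tree from the storing node.

```python
class QuadNode:
    def __init__(self, x0, y0, x1, y1, depth, buckets=8, vmax=80):
        self.box = (x0, y0, x1, y1)
        self.buckets = buckets
        self.bucket_width = vmax / buckets
        self.children = []
        self.readings = []  # populated only at leaves
        # One histogram per child subtree (4 in total).
        self.child_hist = [[0] * buckets for _ in range(4)]
        if depth > 0:
            mx, my = (x0 + x1) / 2, (y0 + y1) / 2
            quads = [(x0, y0, mx, my), (mx, y0, x1, my),
                     (x0, my, mx, y1), (mx, my, x1, y1)]
            self.children = [QuadNode(*q, depth - 1, buckets, vmax) for q in quads]

    def _bucket(self, value):
        # Clamp so out-of-range values fall into the first or last bucket.
        return max(0, min(self.buckets - 1, int(value // self.bucket_width)))

    def _child_index(self, x, y):
        x0, y0, x1, y1 = self.box
        return (1 if x >= (x0 + x1) / 2 else 0) + (2 if y >= (y0 + y1) / 2 else 0)

    def insert(self, x, y, value):
        if not self.children:
            self.readings.append((x, y, value))
            return
        i = self._child_index(x, y)
        self.child_hist[i][self._bucket(value)] += 1
        self.children[i].insert(x, y, value)

    def query(self, lo, hi, visited):
        """Return readings with lo <= value <= hi; record every node visited."""
        visited.append(self.box)
        if not self.children:
            return [r for r in self.readings if lo <= r[2] <= hi]
        b_lo, b_hi = self._bucket(lo), self._bucket(hi)
        results = []
        for i, child in enumerate(self.children):
            # Prune subtrees whose histogram is empty over the queried buckets.
            if any(self.child_hist[i][b_lo:b_hi + 1]):
                results.extend(child.query(lo, hi, visited))
        return results


root = QuadNode(0, 0, 1024, 1024, depth=2)
for reading in [(10, 10, 55), (1000, 1000, 20), (600, 100, 58)]:
    root.insert(*reading)
visited = []
print(root.query(50, 60, visited), len(visited))
```

In this example the query touches only the branches whose histograms report values in the 50 to 60 range, rather than all 21 nodes of the tree.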
Quad Tree: Issues • Explicit child pointers required • On storage of new data, update must be propagated up the tree • Every query must originate at tree root • Root bears greater burden!
New Proposal: DIFS • Combine useful features of GHT and Quad Tree: • Geographical hashing for location, as in GHT • Search hierarchy of histograms, as in the Quad Tree • Two search criteria: • Spatial location • Value range/distribution
DIFS Design • Load balance over space/value-range • If node X indexes a large area, it will do so for a small value range and vice versa • Similar to having multiple quad trees: one for each value sub-range • Each non-root node appears in bfact trees • Willing to accept somewhat larger storage costs if fairness improves!
Operation Details • Each node has bfact parents • Each parent is responsible for a value interval 1/bfact the size of the interval its children cover • Geographically Bounded Hash: • MSBs from Bounding Box • LSBs from computed hash • Replication for fault-tolerance, as in GHT • Queries directed to minimum covering set of index nodes
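One way to read the geographically bounded hash is sketched below: the high-order bits of each coordinate select the bounding box, and the low-order bits come from a hash, so the result always falls inside the box. The 10-bit coordinate width, the SHA-256 hash, the key string, and the function name `bounded_hash` are assumptions made for illustration, not details taken from the DIFS paper.

```python
import hashlib

COORD_BITS = 10  # assumption: a 1024 x 1024 field addressed by 10-bit coordinates


def bounded_hash(key, cell_x, cell_y, level):
    """Hash `key` to a point inside cell (cell_x, cell_y) of a 2**level grid.

    level 0 is the whole field; at level k the field is split into
    2**k x 2**k equal cells.
    """
    low_bits = COORD_BITS - level
    mask = (1 << low_bits) - 1
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    hx = int.from_bytes(digest[:4], "big")
    hy = int.from_bytes(digest[4:8], "big")
    # MSBs come from the bounding box, LSBs from the computed hash.
    x = (cell_x << low_bits) | (hx & mask)
    y = (cell_y << low_bits) | (hy & mask)
    return x, y


# Hypothetical key naming an attribute and a value range.
print(bounded_hash("temperature:[0,3]", cell_x=1, cell_y=0, level=1))
```

With `level=1` and cell (1, 0), the x coordinate is confined to 512..1023 and the y coordinate to 0..511, whatever the hash value is.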
DIFS Hierarchy
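A Simplified Example (figure) • bfact = 2, Max Levels = 2, Max Range [0…7] • Top level: two index nodes, each covering the full network, one for values [0…3] and one for values [4…7] • Leaf level: four index nodes, one per quadrant (Q1, Q2, Q3, Q4), each covering the full range [0…7]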
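The trade-off between area and value range in this example can be enumerated with a short sketch. It generalizes the two-level figure under an assumption that is not stated on the slides: that each step up the hierarchy quadruples the area covered and divides the value interval by bfact. The function name `difs_layout` and the integer value ranges are hypothetical.

```python
def difs_layout(bfact, levels, vmin, vmax):
    """List (level, cells_per_side, value_interval) for each index-node role.

    Level 0 is the leaf level. Assumes integer values and a range whose size
    is divisible by bfact ** (levels - 1).
    """
    layout = []
    span = vmax - vmin + 1
    for level in range(levels):
        # Higher levels cover larger areas, so fewer cells per side.
        cells_per_side = 2 ** (levels - 1 - level)
        # Higher levels cover narrower value intervals.
        n_intervals = bfact ** level
        width = span // n_intervals
        for k in range(n_intervals):
            lo = vmin + k * width
            layout.append((level, cells_per_side, (lo, lo + width - 1)))
    return layout


for role in difs_layout(bfact=2, levels=2, vmin=0, vmax=7):
    print(role)
```

For bfact = 2 and two levels this yields one leaf role per quadrant covering [0…7] (a 2 x 2 grid of cells) and two full-network roles covering [0…3] and [4…7], matching the figure.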
Simulations • Compared Schemes: • DIFS • Quad Tree • GHT with structured replication • Directed Diffusion • 1024 x 1024 area, 2048 nodes, 25 m TX range
Uniform Distribution of Events (plots) • Fraction of index nodes accessed • Search cost (total hops traversed)
Gradient Distribution (plots) • Fraction of index nodes accessed • Search cost (total hops traversed)
Hotspots (plots) • Fraction of index nodes accessed • Search cost (total hops traversed)
Discussion • May not be very suitable if new data is generated much more frequently than queries are issued • In the query phase, unicast messages are sent to each relevant index node • Inefficient for queries over large ranges • Future Work: • Dynamic repartitioning in case of skewed distributions • Multi-attribute indexes • Multiple levels of indirection
Amorphous Computing Abelson et al. Appeared in Communications of the ACM, May 2001 (This presentation uses some figures/images obtained from the Amorphous Computing Webpage and related publications)
Amorphous Computing? • Irregular lattice structure • hence amorphous • High node density • No synchronization • Static • Nodes are tiny devices (MEMS or cellular) along the lines of current sensor motes
The Basis • Micro-Electro-Mechanical Systems (MEMS) • the integration of mechanical elements, sensors, actuators, and electronics on a common silicon substrate through microfabrication technology (definition from http://www.memsnet.org) • conceptual origins in a 1959 talk by Feynman: “There’s Plenty of Room at the Bottom” • Cellular Computing • building logic gates from living cells
Fundamental Idea • Correct function with high reliability should be possible without precise juxtaposition of component elements or global knowledge, and despite a significant number of component failures • Major Inspiration: Biological mechanisms • How cells execute the genetic code
Growing Point Language • A programming paradigm for amorphous computers • In plants, a growing point represents a site of new growth • Cell differentiation • Branching/Merging • Termination • Exhibited behaviors: Tropism, Sensitivity to substance concentration etc.
Growing Point Language • Intuitively akin to mobile agents • Mobile agents move from node to node, following certain rules, and modifying local state as they pass through • Result is a global pattern • Crucial part is to determine what the rules should be • Claim that any planar graph may be generated using this paradigm if node density is sufficiently high
Growing Point Language • The Growing Point Language formalizes the notion into a programming abstraction • High-level description mapped onto actual specifics of message lifetimes and other rules
More Biological Metaphors • New approach to fault-tolerance • Traditional approach to reliability via redundancy of specific parts has limitations • The key is to allow a mass of cells (some of them defective) to work out a way to function properly • Example: Teramac
Example: Teramac • A Configurable Custom Computer (HP Labs) • Implements user-specified designs • No user knowledge of Teramac architecture required • Requires Teramac to support complex interconnection topologies of gates • Complete automation of mapping of user design to Teramac
Teramac (contd.) • Defect Tolerance • Teramac made of inexpensive parts • Correct operation despite defects • Over 220,000 hardware defects • 100 times faster than a high-end workstation • Fat Tree Architecture • Rent’s Rule (Figure from Heath et al., A Defect-Tolerant Computer Architecture: Opportunities for Nanotechnology, Science, 1998)
Teramac (contd.) • Lessons for nano-scale devices: • Possible to build a powerful device using defective parts – compensate through high communication bandwidth • Regularity of structure is not essential; connectivity is! • Work around hardware defects through suitable intelligence in software
Cellular Computing • A possible application area of amorphous computing research • Vision of controlling the function of biological cells using chemical mechanisms • Cells can be made to act as computational/sensor units
Protein Synthesis Basics • Protein = Amino Acid Sequence • Encoded in gene as codons • Codons = triplets of nucleotides A, G, T, C • How Synthesis happens: • DNA sequence transcribed onto mRNA • mRNA transcript used by ribosome for synthesizing protein • mRNA depletes and needs replenishing • Repressor and promoter binding proteins catalyze/inhibit transcription
Cellular Gate Technology • Logic Gates using biological cells • Leverage Protein Synthesis Mechanisms • Electrical Signal ↔ Protein Concentration • Use inhibitor to obtain a cellular inverter (NOT gate) • Non-linear behavior obtained through use of multimers
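The inverter behavior can be sketched with a Hill-type repression curve, a common textbook model of gene repression. This is an assumption made for illustration and not necessarily the model used in the cited work; the parameter names `k_half` and `n` and their values are hypothetical. The exponent `n` stands in for the cooperativity that multimers provide.

```python
def inverter_output(repressor, max_output=1.0, k_half=0.5, n=2):
    """Steady-state output protein level for a given repressor concentration.

    n is the cooperativity (n > 1 models repression by multimers);
    k_half is the repressor level at which output drops to half of max_output.
    """
    return max_output / (1.0 + (repressor / k_half) ** n)


for level in (0.0, 0.25, 0.5, 1.0, 2.0):
    # Compare a non-cooperative gate (n=1) with a cooperative one (n=4).
    print(level, round(inverter_output(level, n=1), 3), round(inverter_output(level, n=4), 3))
```

A higher `n` keeps the output closer to its maximum for low inputs and closer to zero for high inputs, which is the non-linear, switch-like response a digital NOT gate needs.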
Cellular Inverter (Figure from Weiss et al., Toward In Vivo Digital Circuits)
Paradigms from Physics • Conservative Systems • Discrete model quantizes conserved quantity into tokens that are exchanged pair-wise between particles • Difficult to maintain conservation in an error-prone system • Holds only if no loss or duplication (isn’t this reliable transmission semantics?)
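The conservation concern on this slide can be illustrated with a toy token-exchange model. The exchange rule used here (one token moves from the richer to the poorer particle of a random pair) and the names `exchange_round` and `loss_prob` are assumptions chosen for the sketch, not a model taken from the paper.

```python
import random


def exchange_round(tokens, rng, loss_prob=0.0):
    """Move one token between a random pair of particles.

    The token leaves the sender unconditionally and reaches the receiver
    only if the transfer message is not lost.
    """
    i, j = rng.sample(range(len(tokens)), 2)
    if tokens[i] == tokens[j]:
        return
    src, dst = (i, j) if tokens[i] > tokens[j] else (j, i)
    tokens[src] -= 1
    if rng.random() >= loss_prob:  # message delivered
        tokens[dst] += 1


def run(loss_prob, rounds=200, seed=0):
    rng = random.Random(seed)
    tokens = [10, 0, 0, 0]
    for _ in range(rounds):
        exchange_round(tokens, rng, loss_prob)
    return tokens


print(sum(run(loss_prob=0.0)), sum(run(loss_prob=0.1)))
```

With reliable delivery the total stays at 10 however many rounds are run; once messages can be lost, the conserved quantity leaks away unless the exchange is made reliable.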
Possible Applications • Amorphous Computers may not provide extremely high-speed computing • But represent a great opportunity for programming self-assembling systems for nano-scale circuit fabrication
Discussion • Focus on use of local knowledge • Similar to localized algorithms for sensor-nets? • Activation/Inhibition/Tropism: • Similar notions have been used by the swarm intelligence community • Stigmergy: communication via modification of environment • sematectonic and sign-based
Conclusion • Amorphous Computing is a futuristic vision • A number of exciting ideas • Many seem plausible given recent advances in MEMS, nanotech and gene-sequencing • Some not-so-new ideas too! • However, research in the area is still at a nascent stage • Difficult to predict future success/failure
An Algorithm for Group Formation and Maximal Independent Set in an Amorphous Computer Radhika Nagpal and Daniel Coore
Addressed Problem • Formation of groups and Maximal Independent Sets in an amorphous computer • Assumptions: • Irregular Lattice • Asynchrony • Wireless Communication with possible collisions • No globally unique identifiers for nodes • Significant node failure rate
Proposed Solution • Clubs Algorithm: • Forms groups with a max. diameter of 2 hops • All processors finally belong to some group • All groups are capable of locally routing internal messages
Clubs Algorithm • Each node: • Chooses a random number M in the range [0, R) • Counts down from M • If it reaches 0 without hearing a recruit message from another node, it declares itself a leader and sends a recruit message • If it hears a recruit message while counting down, it marks itself as a follower and stops its countdown • A follower can continue listening for more recruit messages
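The steps above can be simulated in a few lines. This is a simplified sketch, assuming synchronized steps, lossless recruit messages, and a global view of the graph for bookkeeping; the function names `clubs` and `conflicts` are hypothetical. Two adjacent nodes whose countdowns expire in the same step both become leaders here, which is counted as a leadership conflict.

```python
import random


def clubs(neighbors, R, rng):
    """neighbors: dict mapping node -> set of adjacent nodes.

    Returns (leaders, membership), where membership[v] is the set of
    club leaders that node v belongs to.
    """
    countdown = {v: rng.randrange(R) for v in neighbors}  # M in [0, R)
    leaders = set()
    membership = {v: set() for v in neighbors}
    for step in range(R):
        # Nodes that hit 0 without having heard a recruit message become leaders.
        new_leaders = [v for v in neighbors
                       if countdown[v] == step and not membership[v]]
        for v in new_leaders:
            leaders.add(v)
            membership[v].add(v)
        # Each new leader recruits its non-leader neighbors; followers keep
        # listening, so a node may end up in more than one club.
        for v in new_leaders:
            for u in neighbors[v]:
                if u not in leaders:
                    membership[u].add(v)
    return leaders, membership


def conflicts(neighbors, leaders):
    """Count pairs of adjacent leaders (integer node ids assumed)."""
    return sum(1 for v in leaders for u in neighbors[v] if u in leaders and u < v)


ring = {i: {(i - 1) % 12, (i + 1) % 12} for i in range(12)}
leaders, membership = clubs(ring, R=24, rng=random.Random(0))
print(sorted(leaders), conflicts(ring, leaders))
```

Every member is adjacent to its leader, so any two members of a club are at most 2 hops apart, and after R steps every node is either a leader or a follower of at least one.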
Algorithm Operation (figures) • Clubs gradually forming • After algorithm termination
Analysis of Algorithm • With synchronized processors: • Completes in R steps and produces valid groups • Expected no. of conflicts ≤ (d_avg / (2R)) · N, where d_avg = average node degree and N = total node population • Choosing R = α · d_avg gives expected no. of conflicts ≤ N / (2α)
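The substitution behind the last bullet, written out step by step. The numbers in the second line (α = 10 and N = 2000) are hypothetical values chosen only to make the arithmetic concrete.

```latex
\begin{align*}
E[\text{conflicts}] &\le \frac{d_{avg}}{2R}\,N
  = \frac{d_{avg}}{2\,\alpha\,d_{avg}}\,N
  = \frac{N}{2\alpha} \\
\alpha = 10,\ N = 2000 &\;\Rightarrow\; E[\text{conflicts}] \le \frac{2000}{20} = 100
\end{align*}
```

The bound no longer depends on the node degree once R is scaled with it: a larger α lowers the expected number of conflicts at the cost of a proportionally longer running time of R = α · d_avg steps.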
Analysis (contd.) • Message Complexity • One message per club formed • No global knowledge required; R may be estimated based on locally perceived degree, else using upper bound on node density • No global IDs needed; can use random nos. to distinguish between nodes in 2-hop range