Robust NEST Systems Minitask Report Lockheed Martin MIT Mitre OSU UC Berkeley University of Virginia Vanderbilt Santosh Kumar @ OSU December 2003
Robustness Questions Given all NEST demos fielded this year, we are in a position to consider: • What robustness properties can be claimed of extant NEST (middleware and) systems? • What robustness issues have been observed by various teams that need to be resolved? • What low technology/cost strategies would defeat/diminish robustness of NEST systems?
List of Sources • Field Experiment robustness experience reports (centred on Mica2 platform): • MIT: Fort Benning Grid • OSU: A Line in the Sand • UVa: Waking Up Big Brother • Vandy: Shooter Localization (includes many suggestions) • Lockheed Martin categorization of various failure scenarios that need to be handled by applications • Mitre evaluation of various short-, medium-, and long-term robustness issues
Robustness MiniTask Red Team Perspective Kenneth W. Parker, Ph.D. December 17, 2003
Our Job When I said, “The Red-Team should ‘shoot down’ the Blue-Team’s proposal”, I didn’t have paint-ball guns in mind. Our task was to search out robustness problems. We focused on transition and deployment impact. The program’s success metric is transition. We did not design an anti-netted-sensor system. We did think about counter-measures. We don’t think this is the top problem. Maybe 2 years from now.
Multiple Scales of Consideration • When viewed on a short time-scale robustness issues have a different character than when viewed on a long time-scale. One (of many) taxonomies: • Flaws and implementation issues. • Engineering issues. • Technology issues. • Fundamental science issues.
Flaws and Bugs Timer Module (9): UCB timer module can exhibit up to 10% error. May have been fixed ??? UCB clock is dead-on. VU timer changes the semantics; less general. VU clock is finer grained than UCB clock (a rare problem). UO timer not yet available. Timers are blocked by tasks. Almost impossible to be sure no task will be running when the timer event occurs. Antennas (9): Monopole antenna with no ground plane. Antenna connectors require special tools (easily broken). We think there’s also an impedance mismatch. MICA2 MAC layer (8): Abstracts away needed timing control. Have retry bugs been fixed ??? Anti-Aliasing Filter (8): Acoustic sensor can’t be used without an anti-aliasing filter.
Flaws and Bugs (cont.) Flash (8): SRAM is about the right size for stack and local variables. Everything else should go in non-volatile memory. Non-volatile memory uses negligible power when sleeping. Can be 100 or 10k times bigger for same power level. Assuming low duty cycle. External flash is too slow. There exists a “fast” write method, but it is not used. External flash is also too small. Test Suite (7): Some better than others. Few showed up with adequate test suites. Fault Identification (6): Need better methods of identifying faults. Hardware and software faults.
Flaws and Bugs (cont.) Wireless Reprogramming (6): Even the single-hop version corrupts the memory too often. No wireless recovery possible. GenericBase chokes on high volume of data. Does not support the 38.4 kbps transfer rate of MICA2. Misc (5): Degaussing circuit. Wind guard on the microphone. Temperature sensitivity. General maintenance rate. 5% per day? Battery management (4): I have a drawer full of batteries that are about 15% used. Don’t work in motes, but do work in most other devices. Mote battery death occurs when voltage drops. Software measurement of remaining capacity (not useful). Programming Boards (3): Burn motes if turned on and external power is connected. Old board often fails to reprogram.
Engineering Challenges Sensor Range (9): • Disruptive tech. usually offers something fundamentally new, in exchange for lower performance according to the legacy metrics. • The “legacy metric” seems to be “sensing area per dollar”. • We will be allowed a higher cost per area if we offer new capabilities. • This is good, since bigger nodes tend to be cheaper per area. • Similarities to Grosch’s Law in computer architecture. • However, jumping to a 3 m range is too big a jump.
Engineering Challenges (cont.) Sensor Range (cont.): Concealment needs limit node size per coverage area. Mica2 ~20 sq cm. One every 15 m might be “lost” in the environment. One every meter is easily found. Useful area ratios 1e5 to 1e6. Sensor range must be greater than the average node spacing (~1.3x). Density requirement stemming from concealment criteria.
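The area ratios quoted above can be checked with simple arithmetic. The following is a minimal sketch, assuming one node per square cell whose side equals the node spacing; the square-grid layout and the function name `area_ratio` are illustrative assumptions, not something stated in the slides.

```python
def area_ratio(spacing_m: float, node_area_cm2: float = 20.0) -> float:
    """Coverage area per node divided by the node's own footprint.

    Assumes one node per square cell of side `spacing_m` (an illustrative
    assumption). The 20 sq cm default is the Mica2 footprint quoted above.
    """
    cell_area_cm2 = (spacing_m * 100.0) ** 2  # metres -> centimetres, then square
    return cell_area_cm2 / node_area_cm2


# The two spacings mentioned in the slides: one node per metre, one per 15 m.
for spacing in (1.0, 15.0):
    print(f"{spacing:>4} m spacing -> area ratio {area_ratio(spacing):.3g}")
```

Under these assumptions a 1 m spacing gives a ratio of 500, while a 15 m spacing gives about 1.1e5, which is the low end of the useful range.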
Engineering Challenges (cont.) Synchronization Metric (9): 30 sec to achieve 8 μs sync. Drifts 30 μs every sec. Good for ~1/4 sec. Assuming ±8 μs drift. i.e., total error ±16 μs. For many synchronization models, accuracy is proportional to synchronization rate. i.e., over some region. Desirable metrics are: Accuracy per duty cycle. Range of applicability. Alternative metrics, e.g., if common model doesn’t apply: Accuracy at 0.5% duty cycle. Duty cycle at 1 ms accuracy.
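The “good for ~1/4 sec” figure follows from the drift rate and the drift budget. A minimal sketch, assuming drift accumulates linearly and adds to the initial sync error in the worst case (the linear model and the function names are assumptions made for illustration):

```python
def hold_time_s(drift_budget_us: float, drift_rate_us_per_s: float) -> float:
    """Seconds until linearly accumulated drift uses up the drift budget."""
    return drift_budget_us / drift_rate_us_per_s


def total_error_us(sync_accuracy_us: float, drift_budget_us: float) -> float:
    """Worst-case error: initial sync error plus the allowed drift."""
    return sync_accuracy_us + drift_budget_us


# Numbers from the slide: 8 us sync accuracy, 30 us/s drift, 8 us drift budget.
print(f"sync stays within budget for {hold_time_s(8.0, 30.0):.2f} s")
print(f"total error bound: +/-{total_error_us(8.0, 8.0):.0f} us")
```

With the slide's numbers this gives roughly 0.27 s between resynchronizations and a ±16 μs total error.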
Engineering Challenges (cont.) Flash-Based Data Store (8): • Read cost comparable to SRAM. • Not with the 3-bit serial interface used in the Mote. • Write cost ~6 times read cost. • Erase cost ~400 times read cost. • Cost per byte can be made low with larger blocks. • Well-known secret: use log-structured file systems. • Always write to end of log. • Update by writing new copy. • Clean blocks before erasing. • Well suited for garbage collectors.
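The log-structured idea in the bullets above can be sketched as a toy in-memory store. This is only an illustration under stated assumptions: the class name `LogStore`, the block size, and the in-memory lists standing in for flash blocks are all hypothetical, and the relative costs (read 1, write 6, erase 400) are taken from the slide.

```python
class LogStore:
    """Toy log-structured store: append-only writes, block-granularity erase."""

    READ_COST, WRITE_COST, ERASE_COST = 1, 6, 400  # relative costs from the slide

    def __init__(self, records_per_block: int = 4):
        self.records_per_block = records_per_block
        self.blocks = [[]]   # each block is a list of (key, value) records
        self.index = {}      # key -> (block_no, slot) of the live copy
        self.cost = 0        # accumulated relative cost

    def put(self, key, value):
        """Always write to the end of the log; an update is just a new copy."""
        if len(self.blocks[-1]) == self.records_per_block:
            self.blocks.append([])
        block_no = len(self.blocks) - 1
        self.blocks[block_no].append((key, value))
        self.index[key] = (block_no, len(self.blocks[block_no]) - 1)
        self.cost += self.WRITE_COST

    def get(self, key):
        """Read the live copy of a key via the index."""
        block_no, slot = self.index[key]
        self.cost += self.READ_COST
        return self.blocks[block_no][slot][1]

    def clean(self, block_no: int):
        """Copy live records to the log tail, then erase the whole block."""
        if block_no == len(self.blocks) - 1:
            raise ValueError("cannot clean the block currently being written")
        live = [(k, v) for slot, (k, v) in enumerate(self.blocks[block_no])
                if self.index.get(k) == (block_no, slot)]
        for k, v in live:
            self.cost += self.READ_COST
            self.put(k, v)
        self.blocks[block_no] = []  # block erased
        self.cost += self.ERASE_COST


store = LogStore(records_per_block=2)
store.put("a", 1)
store.put("b", 2)
store.put("a", 3)   # update: new copy at the tail, old copy becomes garbage
store.clean(0)      # moves the live "b" record, then erases block 0
print(store.get("a"), store.get("b"), store.cost)
```

The erase cost dominates, which is why amortizing one erase over a larger block lowers the cost per byte. A real design would also reuse erased blocks, which this toy does not.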
Engineering Challenges (cont.) Service Composition Model (8): Services developed and demonstrated in isolation can’t be combined. Key problem is timing conflicts. Timing knowledge is implicit. May have heard a viable standard: Use pseudo-random timing. Low duty cycle service. Timing collisions result in clean event loss. All servers can handle occasional event loss. If this is “the” composition method, it’s underused. LPI and LPD (7): All active signals must be below the noise floor (at receiver). Must track noise level. SNR determines the ratio of signal range to discovery range. Typically need coding gain plus SNR to be about +6 dB. Detectability of 1 spot in 1000 might need 36 dB coding gain. 1 s interval; 17 min. per node.
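The LPI/LPD figures above are consistent with one plausible reading: picking one spot out of 1000 costs 10·log10(1000) = 30 dB of processing gain, plus the ~6 dB margin, and checking 1000 spots at a 1 s interval takes about 17 minutes. The sketch below checks that arithmetic. This interpretation of the slide, and the function names, are assumptions.

```python
import math


def required_coding_gain_db(num_spots: int, margin_db: float = 6.0) -> float:
    """Gain to single out one spot among `num_spots`, plus a detection margin.

    Assumed reading of the slide: 10*log10(num_spots) dB plus ~6 dB.
    """
    return 10.0 * math.log10(num_spots) + margin_db


def scan_time_min(num_spots: int, interval_s: float = 1.0) -> float:
    """Minutes needed to check every spot once at one spot per interval."""
    return num_spots * interval_s / 60.0


print(f"coding gain: {required_coding_gain_db(1000):.0f} dB")
print(f"scan time:   {scan_time_min(1000):.1f} min")
```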
Engineering Challenges (cont.) Packaging (7): Concealment. Sensor/environment interfaces. Hydrophones. Better Non-Volatile Memory (6): Several new non-volatile memory technologies. Ferroelectric memories. Magneto-resistive memories. Ovonic unified memories. Ideal for low-duty-cycle designs. Debugging Harness (4): Useful for development; not used in deployment. Distributed debugging may require far higher comm. rate than the actual application. Development cycles use far more power than deployment.
Technology Challenges Over-the-air Programming (10): Efficient reliable multi-cast. OS-style security. An errant program should not be able to prevent loading. Dynamic linking and loading. Make incremental structure explicit rather than trying to discover it after the fact. Extra finer grained. Multiple levels of security: Factory approved loads. Platforms (third party code). Over-the-air Programming (cont.): Platform grade security: Protection from live-lock. Protection from dead-lock. Protection from corruption. May require preemption? Issue will be around for years. Phenomenal progress. Good enough for FY04. We’re far from commercial.
Technology Challenges (cont.) Almost Always Off Comms (9): Tends to violate traditional comm. assumptions. Complex trade space. Higher power routes may be lower latency. Wakeup rate vs. latency. Different tasks may require different wakeup rates. Emergent scheduling vs. centralized scheduling. Randomized schedules. Combining multiple services yields complex schedules. Different states will require different wakeup rates. Node Localization (8): Must be very robust. Multipath is the key problem. Improved ranging. Not sure it’s really this hard.
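The wakeup-rate versus latency trade can be illustrated with a deliberately simple model. It assumes unsynchronized nodes that each wake once per period, a sender that must wait for the next hop's next wakeup (uniformly distributed, so half a period on average), and negligible transmission time. The model and the function names are assumptions for illustration only.

```python
def duty_cycle(listen_s: float, period_s: float) -> float:
    """Fraction of time the radio is on when listening once per period."""
    return listen_s / period_s


def mean_path_latency_s(hops: int, period_s: float) -> float:
    """Expected end-to-end wait: on average half a wakeup period per hop."""
    return hops * period_s / 2.0


# Example: 10 ms listen window once per second.
print(f"duty cycle: {duty_cycle(0.010, 1.0):.1%}")
# A low-power route with many short hops vs. a higher-power route with fewer hops.
print(f"10-hop route: {mean_path_latency_s(10, 1.0):.1f} s")
print(f" 4-hop route: {mean_path_latency_s(4, 1.0):.1f} s")
```

In this model the higher-power route with fewer hops has lower latency at the same wakeup rate, which is one way to see why the trade space is complex.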
Technology Challenges (cont.) New Sensor Modes (IFF) (7): What is the best way to detect people? Nature suggests not sound. Really want electronic olfactory. Are there any sensor modes for identifying combatants? Which environments require or benefit from proximal sensing? What quantities are most useful in more congested environments? Chem., bio., speech, power usage, civilian flight, … Tragedy of the Commons (6): No incentive to be a responsible user of shared resources. No enforcement. Not even a widely agreed upon definition of what is fair use. Want better wealth distribution, but not communism.
Technology Challenges (cont.) Byzantine Behavior Model (6): Unavoidable: Some nodes jabber continuously. Some sensors “see ‘reds’ under their beds”. In a real deployment a few nodes may be compromised. Need mixed probabilistic and worst case analysis framework, e.g., detection and tracking. Also need robustness with respect to event loss. Would greatly improve the viability of security.
Technology Challenges (cont.) Small Antennas (5): • We built a proper mono-pole antenna (per user manual). • It worked great. • In almost all environments. • Of course, it’s useless. • Building small antennas is hard. • Need to adapt to near-field environment (e.g., loading). • Incentive for longer wavelength (e.g., foliage). • Software adjustable antennas. • High dielectric antennas.
Fundamental Scientific Issues Signal Processing Power (8): For a technology family, joules per bit-op is nearly constant over a wide range of node sizes. Key to the vision of small nodes. Given a fixed energy supply, not much incentive to use a more powerful node. Mica2 gets ~3 GbOPJ. State of the art ~20 GbOPJ. Not ~300 GbOPJ. Moore’s law helps this metric slowly (compared to intuition). Doubles every ~4 to 6 years. Signal Processing Power (cont.): Big wins can be had. ASICs are typically 400x more efficient. Fully custom designs may be 1000x more efficient. Requires use of non-GPP. DSPs 4 to 12x. PolyMorphic computing. FPGAs 100x. The alternative is to use complex signal processing algorithms. Clearly part of the vision. May not be enough.
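To put the architecture gains above in perspective, they can be converted into the number of years of process scaling that would deliver the same efficiency gain, using the slide's doubling time of 4 to 6 years. A minimal sketch, assuming pure exponential doubling (the function name is illustrative):

```python
import math


def equivalent_years(gain: float, doubling_years: float) -> float:
    """Years of efficiency doubling needed to match a given gain factor."""
    return doubling_years * math.log2(gain)


# Gain factors quoted in the slide, evaluated at both ends of the 4-6 year range.
for name, gain in (("DSP (12x)", 12), ("FPGA (100x)", 100), ("ASIC (400x)", 400)):
    low = equivalent_years(gain, 4.0)
    high = equivalent_years(gain, 6.0)
    print(f"{name:>12}: {low:.0f} to {high:.0f} years of scaling")
```

Under this assumption, a 400x ASIC gain corresponds to roughly 35 to 52 years of waiting for process scaling alone.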
Fundamental Scientific Issues (cont.) N-Log-N (8): • Taxonomy of scaling: • O(N); centralized/monolithic. • O(sqrt(N)) or O(cbrt(N)); non-scalable. • O(N Ln(N)); quasi-scalable. • O(N); absolutely scalable. • However, can’t implement a spanning tree with motes. • Comm. range limits size. • N ln(N) implies Log_2(N) layers. • i.e., 10 to 20 layers (not 2). N-Log-N (cont.): Not going to build 13 different sized motes. But we could build configurable motes. SDR allows you to vary the comm. range. You can vary the clock rate. You may even be able to reconfigure the CPU. Use a sentry-like service to vary the power rate. Still need 2 or 3 types of hardware, but not 10 to 20.
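The “10 to 20 layers” figure corresponds to Log_2(N) for networks of roughly a thousand to a million nodes. A minimal sketch, assuming each layer of the hierarchy halves the number of nodes (the binary fan-out and the function name are assumptions; the slide only gives Log_2(N)):

```python
def layers(num_nodes: int) -> int:
    """Smallest number of layers L such that 2**L >= num_nodes."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    return (num_nodes - 1).bit_length()  # exact integer ceil(log2(num_nodes))


for n in (1_000, 1_000_000):
    print(f"{n:>9,} nodes -> {layers(n)} layers")
```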
Fundamental Scientific Issues (cont.) Programmable Analog Triggers (7): Negligible-power wakeup triggers can be built for most <sensor, target, app> triplets. Allow lower duty cycles. Simpler software; no polling of environment. However, such triggers would have to be configurable in-situ. Need the analog counterpart to FPGAs. Signal Processing Methods (7): Legacy signal processing methods assume: Far-field. Essence of problem is extracting signal in low SNR. Abundant computation. Next generation of signal processing will need to assume: Near-field. Special structure of signals is observable and useful. Essence of problem is finding the few high SNR signals. Computation is battery.
Fundamental Scientific Issues Non-Radio Comms (6): Short range and low bandwidth may not favor RF comm. Acoustic comm. may work. E-field comm. will work. Laser comm. IrDA. Chaos Theory (5): A lot of work; not many answers. Controlling emergent behavior will eventually be a critical problem.
Laundry List Effect • A major problem with “top-10” lists is they invite a “Laundry List” response from the audience. • I’m not trying to create a checklist; I’m suggesting priorities.
Frequently Asked Questions Appendix I
Disruptive Technology 101 • Positive feedback exists in technology adoption. • Sales volume -> lower costs -> sales volume. • If the feedback is strong enough, the timing of the technology transition becomes chaotic. • Sensitive to such small events as to approach randomness. • New technology may be extremely difficult. • The P4 design team was bigger than the Manhattan Project design team. • The pace of change is fast. • It’s not always clear which technology will win. • The new technology may be in an unrelated field. • Thin film disks required replacing lots of MEs with EEs. This may be why technologists fail; it’s not why (properly managed) technology companies fail.
Disruptive Technology 101 (cont.) • Most abrupt changes in technology are not disruptive. • Most of the time the leader in the old technology is the first-mover and the eventual leader in the new technology. • The disruptive transitions occur when performance outpaces customers’ needs. Driving force is that Moore’s law outruns any sensible growth in demand. • Successful customers and companies anticipate the sustaining changes. • Disruption occurs when the “low-tech” solution wins. • New tech. underperforms, but has other advantages. • Usually fundamentally different advantages. • New market becomes large and subsumes old market.