Challenges managing large-scale wireless networks. Lakshminarayanan Subramanian Courant Institute of Mathematical Sciences New York University Joint work with many others. Management Complexity Ladder. Indoor Wireless Access Point Network Multi-hop indoor wireless networks
Challenges managing large-scale wireless networks Lakshminarayanan Subramanian Courant Institute of Mathematical Sciences New York University Joint work with many others
Management Complexity Ladder • Indoor Wireless Access Point Network • Multi-hop indoor wireless networks • Outdoor Mesh Networks • Rural Wireless Networks (Long-distance + mesh)
Why is management hard? • Potential causes for performance degradation • External problems • Interference, Channel fluctuations • Network performance issues • Incorrect ETX, Channel assignment. Routing problems • Physical issues • Radio separation issues • Unreliable power (Huge problem in rural wireless) • Software issues + Configuration • Forwarding problems, unexpected packet drops • Mundane problems • Loose pigtail, Card misbehavior, Card stops working
Why is it hard to fix? • Potential causes are huge and interdependent • No back channels (in multi-hop cases) • Measurements vary by the second • Environmental fluctuations • Power fluctuations • Software behavior on wireless boards is not very predictable • Climbing street poles and towers is actually not fun!
Some experiences • ROMA: Multi-radio indoor wireless network (Aditya, Jinyang) • CitySense: Outdoor wireless mesh network (Matt T, Matt W) • WiLDNet: Long-distance WiFi networks (Rabin, Sergiu, Sonesh, Eric, Manuel) • WiRE architecture (Matt T, Aditya)
Multi-radio mesh promises greater throughput • Intra-path interference (links on the same path cannot transmit concurrently): eliminate • Inter-path interference (paths to different gateways cannot transmit concurrently): reduce
Physical constraints • Compact nodes => few radios per node • Link losses, link variability, external load
Link variability • Two radios report different channel conditions • ETX measurements are skewed • Channel 1 works very poorly, channel 11 works well!
ROMA: basic idea • Each radio in a multi-radio gateway acts as an independent gateway (Figure: single-radio gateway example; links use channels C1, C2, C3 and nodes are labeled with channel sequences <C1>, <C1,C2>, <C2,C3>, <C3,C4>)
Route metrics • Multi-radio route metric: routing metric must consider worst link • Single-radio route metric: path throughput is limited by worst link (Figure: two example paths, each with links labeled 1, 2, 1)
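One way to read the two route metrics above is as an additive cost versus a bottleneck cost. The sketch below is only an illustration under assumptions: the function names and the example link costs are hypothetical, and treating a single-channel path as additive and a multi-channel path as bottleneck-limited is an interpretation of the slide, not a transcription of ROMA's actual metric.

```python
from typing import Sequence


def additive_path_cost(link_costs: Sequence[float]) -> float:
    """Cost when all links on the path share one channel (assumption):
    links cannot transmit concurrently, so per-link costs add up."""
    return float(sum(link_costs))


def bottleneck_path_cost(link_costs: Sequence[float]) -> float:
    """Cost when links use non-interfering channels (assumption):
    links can transmit concurrently, so the worst link dominates."""
    return float(max(link_costs))


# Hypothetical per-link costs (for example, expected transmission times).
example_path = [1.0, 3.0, 1.5]
single_channel = additive_path_cost(example_path)
multi_channel = bottleneck_path_cost(example_path)
```

With these hypothetical costs, the single-channel path is charged for every hop while the multi-channel path is charged only for its worst link, which is why a poor link matters so much to a multi-radio route.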
Link metric • ETT over-estimates link performance • Link metric should incorporate: • Link variability • highly variable links result in unpredictable throughput • External load • Inputs (symbols lost in extraction): average delivery ratio; deviation of delivery ratio; fraction of time channel is busy with external traffic • Formulas for ETT and the conservative metric CETT were shown here (not recoverable from this transcript)
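The slide's own formulas did not survive in this transcript, so the following is only a sketch of how a link cost might fold in the three inputs it names. The formula, function names, and the `floor` parameter are all assumptions made for illustration; they are not the ETT or CETT definitions used by the authors.

```python
def average_only_cost(p_avg: float) -> float:
    """ETX-style cost that looks only at the average delivery ratio."""
    if not 0.0 < p_avg <= 1.0:
        raise ValueError("p_avg must be in (0, 1]")
    return 1.0 / p_avg


def conservative_cost(p_avg: float, p_dev: float, busy_fraction: float,
                      floor: float = 0.01) -> float:
    """Hypothetical conservative cost (assumed form, not from the slides).

    Discounts the delivery ratio by its deviation and scales by the share
    of time the channel is free of external traffic. `floor` keeps the
    denominator positive for very poor links.
    """
    usable_ratio = max(p_avg - p_dev, floor)
    idle_fraction = max(1.0 - busy_fraction, floor)
    return 1.0 / (usable_ratio * idle_fraction)


# A steady link on a quiet channel versus a variable link on a busy one;
# both have the same average delivery ratio.
steady = conservative_cost(p_avg=0.8, p_dev=0.0, busy_fraction=0.0)
variable = conservative_cost(p_avg=0.8, p_dev=0.3, busy_fraction=0.5)
```

Under this assumed form, the two links look identical to an average-only metric but the variable, heavily loaded link is penalized, which matches the slide's point that averages alone over-estimate link performance.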
Our Indoor Testbed • NSC Geode Processors, 128MB RAM, 1GB Flash • Implemented on the Click Modular Router • Patched Madwifi 0.9.3.3
Aggregate performance • Setup: 9 UDP flows from 3 gateways to non-gateway nodes • ROMA's median aggregate throughput is 1.4X or 2.1X that of the alternative designs • ROMA is able to utilize more channels to reduce inter-path interference (Figure: CDF of aggregate throughput in Mbps for three designs: 2 identical channels; 1 common, 1 assigned channel; ROMA)
WiFi-based Long Distance Networks • WiLD links use standard 802.11 radios • Longer range up to 150km • Directional antennas (24dBi) • Line of Sight (LOS) • Why choose WiFi: • Low cost of $500/node • Volume manufacturing • No spectrum costs • Customizable using open-source drivers • Good datarates • 11Mbps (11b), 54Mbps (11g)
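To see why high-gain directional antennas and line of sight matter at these ranges, a rough link budget can be sketched with the standard free-space path loss formula. The 24 dBi antenna gain comes from the slide; the transmit power, frequency, and distance below are assumptions chosen for illustration, and real links lose additional margin to cables, connectors, and fading.

```python
import math


def fspl_db(distance_km: float, freq_mhz: float) -> float:
    """Free-space path loss in dB (distance in km, frequency in MHz)."""
    return 20.0 * math.log10(distance_km) + 20.0 * math.log10(freq_mhz) + 32.44


def received_power_dbm(tx_power_dbm: float, tx_gain_dbi: float,
                       rx_gain_dbi: float, distance_km: float,
                       freq_mhz: float) -> float:
    """Idealized received power: transmit power plus gains minus path loss."""
    return tx_power_dbm + tx_gain_dbi + rx_gain_dbi - fspl_db(distance_km, freq_mhz)


# Assumed values: 20 dBm radio, 2437 MHz (2.4 GHz channel 6), 60 km link.
rx_dbm = received_power_dbm(tx_power_dbm=20.0, tx_gain_dbi=24.0,
                            rx_gain_dbi=24.0, distance_km=60.0,
                            freq_mhz=2437.0)
```

Each doubling of distance costs about 6 dB, so the 48 dB contributed by the two antennas is what makes tens of kilometers feasible with commodity radios.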
AirJaldi Network • Tibetan Community • WiLD links + APs • Links 10 – 40 km • Achieve 4 – 5 Mbps • VoIP + Internet • 10,000 users • Routers used: (a) Linksys WRT54GL, (b) PC Engines WRAP boards • Costs: (a) $50, (b) $140
Aravind Eye Hospital Network • South India • Tele-ophthalmology • All WiLD links • Links 1 – 15 km long • Achieve 4 – 5 Mbps • Video-conferencing • 3000 consultations/month • Routers used: PC Engines WRAP boards, 266 MHz CPU, 512 MB • Cost: $140
New World Record – 382 km • Pico El Aguila, Venezuela • Elev: 4200 meters
Overall Impact • Both networks financially sustainable • 50000 patients/year being scaled to 500000 patients/year • Over 30000 patients have recovered sight
Experience with WiLD Networks • In the field, point-to-point performance is bad • On a 60km link in Ghana • We get 0.6 Mbps TCP vs 6 Mbps UDP • On a relay (single channel) • We get only 2 Mbps TCP
Problem: Propagation Delay • Large propagation delay => high collision probability (Figure: nodes A and B at the two ends of a long link)
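The scale of the problem can be checked with simple arithmetic. The sketch below compares one-way propagation delay with the 20 microsecond slot time of 802.11b; the function names are hypothetical, and the 60 km distance is taken from the Ghana link mentioned earlier in the talk.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
SLOT_TIME_US = 20.0  # 802.11b (long) slot time in microseconds


def one_way_delay_us(distance_km: float) -> float:
    """One-way propagation delay in microseconds over free space."""
    return distance_km * 1000.0 / SPEED_OF_LIGHT_M_PER_S * 1e6


def delay_in_slots(distance_km: float) -> float:
    """Propagation delay expressed in 802.11b slot times."""
    return one_way_delay_us(distance_km) / SLOT_TIME_US


indoor_us = one_way_delay_us(0.1)    # a 100 m indoor or campus link
long_link_us = one_way_delay_us(60)  # the 60 km link from the talk
long_link_slots = delay_in_slots(60)
```

On a 100 m link the delay is a small fraction of a slot, so carrier sensing sees a transmission almost immediately. On a 60 km link the delay spans many slot times, so both ends can sense an idle channel, start sending, and collide.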
Design Choices for WiLDNet • Use Sliding Window flow control • 802.11 MAC ACKs disabled • Packet batches sent every slot • Slot allocation determined by demand • Replace CSMA with TDMA on every link • Alternate send and receive slots
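A minimal sketch of demand-driven slot allocation on one TDMA link follows. It assumes each period is split into one send slot per direction, sized in proportion to queued traffic with a minimum share so neither side starves. The function name, parameters, and the proportional rule are assumptions for illustration, not WiLDNet's actual allocation algorithm.

```python
from typing import Tuple


def split_period(period_ms: float, demand_a: int, demand_b: int,
                 min_share: float = 0.1) -> Tuple[float, float]:
    """Return (slot for A to send, slot for B to send) in milliseconds.

    demand_a and demand_b are queued bytes (or packets) at each end.
    While A sends, B receives, and vice versa, so the two slots
    alternate and together fill the period.
    """
    total = demand_a + demand_b
    share_a = 0.5 if total == 0 else demand_a / total
    # Clamp so the lightly loaded side still gets a slot for its traffic.
    share_a = min(max(share_a, min_share), 1.0 - min_share)
    return period_ms * share_a, period_ms * (1.0 - share_a)


idle_split = split_period(20.0, 0, 0)          # no demand: equal slots
skewed_split = split_period(20.0, 30, 10)      # A has 3x the demand of B
one_sided_split = split_period(20.0, 100, 0)   # clamped by min_share
```

Keeping a minimum share for the idle direction is a design choice in this sketch: with MAC-level ACKs disabled, the reverse slot is what carries the acknowledgements for a sliding-window scheme.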
Inter-Link Interference • Three cases: Simultaneous Send, Simultaneous Receive, Send & Receive • 12dB isolation • Disable CCA (Figure: three panels, each showing nodes A, B, C joined by links 1 and 2)
Channel Loss: From external traffic • Strong correlation between loss and external traffic • Source (A) and interferer (I) do not hear each other (Figure: nodes A, B, and I)
Sustainability Challenges • Bad quality grid power • Higher component failures, more downtimes • Limited local expertise • Local operation, maintenance, and diagnosis difficult • Lack of alternate connectivity • Complicates remote diagnosis and management • Remote locations • Traveling is difficult and infrequent (often once in 6 months)
Poor Quality Power • Spikes and Swells: • Lost 50 power adapters • Burned 30 PoE ports • Low Voltages: • Incomplete boots • HW watchdog fails • Frequent Fluctuations: • CF corruptions • Battery Damage (Figure: voltage range versus number of instances seen over 6 weeks; values not recoverable)
HW Faults • Hardware Faults at Aravind (in 2006) • >90% of faults are power-related • *Conservative Estimate (Table contents not recoverable)
SW Faults • Software Faults at Aravind (in 2006) • *Conservative Estimate (Table contents not recoverable)
Solutions • Power 1.1 Low Voltage Disconnect 1.2 Low-cost Solar Power Controller • Data Collection and Monitoring • Alternate Network Entry Points • Recovery Mechanisms • Safe Software
Power: Low Voltage Disconnect • Low Voltage Disconnect Circuit (LVD) • Disconnect load at low voltage • Prevent battery over-discharge and hung routers • Without LVDs, roughly 50 visits per week for manual reboots at AirJaldi • Off-the-shelf LVDs oscillate too much • Too many automatic reboots • We designed new LVD circuit with better delay • No more manual visits or reboots!
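The oscillation problem can be illustrated with a small software model of a disconnect that uses both hysteresis and a reconnect delay. This is a sketch under assumptions: the real LVD is a hardware circuit, and the class name, thresholds, and hold time below are hypothetical values, not the authors' design parameters.

```python
from typing import Optional


class LowVoltageDisconnect:
    """Model of an LVD: cut the load at low voltage, reconnect only after
    the battery has stayed above a higher threshold for a hold time."""

    def __init__(self, cutoff_v: float = 11.0, reconnect_v: float = 12.5,
                 hold_s: float = 60.0) -> None:
        self.cutoff_v = cutoff_v        # assumed disconnect threshold
        self.reconnect_v = reconnect_v  # assumed (higher) reconnect threshold
        self.hold_s = hold_s            # assumed delay before reconnecting
        self.connected = True
        self._above_since: Optional[float] = None

    def update(self, voltage: float, now_s: float) -> bool:
        """Feed one voltage sample; return True if the load is connected."""
        if self.connected:
            if voltage < self.cutoff_v:
                self.connected = False
                self._above_since = None
        elif voltage >= self.reconnect_v:
            if self._above_since is None:
                self._above_since = now_s
            if now_s - self._above_since >= self.hold_s:
                self.connected = True
        else:
            # Voltage dipped again: restart the hold timer.
            self._above_since = None
        return self.connected


lvd = LowVoltageDisconnect()
samples = [(12.8, 0), (10.5, 1), (12.0, 2), (12.6, 3), (12.6, 30), (12.6, 63)]
states = [lvd.update(v, t) for v, t in samples]
```

A battery's voltage rebounds as soon as the load is removed, so a single threshold with no delay would reconnect immediately and reboot the router over and over. The gap between thresholds plus the hold time is what suppresses that cycle in this model.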
Power: Low-cost Solar Power Controller • Tackle spikes, swells and enable power at remote sites • Features • PPT (peak power tracking) => 15% more power draw • LVD + trickle charging => Doubles battery life • Voltage regulator => No spikes and swells • Power-over-Ethernet => Remote Mgmt • $70 (compared to $300 commercial units) • Have not lost any routers yet in 1 year
Operational Results • Migration at Aravind • 2007: 5 more clinic links (Figure: timeline over Jan’06 – Jun’06, Jul’06 – Dec’06, Jan’07 – Jun’07, Jun’07 – Dec’07 showing Equipment Supply, Installation, Management, and Maintenance split among our support, Aravind, and a local vendor)
Questions? Thank you!