Building Reliable, Secure and Manageable Substation Communications Dragan Dokic | CCIE, CISSP, MCSE
Introduction - Experience • Dragan Dokic | President, Summit Energy Tech • Focus on utility sector • Infrastructure systems management • Custom business systems software development • 16 years of experience in IT industry • 10 years in utility sector • Managed network operations for PNGC Power [Portland, OR] from September 2002 to October 2011 • Presentation focuses on lessons learned in field network reliability, security and manageability from this experience
Introduction • PNGC’s 2001 – 2011 field network • 92 office, substation and repeater sites at 11 distribution utilities in Oregon and Idaho • System mission • Gather real-time load data 24/7 for power scheduling operation in Portland • Support local utility SCADA/AMI/Site Security operations
Areas of Focus Reliability Security Manageability Presentation available for download at summitenergytech.com in the Events section
Reliability – Network Design • Keys to success • Diversity in media • Combine land lines, fixed wireless [private/public], mobile wireless and satellite • Diversity in providers • Local and national • Dynamic Routing [OSPF] • Routers exchange knowledge of local network with neighboring routers • Enterprise grade routers / switches a requirement • Perfect world configuration • Private wired/wireless ‘island’ with two Internet gateways using distinct media and distinct providers
Connectivity overview [diagram labels: Primary router, Backup router]
Link cost overview [diagram labels: Primary, Backup]
Link cost calculation • Sub A -> Main Office via satellite tunnel: 3 + 1 = 4
Link cost calculation • Sub A -> Main Office via 900 MHz + DSL tunnel: 1 + 1 + 1 = 3
Open Shortest Path • Link cost via satellite tunnel [4] is higher than via DSL tunnel [3]; therefore, packets traverse the 900 MHz/DSL tunnel in normal operation
Normal Operation Open Shortest Path From substation A to Main Office
Normal Operation Open Shortest Path From substation B to Main Office
Link down operation • If the DSL tunnel is down, packets will traverse the satellite tunnel [diagram: Sub A to Main Office, failed link marked X]
Link down operation • If the DSL tunnel is down, packets will traverse the satellite tunnel [diagram: Sub B to Main Office, failed link marked X]
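The cost comparison and failover behavior on the preceding slides can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how a router implements OSPF: the per-link costs come from the slides, while the dictionary layout and the function names `path_cost` and `best_path` are assumptions made for this example.

```python
# Per-link costs for each path from Sub A to Main Office, as given on the slides.
# The data layout and names below are illustrative assumptions.
PATHS = {
    "satellite tunnel": [3, 1],
    "900 MHz + DSL tunnel": [1, 1, 1],
}


def path_cost(link_costs):
    """Total path cost is the sum of the costs of its links."""
    return sum(link_costs)


def best_path(paths, down=()):
    """Return (name, cost) of the cheapest path that is not marked down."""
    usable = {
        name: path_cost(costs)
        for name, costs in paths.items()
        if name not in down
    }
    if not usable:
        return None  # no path left to the main office
    name = min(usable, key=usable.get)
    return name, usable[name]


# Normal operation: the lower-cost 900 MHz + DSL path is chosen.
print(best_path(PATHS))
# Link down: with the DSL path unavailable, traffic falls back to satellite.
print(best_path(PATHS, down=["900 MHz + DSL tunnel"]))
```

In a real deployment the routers recompute this automatically from link-state updates; the sketch only shows why cost 3 wins over cost 4 and why the satellite path takes over when the cheaper path disappears.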
Security – Overview • Wireless link encryption • Function specific VLANs • No default routes!
Wireless Link Encryption • Media device level [e.g., radio, modem] • WEP, WPA, WPA2 • Routing device level [e.g., Cisco 891 router] • IPsec • End device level [e.g., Digi TS4 port server] • SSL
Security – Wireless Link Encryption [continued] • Most secure option? • Use all three if management overhead is not an issue • Most efficient but secure enough option? • Use routing device site-to-site VPN capabilities • Advantages: • Support for best commercially available security technologies [e.g., AES-256] • Comprehensive change logging capabilities • Standardized configuration throughout the system [less management overhead]
Security – Function Specific VLANs • Define VLANs per business function • SCADA, AMI, Security System, Wireless, VOIP, Network Mgmt. • Firewall traffic between VLANs on a need-to-access basis • E.g., personnel who attach to the substation wireless VLAN to access documentation stored on a server at the main office are prevented from accessing recloser controls in the SCADA VLAN • Reliability advantages • Non-critical VLANs [e.g., AMI, security] can be shut down automatically/remotely if link quality is too poor to carry all traffic, but good enough to carry SCADA
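The need-to-access rule and the shedding of non-critical VLANs can be illustrated with a small default-deny policy table. This is a minimal sketch, not a firewall configuration: the VLAN labels follow the slide, but the `OFFICE_DOCS` destination, the specific allowed flows, and the choice of SCADA as the only critical VLAN are assumptions made for the example.

```python
# Explicitly permitted (source VLAN, destination) pairs; everything else is denied.
# These particular flows are hypothetical examples.
ALLOWED_FLOWS = {
    ("WIRELESS", "OFFICE_DOCS"),   # field staff reading documentation
    ("NETWORK_MGMT", "SCADA"),
    ("NETWORK_MGMT", "AMI"),
}

# VLANs that must stay up on a degraded link (assumption: SCADA only).
CRITICAL_VLANS = {"SCADA"}


def is_allowed(src_vlan, dst_vlan):
    """Default deny: traffic passes only inside a VLAN or on a listed flow."""
    return src_vlan == dst_vlan or (src_vlan, dst_vlan) in ALLOWED_FLOWS


def vlans_to_keep(vlans, link_degraded):
    """On a degraded link, keep only the critical VLANs."""
    if not link_degraded:
        return list(vlans)
    return [v for v in vlans if v in CRITICAL_VLANS]


print(is_allowed("WIRELESS", "OFFICE_DOCS"))  # True
print(is_allowed("WIRELESS", "SCADA"))        # False
print(vlans_to_keep(["SCADA", "AMI", "SECURITY"], link_degraded=True))  # ['SCADA']
```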
Security – No Default Route! • Do not use default routes through service provider-supplied gateways • Define a single host route back to the main office, then establish the default route through the VPN tunnel • This is the most effective method to prevent attacks sourced from the Internet • Always use in conjunction with regular firewall configuration lists [not a substitute!]
Less secure [diagram: provider gateway]
More secure [diagram: provider gateway]
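The difference between the two designs can be seen with a longest-prefix-match lookup over a toy routing table. This is a sketch under stated assumptions: the address `203.0.113.10` is a documentation address standing in for the main office VPN endpoint, and the next-hop labels `provider-gateway` and `vpn-tunnel` are hypothetical names, not real device configuration.

```python
import ipaddress

MAIN_OFFICE = "203.0.113.10"  # placeholder documentation address (assumption)

# Less secure: everything, including unsolicited Internet traffic replies,
# follows a default route through the provider gateway.
less_secure = [
    (ipaddress.ip_network("0.0.0.0/0"), "provider-gateway"),
]

# More secure: only the main office is reachable via the provider gateway;
# the default route points into the VPN tunnel.
more_secure = [
    (ipaddress.ip_network(MAIN_OFFICE + "/32"), "provider-gateway"),
    (ipaddress.ip_network("0.0.0.0/0"), "vpn-tunnel"),
]


def next_hop(routes, destination):
    """Pick the matching route with the longest prefix, as a router would."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if dest in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(next_hop(more_secure, MAIN_OFFICE))     # provider-gateway
print(next_hop(more_secure, "198.51.100.7"))  # vpn-tunnel
print(next_hop(less_secure, "198.51.100.7"))  # provider-gateway
```

With the host route in place, the only destination the site can reach directly through the provider is the main office; every other destination is forced through the tunnel.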
Manageability - Overview • Tools – network management systems • Addressing – developing a scheme • Watchdog system – preventing lockout
Manageability – Tools • Network Management Systems [NMS] • Protocols used • SNMP, Syslog, ICMP, HTTP • Applications • PRTG • SolarWinds Syslog
Manageability – Tools [continued] • How to collect data? Push vs. Pull • Pull: Poll devices using SNMP/HTTP/ICMP at regular intervals [e.g., every • Push: Devices send data per defined event triggers • SNMP traps • Syslog messages • What data to collect? • Availability [ping] • Network utilization • Input voltages • RSSI [radio link quality]
Manageability – Tools [continued] • Pull example: • 5 minute SNMP poll of UPS for input voltage • If voltage stays below the threshold of 108 VAC for longer than 5 minutes, the NMS triggers an alert [e-mail, text message, event log] • But what if voltage drops for only 2 minutes in between polls? You may not know it even happened. • Push comes to the rescue: • UPS sends an SNMP trap to the NMS as soon as voltage drops below 108 VAC • The NMS triggers an alert when the trap is received
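The blind spot between polls can be reproduced with a small simulation. This is an illustrative sketch only: the per-minute voltage readings are made up, the function names are assumptions, and the pull side is simplified to "alert if a polled sample is below threshold" rather than the duration-based rule on the slide. No real SNMP library is used.

```python
THRESHOLD_VAC = 108
POLL_INTERVAL_MIN = 5

# Hypothetical UPS input voltage, one reading per minute for 15 minutes,
# with a 2-minute sag at minutes 6 and 7 (between the polls at 5 and 10).
voltage_by_minute = [120] * 15
voltage_by_minute[6] = 100
voltage_by_minute[7] = 100


def polled_alerts(readings, interval, threshold):
    """Pull: the NMS only sees the readings taken at each poll."""
    return [
        minute
        for minute in range(0, len(readings), interval)
        if readings[minute] < threshold
    ]


def trap_alerts(readings, threshold):
    """Push: the device reports the moment voltage crosses below threshold."""
    alerts = []
    was_low = False
    for minute, volts in enumerate(readings):
        is_low = volts < threshold
        if is_low and not was_low:
            alerts.append(minute)
        was_low = is_low
    return alerts


print(polled_alerts(voltage_by_minute, POLL_INTERVAL_MIN, THRESHOLD_VAC))  # []
print(trap_alerts(voltage_by_minute, THRESHOLD_VAC))                       # [6]
```

Polling at minutes 0, 5 and 10 never samples the sag, while the trap-style check reports it at minute 6.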
Manageability – Addressing • Develop a consistent scheme to use system wide • Recommended private range: 10.0.0.0/8 • First octet: same for entire system • Second octet: site ID [e.g., 8 = Springfield Sub] • Third octet: VLAN/business function ID [e.g., 4 = AMI] • Fourth octet: device itself [e.g., Collector #1] • Subnet mask: 255.255.255.0
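The octet scheme maps directly onto a small helper built on Python's standard `ipaddress` module. This is a sketch of the convention, not a tool from the presentation: the function name `device_address` and its parameter names are assumptions, and the example IDs are the ones given on the slide.

```python
import ipaddress


def device_address(site_id, function_id, device_id, first_octet=10):
    """Build a /24 interface address: fixed octet, site, function, device."""
    for value in (first_octet, site_id, function_id, device_id):
        if not 0 <= value <= 255:
            raise ValueError(f"octet out of range: {value}")
    return ipaddress.ip_interface(
        f"{first_octet}.{site_id}.{function_id}.{device_id}/24"
    )


# Site 8 = Springfield Sub, function 4 = AMI, device 1 = Collector #1.
collector = device_address(site_id=8, function_id=4, device_id=1)
print(collector)          # 10.8.4.1/24
print(collector.network)  # 10.8.4.0/24
print(collector.netmask)  # 255.255.255.0
```

Because the site and function are encoded in the address, an operator can tell from `10.8.4.1` alone that the device is an AMI device at site 8.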
Manageability – Addressing [continued] • Large network? • Group sites by region using second octet • Allows for address summarization if needed • Example: • Eastern division region: • 10.64.0.0 through 10.127.255.255 [second octet 64-127] • Summary address: 10.64.0.0/10 • Western division region: • 10.128.0.0 through 10.191.255.255 [second octet 128-191] • Summary address: 10.128.0.0/10
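The summary addresses can be checked with the standard `ipaddress` module by collapsing each region's per-site /16 blocks. This is a minimal sketch; the helper name `summarize_region` is an assumption, and it treats each site as owning one /16 under the second-octet scheme.

```python
import ipaddress


def summarize_region(second_octets):
    """Collapse the per-site /16 blocks of a region into summary networks."""
    site_blocks = [
        ipaddress.ip_network(f"10.{octet}.0.0/16") for octet in second_octets
    ]
    return list(ipaddress.collapse_addresses(site_blocks))


eastern = summarize_region(range(64, 128))    # second octet 64-127
western = summarize_region(range(128, 192))   # second octet 128-191

print(eastern)  # [IPv4Network('10.64.0.0/10')]
print(western)  # [IPv4Network('10.128.0.0/10')]
```

Each region collapses to a single /10 because its 64 site blocks start on a multiple of 64 in the second octet; a range that is not aligned this way would collapse into several summary routes instead of one.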
Manageability – Watchdog System • General concept • Reboot key remote communications devices if connectivity to central site is interrupted • Benefit • Prevent unnecessary site visits due to • Operator error • Device lock-up [e.g., buggy firmware, heat issues]
Manageability – Watchdog System [continued] • Hardware requirements: • SNMP-capable switched PDU with task scheduling and delayed power cycling command capabilities • Example: APC AP7900 8-port 15A PDU • Software capability requirements: • Centralized command override mechanism using NMS • Send SNMP ‘Set’ to cancel pending power cycling command
Manageability – Watchdog System Example • ‘Delayed’ power cycle schedule is defined on PDU: • Outlets to power cycle: 1, 2 [e.g., radio, router] • Frequency: 60 minutes • Command execute delay: 30 minutes • Network management system running at main office sends an SNMP message that cancels the pending delayed power-cycle command • Frequency: every 5 minutes • Process • If the cancel command cannot reach the PDU at least once during the 30 minute reboot delay period, outlets 1 and 2 will be power cycled and communication will (hopefully!) be restored
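The watchdog logic can be modeled as a simple timing check. This sketch simulates the decision only and sends no SNMP: the function name, the minute-based timeline, and the assumption that a cancel counts if it arrives at or after scheduling and before the delay expires are all illustrative choices, not PDU behavior confirmed by the slides.

```python
REBOOT_DELAY_MIN = 30     # command execute delay from the slide
CANCEL_INTERVAL_MIN = 5   # NMS cancel frequency from the slide


def outlets_power_cycle(cancel_arrivals, scheduled_at=0, delay=REBOOT_DELAY_MIN):
    """Return True if no cancel reaches the PDU during the delay window.

    cancel_arrivals: minutes at which a cancel message actually reached the PDU.
    """
    window_end = scheduled_at + delay
    cancelled = any(scheduled_at <= t < window_end for t in cancel_arrivals)
    return not cancelled


# Healthy link: a cancel arrives every 5 minutes, so nothing is rebooted.
healthy = list(range(CANCEL_INTERVAL_MIN, 60, CANCEL_INTERVAL_MIN))
print(outlets_power_cycle(healthy))        # False

# Link down for the whole window: no cancel arrives, outlets 1 and 2 cycle.
print(outlets_power_cycle([]))             # True

# Link recovers at minute 25: one cancel gets through, no reboot.
print(outlets_power_cycle([25]))           # False

# Link recovers only after the window: the reboot has already happened.
print(outlets_power_cycle([45, 50, 55]))   # True
```

With a 5 minute cancel interval and a 30 minute delay, several consecutive cancel messages have to be lost before the PDU acts, which keeps brief outages from triggering a reboot.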