Learn the interdisciplinary systems approach to network processing systems design, implementation principles, security forensics, and packet buffering. Understand hash tables, the algorithmics approach, and a taxonomy of design principles.
ECE 526 – Network Processing Systems Design • System Implementation Principles I • Varghese Chapter 3
Outline • What is network algorithmics • A clarifying example • Implementation principles • Reflect on what we learned
Network Algorithmics • Different from network algorithms • Network algorithmics: an interdisciplinary systems approach to streamlining network implementations • A design problem will clarify the difference
Security Forensics • Identifying a network intrusion (e.g., port scanning) needs a period of time (e.g., 10 seconds) • A suspected intrusion is recorded as flow state in a suspicion table • Flow state is updated by checking each incoming packet from the intruder • Once the suspicion table deems a suspicious flow bad, the evidence (all packets from the intruder) is reported to the forensic system
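The flow-state update above can be sketched as a small table keyed by flow ID. This is a minimal illustration, not the chapter's implementation; the per-flow packet counter and the threshold of 5 are assumptions for the example.

```python
class SuspicionTable:
    """Sketch of a suspicion table: each incoming packet from a suspect
    updates its flow's state; once the state crosses a threshold, the
    flow is deemed bad (counter + threshold are illustrative)."""

    def __init__(self, threshold=5):
        self.threshold = threshold
        self.state = {}  # flow_id -> packet count observed so far

    def update(self, flow_id):
        """Record one packet; return True when the flow is deemed bad."""
        self.state[flow_id] = self.state.get(flow_id, 0) + 1
        return self.state[flow_id] >= self.threshold
```

In a real system the state would be richer (timestamps, scan patterns), but the structure is the same: per-packet update, threshold-based promotion to "bad".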
Packet Buffering • A packet buffer is needed to store all packets (100,000) during the 10-second window, in case they are needed later as evidence • The buffer is bounded: new packets are inserted at the head of the queue, and packets at the tail are discarded when the buffer is full
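The bounded head-insert/tail-discard queue can be sketched with a fixed-capacity deque; the class and method names here are illustrative, not from the text.

```python
from collections import deque

class PacketBuffer:
    """Bounded FIFO evidence buffer: newest packets enter at the head;
    when the buffer is full, the oldest packet at the tail is discarded
    automatically to make room (minimal sketch)."""

    def __init__(self, capacity=100_000):
        # deque with maxlen drops from the opposite end on overflow
        self.buf = deque(maxlen=capacity)

    def insert(self, pkt):
        self.buf.appendleft(pkt)  # newest at head; oldest falls off tail

    def __len__(self):
        return len(self.buf)
```

A hardware implementation would use a circular buffer over DRAM, but the head-insert/tail-discard discipline is the same.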
Evidence Collection • Collecting all packets belonging to the flow deemed an attack from the 100,000-packet queue • The collection has to be fast enough that new incoming packets do not overwrite the evidence
Hash Table -- Algorithm • Real problem: quickly find all packets belonging to flow F • Store pointers to the packets of each flow in a linked list • A pointer to the head of each linked list is stored in a hash table • Advantage: quickly finds the malicious flow • Disadvantages: • Maintain a hash table • Update the linked list of each flow on every packet • Read out the packets of the malicious flow
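A minimal sketch of the hash-table scheme, assuming packets are referenced by position in the buffer; a Python list stands in for the per-flow linked list.

```python
class FlowIndex:
    """Hash table mapping a flow ID to the chain of buffered packets
    belonging to it, so all packets of a malicious flow can be found
    without scanning the whole 100,000-packet queue (sketch)."""

    def __init__(self):
        # flow_id -> list of packet references (stands in for the
        # linked list whose head pointer lives in the hash table)
        self.table = {}

    def add(self, flow_id, pkt_ref):
        """Called on every incoming packet: link it into its flow's chain."""
        self.table.setdefault(flow_id, []).append(pkt_ref)

    def packets_of(self, flow_id):
        """Walk the chain for one flow, e.g., the flow deemed malicious."""
        return self.table.get(flow_id, [])
```

The cost the slide flags is visible here: `add` runs on every packet, even though only a tiny fraction of flows ever turn out to be malicious.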
Algorithmics Approach • Once attack flow F is identified, don't attempt to immediately identify all packets of flow F • Instead, identify each packet individually as it reaches the end of the packet queue • Shift computation in time – evaluate lazily
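The lazy approach can be sketched as a per-packet check at dequeue time; the function name and packet representation (flow ID, payload) are illustrative.

```python
from collections import deque

def drain(queue, bad_flows):
    """Lazy evidence collection: instead of searching the queue when a
    flow is flagged, test each packet individually as it exits the tail
    of the queue, reporting it if its flow is in the bad set (sketch)."""
    evidence = []
    while queue:
        flow_id, payload = queue.pop()  # packet reaching the end of the queue
        if flow_id in bad_flows:
            evidence.append((flow_id, payload))  # report to forensic system
    return evidence
```

No hash table or per-flow lists to maintain: the only extra work is one set-membership test per departing packet, and every packet of a bad flow is still caught before it is overwritten.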
Taxonomy of Principles • P1-P5: System-oriented Principles • These recognize/leverage the fact that a system is made up of components • Basic idea: move the problem to somebody else’s subsystem • P6-P10: Improve efficiency without destroying modularity • “Pushing the envelope” of module specifications • Basic engineering: system should satisfy spec but not do more • P11-P15: Local optimization techniques • Speeding up a key routine • Apply these after you have looked at the big picture
P1: Avoid Obvious Waste • Key Concept: look for ways to avoid doing something, or to avoid doing it multiple times • Example: copying in protocol stacks • In the old days, copying was not a bottleneck • But transmission & processor speeds increased much faster than memory speeds • Eventually multiple copies became a bottleneck • Today, reading/writing from/to memory is the slowest operation on a packet • “Zero-copy” protocol stacks never copy packet data once it is stored in memory by the NIC
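A tiny Python illustration of the copy-avoidance idea (not how a kernel stack is implemented): slicing a `bytes` object copies the data, while a `memoryview` exposes the same region without copying.

```python
def header_no_copy(packet: bytes, hdr_len: int = 20):
    """P1 sketch: return a zero-copy view of the packet header.
    `packet[:hdr_len]` would allocate and copy; a memoryview slice
    shares the underlying buffer instead."""
    return memoryview(packet)[:hdr_len]
```

The 20-byte header length is an assumption (an IPv4 header without options); the point is only that the view aliases the original buffer rather than duplicating it.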
P2: Shift Computation in Time • Key Concept: move expensive operations from where time is scarce to where it is more plentiful • P2a: Precompute (= shift earlier in time) • Example: computing a function by table lookup • P2b: Evaluate lazily (= shift later in time) • Example: network forensics • P2c: Share expenses (= collect things in time) • Example: batch processing
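A classic P2a sketch: precompute a 256-entry table of per-byte bit counts once at start-up, then answer per-packet queries by lookup instead of recomputation. The function name and use of popcount are illustrative, not from the text.

```python
# P2a (precompute): one-time table build, shifted out of the fast path.
POPCOUNT = [bin(b).count("1") for b in range(256)]

def bits_set(data: bytes) -> int:
    """Count set bits in a byte string by table lookup per byte."""
    return sum(POPCOUNT[b] for b in data)
```

The same pattern applies to any function of a small input: spend memory and start-up time once to make each later evaluation a single indexed read.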
P3: Relax Subsystem Requirements • Key Concept: make one subsystem’s job easier by relaxing the specification (possibly at the expense of another subsystem’s job getting slightly harder) • P3a: Trade certainty for time (= use probabilistic solutions) • Example: random sampling • P3b: Trade accuracy for time (= use approximate solutions) • Example: hashing, Bloom filters • P3c: Shift computation in space (= let someone else do it) • Example: fast path/slow path
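A minimal Bloom filter sketch for P3b: membership tests can report false positives but never false negatives, trading exactness for constant space and time. The sizes (m = 1024 bits, k = 3 hashes) and the SHA-256-based hashing are assumptions for the example.

```python
import hashlib

class BloomFilter:
    """P3b sketch: approximate set membership. No false negatives;
    a tunable false-positive rate in exchange for fixed, small space."""

    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = bytearray(m)  # one byte per bit position, for simplicity

    def _positions(self, item):
        # Derive k bit positions from k salted hashes of the item.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:4], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item):
        # All k bits set => "probably present"; any bit clear => definitely absent.
        return all(self.bits[p] for p in self._positions(item))
```

Hardware versions use simple multiplicative hashes rather than SHA-256; the structure is what matters here.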
P4: Leverage Off-System Components • Key Concept: design bottom-up, use hardware • P4a: Exploit locality (= exploit caches) • Note: not always useful (cf. IP forwarding lookups) • P4b: Trade memory for speed • Note: this can be interpreted two ways • Use more memory to avoid computation (cf. P12) • Example: table lookup • Use less memory to make data structures fit in cache • Example: data compression • P4c: Exploit hardware features • Example: incremental checksum computation • Example: onboard address recognition & filtering shifts “processing” duty from CPU to NIC
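The incremental checksum example can be made concrete: when one 16-bit header field changes, the Internet checksum can be patched in ones-complement arithmetic (the HC' = ~(~HC + ~m + m') form from RFC 1624) instead of resumming the whole header. This is a sketch over bare 16-bit fields, not a full IP header implementation.

```python
def ones_complement_add(a, b):
    """16-bit ones-complement addition with end-around carry."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)

def full_checksum(fields):
    """Reference: ones-complement sum over all fields, then invert."""
    s = 0
    for f in fields:
        s = ones_complement_add(s, f)
    return ~s & 0xFFFF

def incremental_checksum(old_cksum, old_field, new_field):
    """P4c sketch: update the checksum after one field changes,
    without touching the other fields: HC' = ~(~HC + ~m + m')."""
    s = ~old_cksum & 0xFFFF
    s = ones_complement_add(s, ~old_field & 0xFFFF)
    s = ones_complement_add(s, new_field)
    return ~s & 0xFFFF
```

This is exactly the kind of operation routers offload: TTL decrements change one field per hop, so the checksum update is a handful of adds rather than a pass over the header.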
P5: Add Hardware to Improve Performance • Key Concept: hardware is inherently parallel, can be very fast, and is cheap in volume; consider adding hardware to perform operations that must be done often and are expensive in software • P5a: Use memory interleaving, pipelining (= parallelism) • P5b: Use wide-word parallelism (save memory accesses) • P5c: Combine SRAM and DRAM (keep the low-order bits of each counter in SRAM for a large number of counters)
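The P5c counter scheme can be sketched in software: small fast counters hold only the low-order bits, and each is flushed to a full-width slow counter before it can overflow. Two Python lists stand in for SRAM and DRAM; the 8-bit width is an assumption.

```python
class SplitCounters:
    """P5c sketch: low-order bits of each counter live in fast memory
    (SRAM); full-width counters live in slow memory (DRAM) and are
    touched only on overflow, not on every increment."""

    def __init__(self, n, low_bits=8):
        self.low_max = 1 << low_bits
        self.sram = [0] * n  # small, fast, per-packet counters
        self.dram = [0] * n  # wide, slow, backing counters

    def increment(self, i):
        self.sram[i] += 1
        if self.sram[i] == self.low_max:  # about to overflow: flush
            self.dram[i] += self.sram[i]
            self.sram[i] = 0

    def read(self, i):
        return self.dram[i] + self.sram[i]
```

With b low-order bits, the slow memory is accessed only once every 2^b increments, which is why a small amount of SRAM suffices for millions of counters.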
P6: Replace inefficient general routines with efficient specialized ones • Key Concept: general-purpose routines cannot leverage problem-specific knowledge that can improve performance • Example: NAT using forward and reverse translation tables
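A sketch of the NAT example, assuming port-based translation: two specialized maps, one per direction, replace any general-purpose search. The starting public port (40000) and field layout are illustrative.

```python
class NatTable:
    """P6 sketch: a NAT keeps two specialized tables instead of one
    general routine: forward (inside addr/port -> public port) and
    reverse (public port -> inside addr/port), so both directions
    are a single hash lookup."""

    def __init__(self):
        self.forward = {}  # (inside_ip, inside_port) -> public_port
        self.reverse = {}  # public_port -> (inside_ip, inside_port)
        self.next_port = 40000

    def outbound(self, inside_ip, inside_port):
        """Translate an outgoing packet, allocating a binding on first use."""
        key = (inside_ip, inside_port)
        if key not in self.forward:
            self.forward[key] = self.next_port
            self.reverse[self.next_port] = key
            self.next_port += 1
        return self.forward[key]

    def inbound(self, public_port):
        """Translate an incoming packet back to the inside host, if bound."""
        return self.reverse.get(public_port)
```

Each table is tuned to its direction's lookup key, which is the specialization P6 is about.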
P7: Avoid Unnecessary Generality • Key Concept: Do not include features or handle cases that are not needed. • "When in doubt, leave it out" • Example: RISC, microengine
Reminder • Read Varghese Chapter 3 • NO Class, Wed. 11/5 • on-line quiz