Thoughts and ideas on distributed security Droplet droplet@kernelchina.org
Topics • Non-CP design • CP-based design • RTC (run-to-completion) • Event-driven • Parallel vs pipeline • Message and RPC • Locking and IPC • Resource management • Session affinity • Reorder
Non-CP design [Diagram: IOC1–IOC3 hash incoming packets (P1–P3) with hash1–hash3 and distribute them to SPU1–SPU3]
Non-CP design highlights • Intelligence on the IOC • The IOC distributes packets by hashing • Simple load balance • Performance depends on the traffic pattern • Needs resource synchronization between SPUs • Static allocation on SPUs (inefficient) • Dynamic allocation on SPUs (needs a central resource-management point or resource synchronization between SPUs)
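The hash-based distribution on the IOC can be sketched as below — a minimal illustration, assuming a 5-tuple flow key and three SPUs (the `pick_spu` function and the SPU names are hypothetical, not the actual implementation):

```python
import hashlib

SPUS = ["SPU1", "SPU2", "SPU3"]

def pick_spu(src_ip, dst_ip, src_port, dst_port, proto):
    # Sort the two endpoints first so both directions of a flow
    # hash to the same SPU (symmetric hashing).
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{a}{b}{proto}".encode()
    digest = int(hashlib.sha256(key).hexdigest(), 16)
    return SPUS[digest % len(SPUS)]
```

Because the mapping is purely a function of the packet header, performance depends on the traffic pattern: a few heavy flows can overload one SPU while others sit idle.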
CP-based design [Diagram: the first path goes from the IOCs through the CP; the fast path goes from IOC1–IOC3 directly to SPU1–SPU3 for packets P1–P3]
CP-based design highlights • Intelligence on the CP • The CP is responsible for • Session distribution and load balance • Resource management • Is the CP the bottleneck?
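The CP's session distribution could be sketched as a least-loaded choice — a hypothetical policy for illustration, since the slides do not specify the actual algorithm:

```python
def pick_spu_least_loaded(loads):
    # CP load-balance sketch: assign a new session to the SPU with
    # the fewest active sessions; loads maps SPU name -> session count.
    return min(loads, key=loads.get)
```

Unlike the stateless hash in the non-CP design, this requires the CP to track per-SPU load, which is exactly why the CP can become the bottleneck.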
RTC (run-to-completion) • FT is RTC (run-to-completion) • It's one big loop: no yield, no scheduling • The watchdog is triggered if an FT holds the CPU for 20 seconds on Australia • Methods to break an FT and follow up • Timer events (used by session scan, HA cold sync, etc.) • User events (used by session SZ, etc.) • Does the timer interrupt preempt the FT every 10 milliseconds?
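The break-and-follow-up idea above can be sketched as a loop that voluntarily stops when a time budget runs out — a minimal sketch, assuming a simple callable work queue (the `rtc_loop` name and budget value are illustrative, not the real FT code):

```python
import time

def rtc_loop(work_queue, budget_seconds=0.01):
    # Run-to-completion: drain events in one big loop, but break
    # voluntarily when the budget is spent, so a watchdog (e.g. a
    # 20-second CPU-hold limit) is never tripped; remaining work
    # is picked up on the next entry into the loop.
    deadline = time.monotonic() + budget_seconds
    done = []
    while work_queue and time.monotonic() < deadline:
        done.append(work_queue.pop(0)())
    return done
```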
Event-driven • FT is event-driven • FT polls the DPQs • Runs the callback registered for each DPQ • Locking on queues • Locking between events
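The polling model can be sketched as a small event loop with per-queue callbacks — an illustrative single-threaded sketch (class and method names are assumptions; the real FT would also need the locking mentioned above):

```python
from collections import deque

class EventLoop:
    # Poll a set of data-plane queues (DPQs) and run the callback
    # registered for each queue on every event found.
    def __init__(self):
        self.queues = {}   # name -> (queue, callback)

    def register(self, name, callback):
        self.queues[name] = (deque(), callback)

    def post(self, name, event):
        self.queues[name][0].append(event)

    def poll_once(self):
        handled = []
        for name, (q, cb) in self.queues.items():
            while q:
                handled.append(cb(q.popleft()))
        return handled
```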
Parallel vs pipeline • All threads run in parallel • For a dedicated path, different stages are connected by queues and different threads run as a pipeline
Parallel [Diagram: multiple threads on different queues (Queue1→Thread1, Queue2→Thread2, Queue3→Thread3); multiple threads on the same queue (Queue4→Thread4/Thread5/Thread6)]
Pipeline [Diagram: the same thread on different stages (Thread1 works Queue1→Queue2→Queue3 in turn); different threads on different stages (Thread2, Thread3, Thread4 each own one stage)]
Parallel vs pipeline • Small and simple tasks • Pros • Cache friendly • Cons • Latency • Complicated design and programming • Big and complicated tasks • Pros • Easy design and programming • Cons • Unpredictable runtime
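The pipeline variant — different threads on different stages connected by queues — can be sketched as below, with illustrative stage functions and a `None` sentinel for shutdown (the stage names and transforms are assumptions):

```python
from queue import Queue
import threading

def stage(inq, outq, fn):
    # One pipeline stage: pull from the upstream queue, transform,
    # push downstream; None is the shutdown sentinel.
    while True:
        item = inq.get()
        if item is None:
            outq.put(None)
            return
        outq.put(fn(item))

# Wire two stages in a pipeline: each thread owns one stage.
q1, q2, q3 = Queue(), Queue(), Queue()
t1 = threading.Thread(target=stage, args=(q1, q2, lambda p: p + 1))
t2 = threading.Thread(target=stage, args=(q2, q3, lambda p: p * 2))
t1.start(); t2.start()
for pkt in [1, 2, 3]:
    q1.put(pkt)
q1.put(None)
t1.join(); t2.join()

results = []
while True:
    item = q3.get()
    if item is None:
        break
    results.append(item)
# results == [4, 6, 8]
```

Each packet crosses two queues here, which illustrates the latency cost of pipelining that the slide lists as a con.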
Message and RPC • Messages between IOC/CP/SPU • A kind of RPC • Reliable? • Serialization? • Message type / message size • Message-driven state machine
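One answer to the serialization question is a fixed header carrying message type and size — a sketch under the assumption of a type/length header followed by the payload (the wire format is hypothetical, not the actual protocol):

```python
import struct

# Assumed wire format: 2-byte message type + 2-byte payload length
# (network byte order), then the payload bytes.
HDR = struct.Struct("!HH")

def pack_msg(msg_type: int, payload: bytes) -> bytes:
    return HDR.pack(msg_type, len(payload)) + payload

def unpack_msg(data: bytes):
    msg_type, length = HDR.unpack_from(data)
    return msg_type, data[HDR.size:HDR.size + length]
```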
Locking and IPC • Spinlock and spinlock enhancement • Spinlock • Busy waiting (no sleep, no timeout) • Get lock in random order • Same priority for all lockers • Lock order • Lock reentrance • IPC between FTs • User event
Spinlock enhancement • Read/write lock • Concurrent reads, exclusive writes • Serialization lock • Assign a ticket to each locker so the lock is acquired in order • Sequence lock • A writer can interrupt readers • TryLock • Give up after a timeout • RCU (read-copy-update) • Writers update a copy; very complicated
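The sequence-lock idea — a writer can interrupt readers, and readers retry — can be sketched as below, assuming a single writer (the class is an illustration of the concept, not a production lock):

```python
class SeqLock:
    # Sequence-lock sketch: the writer bumps the sequence before and
    # after updating, so a reader that saw an odd or changed sequence
    # knows a write overlapped its read and retries.
    def __init__(self, value):
        self.seq = 0
        self.value = value

    def write(self, value):
        self.seq += 1          # odd: write in progress
        self.value = value
        self.seq += 1          # even: write complete

    def read(self):
        while True:
            start = self.seq
            if start % 2:      # writer active, retry
                continue
            value = self.value
            if self.seq == start:
                return value
```

The writer never waits for readers, which is why this suits read-mostly data where the writer must not be delayed.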
Resource management • Central management (Resource on CP) • Session/resource handled by all threads • Session and resource are handled by different threads • Session CP/Resource CP • Sparse management (Resource on SPUs) • Static allocation with dynamic allocation • Resource allocation • Allocate one resource per request • Allocate batch of resources per request
Central management [Diagram: SPU threads send session and resource requests to the CP. Variant 1: combined CP threads handle both session and resource requests (needs CPU power?). Variant 2: separate CP session threads and CP resource threads (memory issue; one SPU is special; wasted if there are no resource requests)]
Sparse management [Diagram: SPU1–SPU3 each own a resource bundle (Resource bundle1–3); allocations from the local bundle are local allocations, allocations from another SPU's bundle are remote allocations]
Sparse management • NUMA? • Remote-allocation performance is not predictable • A static resource partition may not match SPU session capacity • Complicated remote-allocation mechanism
Resource allocation [Diagram: the resource client sends a request to the resource repository and receives a reply; local allocation on the client] • Allocate one resource per request • Allocate a batch of resources per request
Resource allocation • Allocate one resource per request • Cons • Too many messages between the resource client and the resource repository • Pros • Simple • Allocate a batch of resources per request • Cons • Needs resource-management functions on both the resource client and the resource repository • Pros • Fewer messages between the resource client and the resource repository
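The batch-per-request trade-off can be sketched as below — an illustrative repository and client (class names, the resource-ID model, and the batch size are assumptions):

```python
class ResourceRepository:
    # Central repository handing out resource IDs (e.g. a pool of
    # port numbers); each allocate() call stands for one message.
    def __init__(self, total):
        self.free = list(range(total))

    def allocate(self, count=1):
        batch, self.free = self.free[:count], self.free[count:]
        return batch

class ResourceClient:
    # Client-side cache: fetch a batch per request to cut message
    # traffic, then serve single allocations locally from the cache.
    def __init__(self, repo, batch_size=8):
        self.repo, self.batch_size, self.cache = repo, batch_size, []

    def get(self):
        if not self.cache:
            self.cache = self.repo.allocate(self.batch_size)
        return self.cache.pop()
```

This shows the cost the slide mentions: both sides now track partial batches, and cached-but-unused resources are stranded on the client.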
Session affinity • Typical case • The packets for one session can be handled by any thread • A thread can handle packets for any session • Session affinity • One thread handles the packets for a session exclusively • Different threads can handle different sessions simultaneously • Serialized state machine
Session affinity [Diagram: packets flow from the DPQ through an LBT into session queues consumed by FTs. Serialization by sequence: the LBT assigns a sequence number and queues the packet; an FT either processes it or queues it and returns. Serialization by locking and event: FTs coordinate on the session queue with locks and events]
Session affinity • Serialization by sequence • Cons • Wasteful • Pros • The LBT is simple • Serialization by locking and event • Cons • The LBT is complex • Pros • Efficient
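The exclusive-owner idea can be sketched as a deterministic session-to-thread mapping — a minimal sketch assuming a hashable session key (the function name and CRC32 choice are illustrative):

```python
import zlib

def owner_thread(session_key: bytes, num_threads: int) -> int:
    # Session affinity: hash each session key to one owning thread
    # index, so every packet of that session is serialized on the
    # same thread while different sessions still run in parallel.
    return zlib.crc32(session_key) % num_threads
```

Because the owner is a pure function of the key, any thread (or the LBT) can compute it without shared state, which keeps the session state machine serialized without per-packet locking.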
Reorder [Diagram: an LBT tags a sequence number on packets from the DPQ; FTs process them out of order; a POT untags the sequence and releases packets in order] • Global ordering: tag the sequence on receiving • Session-based ordering: tag the sequence on matching the session • The LBT and POT are single-threaded
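The POT's in-order release can be sketched as a reorder buffer keyed on the tagged sequence — an illustrative sketch (the class name and heap-based buffering are assumptions, not the actual POT):

```python
import heapq

class ReorderBuffer:
    # Packets are tagged with a sequence number on receive, processed
    # out of order by the FTs, and released strictly in sequence here.
    def __init__(self):
        self.next_seq = 0
        self.pending = []   # min-heap of (seq, packet)

    def push(self, seq, packet):
        heapq.heappush(self.pending, (seq, packet))
        released = []
        while self.pending and self.pending[0][0] == self.next_seq:
            released.append(heapq.heappop(self.pending)[1])
            self.next_seq += 1
        return released
```

Since all releases go through `next_seq`, the buffer must be single-threaded — matching the note that the LBT and POT are single-threaded.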
Summary • A CP is necessary • Real-time / multi-core programming model • Locking is complicated • Resource management is difficult • Applications need to be multi-thread aware • Ordering is a mandatory requirement