SystemClick – A Domain-specific Framework for Early Exploration Using Functional Performance Models. Christian Sauer*, Matthias Gries, Hans-Peter Löb Access Communications Solutions, Infineon Technologies, Munich, Germany * Now with Cadence design systems in Munich sauerc@cadence.com.
Anaheim, June 11th, 2008
Broadband Access Networks
[Figure: network overview with the labels Metro Ethernet Access, MAN/Metro Access, GB-Ethernet/Ethernet over SONET, T/E, SHDSL, ADSL, VDSL, Customer Premises Equipment]
• Broadband access: xDSL / ATM / Ethernet / E1-T1
• Home gateways and routers
• Includes WLAN protocols IEEE 802.11a/b/g / e / n …
• Protocol interworking, traffic aggregation, Quality-of-Service
• Diverse protocols and changing per-packet functions
Objectives
• Products for wireless access points / home gateways
• Based on a flexible and scalable packet processing platform
• Support for current and future home networking protocols
• Easy to program for customers, yet cost-efficient
• Requires careful application-driven platform development
• Protocol timing is part of the specification
• Precise performance estimation
• Early in the design process
• Quantitative evaluation of alternatives
• What do I need to do in hardware to meet requirements?
• What can be done in (should be left to) software for flexibility?
Outline
• Click model of IEEE 802.11x access points
• SystemClick framework: performance simulation of Click models in SystemC
• Exploration results: fully flexible single and dual CPU targets
Modeling Wireless Protocols with Click
[Figure: Click example with Source, Sink, Tx1, Rx1, and Air elements connected by push and pull ports, plus a "Medium busy" signal]
• Framework for composing packet processing applications
• Domain-specific declarative language, widely used
• Elements process and pass packets and form a directed task graph
• Modular, extensible, implementation independent, and executable
• Click extensions
• Flow of control information: a token represents state
• Non-packet data types (tokens, symbols) are mapped to Click packets
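The push and pull hand-offs named on this slide can be illustrated with a small Python sketch. It is not Click code (Click elements are C++ classes wired together in Click's own configuration language); the classes `Source`, `Queue`, and `Sink`, their methods, and `run_chain` are hypothetical stand-ins chosen only for illustration.

```python
from collections import deque


class Queue:
    """Stores packets: upstream elements push into it, downstream elements pull from it."""

    def __init__(self):
        self._packets = deque()

    def push(self, packet):
        self._packets.append(packet)

    def pull(self):
        # Return None when empty, mirroring a pull that yields no packet.
        return self._packets.popleft() if self._packets else None


class Source:
    """Push element: hands each generated packet to its downstream neighbor."""

    def __init__(self, downstream):
        self._downstream = downstream

    def run(self, packets):
        for packet in packets:
            self._downstream.push(packet)


class Sink:
    """Pull element: asks its upstream neighbor for packets when it is ready."""

    def __init__(self, upstream):
        self._upstream = upstream
        self.received = []

    def run(self):
        while (packet := self._upstream.pull()) is not None:
            self.received.append(packet)


def run_chain(packets):
    """Wire Source -> Queue -> Sink and move all packets through the graph."""
    queue = Queue()
    source, sink = Source(queue), Sink(queue)
    source.run(packets)  # push side fills the queue
    sink.run()           # pull side drains it
    return sink.received
```

The queue is the only place where a push path meets a pull path, which is why it decouples the rate of the producer from the rate of the consumer.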
Model Overview: IEEE 802.11 a/b/g (+e) Access Point
[Figure: Click task graph of the access point between Host and Air / PHY, with a transmit path, a receive path, and a shared channel. Elements shown include Queue, PrioSched, Classifier, WifiFragment, WifiSeq, WepEncap, WifiEncap, SetRTS, ProbeTxRate, SetCRC32, SetDuration, PaintSwitch, Paint(3), DCF with rate selection, CheckCRC32, BC-Filter, UpdateNAV, GenAck, GenCTS, WifiDupeFilter, WifiDefrag, WifiDecap, WepDecap, HostEtherFilter, Tee, BeaconSource, and Scanner & Tracker. Frame classes: control (ACK/CTS, RTS), management (beacons and probes, other management frames), data (unicast, multicast). Signals: chain busy, medium busy, carrier sense, scheduled packet.]
11a/b/g (+e) Access Point Model
[Figure: the access point task graph shown in three build steps, each highlighting execution paths through the graph:
A) Outbound transaction, B) Inbound transaction
C) Outbound acknowledge, D) Inbound acknowledge
E) Outbound RTS/CTS, F) Inbound RTS/CTS, with the steps (1) RTS, (2) RTS, (3) CTS, (4) CTS marked]
Atomic Data Frame Transfer
[Figure: timeline of a frame exchange between STA and AP: DIFS & backoff, RTS, SIFS, CTS, SIFS, Data frame, SIFS, ACK]
• Protocol timing is part of the specification
• E.g., the extremely tight SIFS deadline of 16 µs
• Timing-correct protocol interaction
• Precise performance estimation required
The Y-chart using SystemClick
[Figure: Y-chart. The application (system function model, given as a Click task graph) and the platform resources (architecture model, given as a Click resource description) are joined in a mapping step that yields an annotated Click model. SystemClick code generation turns this into a SystemC model for simulation; profiling fills the performance database used by the simulation.]
Representation of an Application-Architecture Mapping
[Figure: a Click application (frame input, FromEth, Classifier, WifiFragment, WifiEncap, PrioSched, ToEth, frame output) mapped onto computation resources (R_IO, R_CPU, R_CoP, R_CPU) and communication resources (R_Bus, R_Bus)]
SystemClick: SystemC-based Click simulation
[Figure, part 1, ClickCRACC flow [DAC'05]: Click source and Click elements run in simulation/execution on the Click engine with Linux/OS auxiliaries. The netlist and element configuration, together with CRACC elements and target auxiliaries, are cross-compiled into an executable for the embedded processor(s), which is profiled to characterize the software elements.]
[Figure, part 2, ClickSystemC flow: Click source with function and platform-mapping annotation, Click elements, and CRACC elements are translated into generated SystemC and compiled (sc_compile). Function comes from the generated model and timing from the performance database {Ti, Rj} for performance evaluation.]
• Timing-precise, functionally correct simulation
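One way to picture the performance database {Ti, Rj} is as a lookup from (task, resource) pairs to a cost, combined with a mapping that assigns each Click element to a resource. The Python sketch below assumes exactly this structure, which the slides do not spell out. The cycle counts, clock rates, and the resource names `cpu` and `cop` are invented for illustration and are not values from the talk.

```python
def chain_delay_us(chain, mapping, perf_db, clock_mhz):
    """Sum the annotated delay of a task chain under a given mapping.

    chain     -- element names in execution order
    mapping   -- element name -> resource name
    perf_db   -- (element name, resource name) -> cycles (hypothetical layout)
    clock_mhz -- resource name -> clock frequency in MHz
    """
    total_us = 0.0
    for element in chain:
        resource = mapping[element]
        cycles = perf_db[(element, resource)]
        total_us += cycles / clock_mhz[resource]  # cycles / MHz = microseconds
    return total_us


# Hypothetical profile: the same element costs different cycles per resource.
PERF_DB = {
    ("WifiEncap", "cpu"): 200,
    ("WifiFragment", "cpu"): 300,
    ("SetCRC32", "cpu"): 4000,
    ("SetCRC32", "cop"): 100,
}
CLOCK_MHZ = {"cpu": 400, "cop": 200}
CHAIN = ["WifiEncap", "WifiFragment", "SetCRC32"]

# Alternative 1: everything in software on the CPU.
all_software = chain_delay_us(CHAIN, {e: "cpu" for e in CHAIN}, PERF_DB, CLOCK_MHZ)

# Alternative 2: CRC moved to a coprocessor, the rest stays on the CPU.
with_coprocessor = chain_delay_us(
    CHAIN,
    {"WifiEncap": "cpu", "WifiFragment": "cpu", "SetCRC32": "cop"},
    PERF_DB,
    CLOCK_MHZ,
)
```

Keeping the mapping separate from the profile means a design alternative is evaluated by changing one dictionary, without touching the functional model.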
SystemClick Wrappers for Packet IO and Timers
[Figure: Click task chains A, B, C (run, push, pull) are wrapped by FromSysC, ToSysC, and Timer elements at the boundary between Click and SystemC. Each wrapper locks and unlocks the resource it is mapped to, and the task chains read their timing from the performance database.]
Pseudocode of the push wrapper:
    wrapper_push() {                          // sc_thread
      while ( in_port.avail() ) {
        m_delay = 0;
        rm->lock( id );                       // blocking
        in_port.nb_read( p );
        update( &m_delay, os_pre );           // OS overhead annotation
        push( p, &m_delay );                  // run task chain
        wait( m_delay );                      // synchronize
        rm->unlock( id );
      }
      wait();
    }
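The effect of the wrapper's lock, delay, and unlock sequence on response time can be sketched without SystemC. The Python function below assumes a single shared resource that serves packets in arrival order, and it treats the OS overhead and the per-packet task chain delays as given inputs. All names and numbers are hypothetical; the actual wrapper runs as a SystemC thread, not as a plain loop.

```python
def simulate_shared_resource(arrivals, os_pre_us):
    """Replay packets through one resource that is locked per task chain.

    arrivals  -- list of (arrival_time_us, task_chain_delay_us), sorted by arrival
    os_pre_us -- fixed OS overhead annotated before each task chain
    Returns the response time (finish - arrival) of every packet.
    """
    free_at = 0.0  # time at which the resource is unlocked again
    response_times = []
    for arrival, chain_delay in arrivals:
        start = max(arrival, free_at)    # lock(): block while the resource is busy
        delay = os_pre_us + chain_delay  # accumulated delay annotation (m_delay)
        finish = start + delay           # wait(m_delay): advance simulated time
        free_at = finish                 # unlock()
        response_times.append(finish - arrival)
    return response_times


# Two packets arrive 1 us apart; the second one has to wait for the resource.
times = simulate_shared_resource([(0.0, 3.0), (1.0, 3.0)], os_pre_us=0.5)
```

In this run the first packet completes after 3.5 µs, while the second needs 6.0 µs because it spends 2.5 µs blocked on the lock. Contention of this kind is what spreads out a response time distribution.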
Instruction Counts per MAC Execution Path
• Excluding instructions for CRC and crypto
[Figure: bar chart of instruction counts (axis 0 to 3500) per execution path A to H, outbound and inbound, for the frame sizes Min 36 byte, IMix 330 byte, Typ 550 byte, and Max 1498 byte. Data frames: A Outbound Data, B Inbound Data. Control frames: C Outbound Acknowledge, D Inbound Acknowledge, E Outbound RTS / Inbound CTS, F Inbound RTS / Outbound CTS. Management frames: G Generate Beacon, H Receive Beacon. A second build of the slide breaks a path down by element (NAVupdate, SetDuration, DCF, Dequeue, Enqueue, WifiFragment, WifiSeq, PaintSwitch, ProbeTxRate, SetTXRate, SetRTS, WifiEncap) and carries the label 912.]
MAC Throughput vs. Packet Length, CPU Frequency
• Static analysis for back-to-back outbound data frames (the path with the most cycles)
• Pessimistic: does not consider less cycle-consuming cases (e.g. acknowledgments) or inter-frame gaps
[Figure: throughput in Mb/s (axis 0 to 800) over packet length (Min [36], iMix [330], Typ [550], Max [1498]) for CPU frequencies of 400, 500, 600, 700, and 800 MHz. Data labels: 797, 758, 663, 638, 569, 556, 487, 474, 417, 379, 348, 278, 215, 87, 102, 76, 66, 65, 54, 9, 44.]
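The static bound behind a chart of this kind can be written down in a few lines. The Python sketch below assumes that each frame costs a fixed number of CPU cycles regardless of clock frequency and that frames are processed back to back. The cycle count of 12000 is a hypothetical input, not a figure taken from the slides.

```python
def mac_throughput_mbps(frame_bytes, cycles_per_frame, cpu_mhz):
    """Upper bound on throughput when frames are processed back to back.

    The CPU handles cpu_mhz / cycles_per_frame frames per microsecond,
    and bits per microsecond equals Mb/s.
    """
    return frame_bytes * 8 * cpu_mhz / cycles_per_frame


# Hypothetical cost of 12000 cycles for a 1498-byte frame.
at_800 = mac_throughput_mbps(1498, 12000, 800)
at_400 = mac_throughput_mbps(1498, 12000, 400)
```

Under these assumptions throughput scales linearly with clock frequency and with frame length, so short frames are the demanding case for a software MAC.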
Critical MAC Response Time Analysis
• Reception of a frame may require a response within SIFS time; 400 MHz CPU, CRC in hardware
[Figure: timing diagram from inbound frame to outbound response. SIFS = 16 µs, split into 12 µs RX PHY, 2 µs MAC, and 2 µs RF; further markers at 16 µs (Context) and 20 µs (Frame Data). Bars on a µs axis (0 to 28) for the response pairs DATA (B) to Acknowledge (C), RTS (F) to CTS, and CTS (E) to DATA (A). Bar labels: 1,365; 3,762; 1,482; 4,089; 4,614; 2,037; 5,679.]
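Reading the slide's breakdown of SIFS as 12 µs for the receive PHY, 2 µs for the RF, and the remainder for the MAC, the software budget can be turned into a cycle count. The Python sketch below does this arithmetic; the function name and its parameters are hypothetical, and the interpretation of the 2 µs MAC share as the software processing window is an assumption.

```python
def mac_cycle_budget(sifs_us, rx_phy_us, rf_us, cpu_mhz):
    """Cycles left for software MAC processing inside the SIFS window."""
    mac_us = sifs_us - rx_phy_us - rf_us
    if mac_us < 0:
        raise ValueError("PHY and RF delays already exceed SIFS")
    return mac_us * cpu_mhz  # microseconds * MHz = cycles


# Values from the slide: SIFS 16 us, RX PHY 12 us, RF 2 us, CPU at 400 MHz.
budget = mac_cycle_budget(sifs_us=16, rx_phy_us=12, rf_us=2, cpu_mhz=400)
```

With these values the MAC has 2 µs, or 800 cycles at 400 MHz, to produce a response.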
Response Time Distribution
• Single CPU at 400 MHz
[Figure: histogram of occurrences (log scale, 1 to 10000) over response time in µs (0 to > 100). Series: single-core ACK, single-core CTS, single-core DATA. Annotations: Frame, Context, Deadline.]
Model Overview: IEEE 802.11 a/b/g (+e) Access Point
[Figure: the access point task graph from the earlier model overview slide, now partitioned onto two processors, CPU 1 and CPU 2]
Response Time Distribution after Refinement
• Single CPU at 400 MHz; refined: dual CPU at 150/200 MHz
[Figure: histogram of occurrences (log scale, 1 to 10000) over response time in µs (0 to > 100). Series: single-core ACK, CTS, and DATA, and refined ACK, CTS, and DATA. Annotations: Frame, Context, Deadline.]
Conclusions
• A system model is crucial for the development of application-specific architectures
• Captures function and requirements; modular, hardware-independent, and executable
• The Click framework is a natural fit for 802.11 wireless MAC protocols
• SystemClick enables performance simulation of Click models in SystemC
• Quantitative performance evaluation for early design exploration
• Exact timing, full system function, resource sharing and arbitration effects
• Programmable IEEE 802.11 MAC platform
• Control frame processing and protocol states can be handled in software!
• A coprocessor is required for CRC and beneficial for security
• Next steps include
• Apply to 11n applications (more complex protocol processing)
• Improve tool performance (currently 2-3 orders of magnitude faster than an instruction set simulator, ISS)