Reconfigurable Computing Architectures and Methodologies for System-on-Chip
Reiner Hartenstein, University of Kaiserslautern
November 19-20, 2001, Tampere, Finland. Monday, November 19, 10:15 - 11:00 hrs.
If you use part of it, please quote me and e-mail me to:
Downloadable “handout”
• Viewgraphs downloadable from: http://www.fpl.uni-kl.de/staff/hartenstein/lot/Tampere01.ppt
• Paper downloadable from: http://www.fpl.uni-kl.de/staff/hartenstein/lot/Tampere01.pdf
Conferences on Reconfigurable Logic
• FCCM, FPGA (founded 1992), and FPL (founded 1991 at Oxford, UK): The International Conference on Field-programmable Logic and Applications
• topic adoption by congresses: ASP-DAC, DAC, DATE, ISCAS, SPIE …
• FPL 2002, La Grande Motte (Montpellier, France), Sept. 2-4, Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier, http://www.lirmm.fr/fpl2002/
• Paper submission deadline: 15th March 2002
>> Introduction
• Introduction
• FPGA boom
• Coarse Grain Architectures
• Programming rDPAs
• Conclusions & Future Developments
http://www.uni-kl.de
The Impact of Reconfigurable Logic
• Reconfigurable platforms bring a new dimension to digital system development and have a strong impact on SoC design.
• A large, rapidly growing user base of HDL-savvy designers with FPGA experience.
• Flexibility supports turn-around times of minutes instead of months for real-time in-system debugging, profiling, verification, tuning, field maintenance, and field upgrades.
• A New Business Model (in-field debugging and upgrading ...)
• A Fundamental Paradigm Shift in Silicon Application
[Figure, after Kean: revenue per month over time in months (1 to 30) for an ASIC product, a product with download, and a reconfigurable product with Update 1 and Update 2]
The History of Paradigm Shifts: Makimoto's Wave (published in 1989)
• “Mainstream Silicon Application is switching every 10 Years”
• “The Programmable System-on-a-Chip is the next wave”
[Figure: Makimoto's Wave swinging between standard and custom in 10-year steps from 1957 to 2007; labels: TTL; LSI, MSI; µproc., memory; ASICs, accel's; reconfigurable; the 1st and 2nd Design Crisis are marked]
Paradigm Shifts: What Is the next Wave?
• hardwired: algorithm fixed, resources fixed
• procedural programming: algorithm variable, resources fixed
• structural programming: algorithm variable, resources variable
• 4th wave? rDPAs? Or no further wave!
[Figure: Makimoto's Wave (standard vs. custom, 1957 to 2007) extended with FPGAs and annotated with Tredennick's paradigm shifts and Hartenstein's Curve]
The Impact of Makimoto's Paradigm Shifts (Dr. Makimoto: FPL 2000 keynote)
• Personalization (CAD) before fabrication
• Procedural personalization via RAM-based Machine Paradigm: Software Industry's Secret of Success
• structural personalization: RAM-based, before run time
• Repeat Success Story by new Machine Paradigm!
[Figure: Makimoto's Wave, standard vs. custom, 1957 to 2007; labels: TTL; LSI, MSI; µproc., memory; ASICs, accel's; reconfigurable]
>> FPGA boom
• Introduction
• FPGA boom
• Coarse Grain Architectures (rDPAs)
• Programming rDPAs
• Conclusions & Future Developments
What is an FPGA?
• configurable logic blocks (CLBs)
• reconfigurable interconnect fabric: single-length lines, double-length lines, longlines
[Figure: Xilinx XC4000E, an array of logic blocks and switch boxes; L = Logic Block, S = Switch Box]
Top 4 PLD / FPGA Manufacturers 2000
• market shares: Xilinx 42%, Altera 37%, Lattice 15%, Actel 6%; total: $3.7 billion
Configware (soft IPs)
• "pre-fabricated" components and IP reuse for PLDs
• killing the ASIC market
• improved design flow & libraries
• PLD vendors provide libraries to support their products
• [Dataquest] > $7 billion by 2003
• fastest growing semiconductor market segment
• soon to reach 50 million system gates per chip
• FPGAs going into every type of application, also SoC
Away from complex design flow: from HDL to HLL
[Figure, after S. Guccione: the HDL flow (Schematics/HDL, Netlister, Netlist, Place and Route, Bitstream) next to the HLL flow (User Code, HLL Compiler, Executable)]
• Embedded CPU: Configware / Software Co-design is commonplace
• use the CPU for configuration management, supporting ... dynamically reconfigurable (RTR)
Configware as the Key Enabler
• Emerging separate EDA software market (comparable to the compiler / OS market in computers)
• Design productivity and quality by configware libraries (soft IP cores) from various application areas
• Xilinx AllianceCORE & Reference Design Alliance et al.
• Top FPGA vendors are currently the key innovators
• Growing number of independent configware houses (soft IP core vendors) and design services
• Cadence, Mentor, Synopsys just jumped in
>> Coarse Grain Architectures (for a detailed overview see the proceedings)
• Introduction
• FPGA boom
• Coarse Grain Architectures (rDPAs)
• Programming rDPAs
• Conclusions & Future Developments
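Why coarse-grained? Reconfigurability Overhead
[Figure: array of logic blocks (L) and switch boxes (S) contrasting the area used by the application with the resources needed for reconfigurability]
• the resources needed for reconfigurability are partly for configuration code storage
• “hidden RAM” not shown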
Commercial rDPAs
• XPU family (IP cores): PACT corp., Munich
• CALISTO: Silicon Spice*
• CS2000 family: Chameleon Systems
• MECA family: Malleable*
• flexible array: MorphICs
• ACM: Quicksilver Tech
• CHESS array: Elixent
• FIPSOC: SIDSA
*) bought
[Figure: XPU128]
KressArray Family generic Fabrics: a few examples (http://kressarray.de)
• Select Function Repertory
• Select Nearest Neighbour (NN) Interconnect: select mode, number, and width of NNports
• rDPU: route-through only, or route-through and function
• more NNports: rich Routing Resources
• Examples of 2nd Level Interconnect: laid out over the rDPU cell, no separate routing areas!
[Figure: example rDPU fabrics with numeric labels 2, 4, 8, 16, 24, 32]
KressArray Mapping Example: SNN filter (http://kressarray.de)
• array size: 10 x 16 = 160 rDPUs
• legend: route-through only, not used, backbus connect
Coarse Grain rDPAs (Reconfigurable Computing): a fundamental Paradigm Shift
It's a General Paradigm Shift!
• Using FPGAs (fine grain reconfigurable): just Logic Synthesis on a strange platform
• replace Concurrent Processes by much more efficient parallelism: Stream-based DPAs (1) and rDPAs (2)
• Kress: a generalization of systolic array synthesis: super systolic synthesis
• converging design flows
1) systolic array* [1980], KressArray** [1995]
2) chip-on-a-day* [2000] [Brodersen]
*) hardwired **) reconfigurable
terms: DPU: datapath unit; DPA: datapath array; rDPU: reconfigurable DPU; rDPA: reconfigurable DPA
Concurrent Computing
[Figure: several CPUs, each a DPU with its own instruction sequencer, connected by bus(es) or a switch box]
Extremely inefficient:
• control flow overhead
• instruction fetch / interpretation overhead
• address computation overhead, which may be massive
• massive bottleneck phenomena at run time
Stream-based Computing: (r)DPA
[Figure: array of DPUs]
• driven by data stream from/to memory or from/to peripheral interface
• no instruction sequencer inside!
• transport-triggered execution
• for both reconfigurable and hardwired [Brodersen]
• avoids run time overhead and bottleneck phenomena
• rDPA: drastically reduced reconfigurability overhead
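As a loose software analogy for the stream-based model on this slide, the following minimal Python sketch chains a few datapath units so that each one simply transforms whatever data arrives, with no instruction sequencer of its own. The names `dpu` and `stream_pipeline` are hypothetical and not from the talk. Python generators are pulled by the consumer, so the sketch only illustrates the absence of per-DPU instruction fetch; it does not model hardware timing or transport triggering.

```python
from typing import Callable, Iterable, Iterator


def dpu(op: Callable[[int], int], stream: Iterable[int]) -> Iterator[int]:
    """One datapath unit (hypothetical model): a fixed operation applied
    to every data item that arrives on its input stream."""
    for item in stream:
        yield op(item)


def stream_pipeline(source: Iterable[int], ops: list) -> Iterator[int]:
    """Chain DPUs so that the output stream of one feeds the next."""
    stream = iter(source)
    for op in ops:
        stream = dpu(op, stream)
    return stream


if __name__ == "__main__":
    memory = [1, 2, 3, 4]  # stands in for a data stream read from memory
    result = list(stream_pipeline(memory, [lambda v: v + 1, lambda v: v * 2]))
    print(result)
```

The "program" here is only the structure of the pipeline (which operations are connected in which order), which is the point the slide makes about structural rather than procedural personalization.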
>> Programming rDPAs
• Introduction
• FPGA boom
• Coarse Grain Architectures (rDPAs)
• Programming rDPAs
• Conclusions & Future Developments
The Mathematician's Synthesis Method: Systolic Stream-based Computing System
• Systolic Array [H. T. Kung, 1980]: a DPA (Data Path Array)
• from equations to placement by linear projection or algebraic mapping
• linear pipelines and uniform arrays only
• migration by re-timing and other transformations
• no routing!
• computing in space versus computing in time (systolic arrays etc.): this dichotomy is completely ignored by our CS curricula
[Figure: equations in x1..x3, y1..y3 with coefficients a11..a33, the DPU architecture (+, -, *), the resulting placement, and the data streams]
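To make the idea of a systolic, stream-driven array concrete, here is a minimal Python simulation of a linear array computing a matrix-vector product y = A x. This is an assumption for illustration: the equations on the slide are not recoverable from the transcript, and this is one of several possible systolic designs, not necessarily the one shown. The function name `systolic_matvec` is hypothetical. Cell i accumulates y[i]; the x values are shifted through the cells one step per clock tick, and the matching coefficient a[i][j] is assumed to arrive at cell i at tick i + j (a time-skewed input stream).

```python
def systolic_matvec(a, x):
    """Simulate a linear systolic array for y = A x (illustrative sketch).

    Cell i keeps the running sum for y[i]. The x stream enters at cell 0
    and moves one cell to the right per tick, so x[j] is in cell i at
    tick i + j, where it meets coefficient a[i][j].
    """
    n_rows, n_cols = len(a), len(x)
    y = [0] * n_rows
    x_regs = [None] * n_rows  # x value currently held by each cell
    for t in range(n_rows + n_cols - 1):
        # Shift the x stream by one cell and feed the next value, if any.
        head = x[t] if t < n_cols else None
        x_regs = ([head] + x_regs)[:n_rows]
        for i, xv in enumerate(x_regs):
            if xv is not None:
                j = t - i  # index of the x element now in cell i
                y[i] += a[i][j] * xv
    return y


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(systolic_matvec(a, [1, 1, 1]))
```

The whole computation takes n_rows + n_cols - 1 ticks, and no cell needs an address computation or an instruction fetch: the schedule is fixed by where the data streams enter the array.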
General Stream-based Computing System: Kress DPSS [1995]
• heterogeneous DPA or rDPA
• Mapper: simultaneous placement & routing by simulated annealing
• Scheduler: data streams
• free form pipe network
• the same mapper for both: reconfigurable or hardwired
[Figure: expression tree and DPU architectures (operators such as +, -, *, sh, xf) mapped onto the array]
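The mapper on this slide is described as using simulated annealing. The following Python sketch shows the general technique on a toy problem: placing the operators of a small expression tree on a grid of cells so that the total Manhattan wire length is small. It is a simplified illustration under stated assumptions, not the DPSS algorithm: routing is not performed (wire length only stands in for it), and the names `wire_length` and `anneal_placement`, the cooling schedule, and the default parameters are all hypothetical.

```python
import math
import random


def wire_length(placement, edges):
    """Total Manhattan distance over all connected operator pairs."""
    return sum(
        abs(placement[u][0] - placement[v][0]) + abs(placement[u][1] - placement[v][1])
        for u, v in edges
    )


def anneal_placement(nodes, edges, rows, cols, steps=2000, t0=2.0, seed=0):
    """Place nodes on a rows x cols grid by simulated annealing (sketch).

    Returns (best placement, initial cost, best cost).
    """
    rng = random.Random(seed)
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    if len(nodes) > len(cells) or len(cells) < 2:
        raise ValueError("grid too small")
    rng.shuffle(cells)
    # occupant[k] is the node sitting in cells[k], or None for a free cell.
    occupant = list(nodes) + [None] * (len(cells) - len(nodes))

    def to_placement(occ):
        return {n: cells[k] for k, n in enumerate(occ) if n is not None}

    current = initial = wire_length(to_placement(occupant), edges)
    best, best_occ = current, occupant[:]
    for step in range(steps):
        temp = t0 * (1 - step / steps) + 1e-9  # linear cooling (assumption)
        i, j = rng.sample(range(len(cells)), 2)
        occupant[i], occupant[j] = occupant[j], occupant[i]  # trial move: swap
        cost = wire_length(to_placement(occupant), edges)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cost <= current or rng.random() < math.exp((current - cost) / temp):
            current = cost
            if cost < best:
                best, best_occ = cost, occupant[:]
        else:
            occupant[i], occupant[j] = occupant[j], occupant[i]  # undo the swap
    return to_placement(best_occ), initial, best


if __name__ == "__main__":
    nodes = ["a", "x", "mul", "y", "add"]  # toy expression tree: a * x + y
    edges = [("a", "mul"), ("x", "mul"), ("mul", "add"), ("y", "add")]
    placement, initial, best = anneal_placement(nodes, edges, rows=3, cols=3)
    print(initial, best, placement)
```

Accepting occasional cost increases early on is what lets the search escape poor local arrangements; as the temperature falls, the procedure behaves more and more like plain greedy improvement.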
KressArray Xplorer: Platform (Design) Space Explorer, with KressArray DPSS (http://kressarray.de)
[Figure: tool flow with the blocks Application Set, Source Input, User, intermediate form, Architecture & Mapping Editor, DPSS, Datastream Generator, Datapath Generator, HDL Generator, Simulator, Statistics Generator, Improvement Proposal Generator, Delay & Power Estimator]
Memory Communication Architecture … (http://kressarray.de)
• Synthesizable Memory Communication Architecture [Herz]
• an example by Nageldinger's KressArray Xplorer (legend: sequencers, memory ports, application, not used)
• Optimized Parallel Memory Controller
• GAG generic sequencer methodology available
• hot research topic in embedded systems
• storage context transformations [Catthoor, Herz, Kougia, Soudris]
• startups provide memory IP or generators
... for a Stream-based Soft Machine
[Figure: blocks rDPA, Compiler, Scheduler, Sequencers (data stream generator), and Memory (data memory) made up of several memory banks]
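The slide names sequencers that act as data stream generators in front of the memory banks. As a rough illustration of what such a sequencer does, the Python sketch below generates the address sequence for a rectangular scan over a linear memory and turns it into a data stream. The functions `scan_addresses` and `stream_from_bank` and their parameters are hypothetical; the actual sequencer design is not described in the transcript.

```python
def scan_addresses(base, rows, cols, row_stride, col_stride=1):
    """Yield the addresses of a rows x cols window, row by row (sketch)."""
    for r in range(rows):
        for c in range(cols):
            yield base + r * row_stride + c * col_stride


def stream_from_bank(bank, addresses):
    """Turn an address sequence into a data stream read from one memory bank."""
    for addr in addresses:
        yield bank[addr]


if __name__ == "__main__":
    bank = list(range(100, 116))  # a 4 x 4 array stored row-major
    # Stream the 2 x 2 window whose top-left element is at address 5.
    print(list(stream_from_bank(bank, scan_addresses(5, 2, 2, row_stride=4))))
```

Because the address pattern is produced by a small dedicated generator, the datapath array itself never computes addresses, which matches the earlier point about address computation overhead.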
The Soft Machine Paradigm (University of Kaiserslautern)
• Computer (“von Neumann”): Compiler, Memory, Sequencer, Datapath (hardwired); state register: program counter; tightly coupled by compact instruction code
• Xputer: Compiler, Scheduler, Memory, multiple sequencer, Datapath Array (reconfigurable, also for hardwired [Brodersen]); state register: data counter; loosely coupled by decision data bits only
• “von Neumann” does not support soft data paths
• Computer: the wrong Machine Paradigm
• Fundamentals available (course on Wednesday)
Co-Compilation: Jürgen Becker's Co-DE-X Co-Compiler [ASP-DAC'95]
• Hardware / Software Co-Design turns to Configware / Software Co-Design
• supporting different platforms
• Computer Machine Paradigm: Software running on the µProcessor, via the GNU C compiler
• Xputer “Soft” Machine Paradigm: Configware running on the KressArray (Reconfigurable Accelerators), via the X-C compiler and DPSS
[Figure: high level programming language source (X-C) enters the partitioning compiler (Partitioner, Analyzer / Profiler, Resource Parameters), which feeds the GNU C compiler and the X-C compiler; an interface connects µProcessor and Reconfigurable Accelerators]
Loop Transformation Examples: strip mining and loop unrolling
• sequential processes: resource parameter driven Co-Compilation
[Figure: loop fragments for host and reconfigurable array: host: loop 1-16 body endloop; reconf. array: loop 1-8 trigger endloop; fork and join around loop 1-8 body endloop and loop 9-16 body endloop; loop 1-8 body body endloop; loop 1-4 trigger endloop; loop 1-2 trigger endloop]
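The two transformations named on this slide can be shown on the 16-iteration loop from the figure. The Python sketch below is a generic illustration of strip mining and loop unrolling, not the output of the co-compiler; the loop body (squaring the index) and the function names are hypothetical. All three variants perform the same 16 body executions; they differ only in how the iterations are grouped, which is what lets a compiler match the loop to the available parallel resources.

```python
def body(i, out):
    """Stand-in loop body (hypothetical): record the square of the index."""
    out.append(i * i)


def original():
    out = []
    for i in range(1, 17):  # loop 1-16
        body(i, out)
    return out


def strip_mined(strip=8):
    """Split the iteration space into strips (1-8 and 9-16 for strip=8).
    The strips are independent here and could run on parallel resources."""
    strips = []
    for start in range(1, 17, strip):
        out = []
        for i in range(start, min(start + strip, 17)):
            body(i, out)
        strips.append(out)
    return [v for s in strips for v in s]  # join the strip results in order


def unrolled():
    """Unroll by a factor of 2: 8 iterations, each executing the body twice."""
    out = []
    for i in range(1, 17, 2):
        body(i, out)
        body(i + 1, out)
    return out


if __name__ == "__main__":
    print(original() == strip_mined() == unrolled())
```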
>> Conclusions
• Introduction
• FPGA boom
• Coarse Grain Architectures (rDPAs)
• Programming rDPAs
• Conclusions & Future Developments
FPGA CPUs
[Figure: HLL Compiler targeting an FPGA with soft CPU and Memory core]
• Gray Research
• Georgia Tech
• Michigan State
• Virginia Tech
• New Mexico Tech
• UC Riverside
academic FPGA CPUs
• UCSC: 1990!
• Mälardalen University, Eskilstuna, Sweden
• Chalmers University, Göteborg, Sweden
• Cornell University
• Hiroshima City University, Japan
• Tokai University, Japan
• Universidad de Valladolid, Spain
• Washington University, St. Louis
Soft rDPA?
[Figure: HLL Compiler targeting Memory, soft CPU, soft DPU array, and miscellaneous]
• Rapid technology progress
• 50 million system gates soon
• FPGAs for relocatable configware code?
• Compatibility at configuration code level?
• Slower clock: compensated by more parallelism
• Even large rDPAs as a soft IP become feasible
• By >2005: don't care about area efficiency?
Main problems to be solved
Most successful µprocessor:
• most software written for it
• object code compatibility
• widely accepted OS & tools
• relocatable code
• scalable memory
Dominant FPGA vendor needs:
• most configware written for it
• configware object code compatibility
• widely accepted „OS“ & tools
FPGA-based de facto Standards:
• scalable FPGA architectures supporting relocatable configuration code
• configware code compatibility by de facto standard RC platform family
• de facto standard configware libraries
Education:
• widely spread awareness of the dichotomy (computing in space vs. computing in time, systolic arrays etc.) and of FPGAs
• compilers to avoid needing HDL-savvy users
• curricular innovations are urgently needed
However, current CS Education … is based on the Submarine Model
[Figure: the Algorithm is implemented in Software (procedural high level Programming Language, Assembly Language); the Hardware is invisible: under the surface]
• Brain usage: procedural-only
• This model disables ...
• Software Faculty Colleagues shy away from the Paradigm Shift: their Brain hurts? Can't be: this Half has been amputated
Hardware and Software as Alternatives
• Brain Usage: both Hemispheres (procedural and structural)
[Figure: partitioning of an Algorithm into Software only, Software & Hardw/Configw, or Hardw/Configw only; platforms: Software and Hardware, Configware]
The Dominance of the Submarine Model ...
• ... indicates that our CS education system produces zillions of mentally disabled Persons
• (procedural) structurally disabled … completely disabled to cope with solutions other than software only
• It's time to attack the software faculty dictatorship. Get involved!
>>> Thank you for listening