Learn about the design, architecture, and benefits of an Erlang processor, dedicated for executing Erlang code efficiently, reducing power consumption, and simplifying embedded control applications. Explore results, instruction set architecture, prototype details, and future plans.
ECOMP - an Erlang Processor • Robert Tjärnström, Ericsson Radio • Peter Lundell, Ericsson Telecom
Outline • Why an Erlang Processor • The Architecture • Run-Time System • Prototype • Results
What Is an Erlang Processor • A (micro) processor dedicated for execution of Erlang. • Executes compiled Erlang code.
Why a Dedicated Erlang Processor • Increased use of Erlang • Eliminating Performance and Power Dissipation Concerns • Low Power Important in Embedded Control • Simplify use of Erlang for Embedded Control • Eliminate cost for Real-Time Operating System • Provide run-time functionality
Power Dissipation in Processors • Factors Increasing Power Dissipation • Increasing functionality • Less efficient code • Less efficient languages • Increasing speed requirements • Factors Decreasing Power Dissipation • Lower supply voltages • Scaled down mfg. processes • Increased level of integration
Instruction Set Architecture • Optimized for Execution of Erlang Code • Function calls, return from function • Argument transfer • List operations • Register file management • Clean register file upon start of new function • No read/write-back of variables needed
Instruction Set Architecture • Supports processes • Supports local scope • Three sub-instructions in each machine instruction • Sub-instructions for garbage collection • [Figure: three words, each with a Tag field and a Value field]
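The tag/value words and the three-slot machine instruction on this slide can be pictured with a small Python model. This is only a sketch under stated assumptions: the tag names, the opcode mnemonics, and the helper `bundle` are hypothetical, since the slides give neither the actual tag set nor the instruction encoding.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Tag(Enum):
    # Hypothetical tag set; the slides do not list the real tags.
    INT = 0
    ATOM = 1
    LIST = 2  # value is the address of a list cell
    NIL = 3   # empty list


@dataclass(frozen=True)
class Word:
    """One data word: a tag that names the type, plus a value."""
    tag: Tag
    value: int


@dataclass(frozen=True)
class SubInstruction:
    opcode: str                      # hypothetical mnemonic
    operands: Tuple[int, ...] = ()


NOP = SubInstruction("nop")


@dataclass(frozen=True)
class MachineInstruction:
    """One machine instruction holding exactly three sub-instructions."""
    slots: Tuple[SubInstruction, SubInstruction, SubInstruction]


def bundle(*subs: SubInstruction) -> MachineInstruction:
    """Pack up to three sub-instructions, padding empty slots with NOPs."""
    if len(subs) > 3:
        raise ValueError("at most three sub-instructions per machine instruction")
    return MachineInstruction(tuple(subs) + (NOP,) * (3 - len(subs)))


def is_list(word: Word) -> bool:
    """A tag check of the kind basic hardware type checking could perform."""
    return word.tag in (Tag.LIST, Tag.NIL)
```

For example, `bundle(SubInstruction("move", (1, 2)))` yields an instruction whose second and third slots are NOPs, which is how an LIW-style compiler would typically fill slots it cannot use.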
Processor Architecture • Much in common with conventional architectures • RISC • LIW • Harvard • Pipelined (3-5 stages) • No complex (advanced) features • Not super-scalar • No OOO-execution or speculative execution • No branch-prediction (but will be added) • [Figure: block diagram with labels Fetch, Decode, Reg-File, Execute, Data, Program, Mem, unit]
Processor Architecture • Real-time garbage collection • GC performed concurrently in HW • Currently supports one element size • HW supported process-switching (~20 cycles) • Currently 1 process-queue, (may have more) • Clock-cycle limit for each process • (Basic type checking) • (Prepared for Multi-Threading)
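The combination of a single process queue, a clock-cycle limit per process, and a switch cost of roughly 20 cycles can be illustrated with a toy round-robin simulation. This is a sketch under assumptions, not the hardware's actual policy: the function `run`, its parameters, and the rule that a switch is only charged when a different process takes over are all hypothetical.

```python
from collections import deque


def run(processes, cycle_limit, switch_cost=20):
    """Simulate one process queue with a per-process clock-cycle limit.

    processes   -- dict mapping process name to cycles of work remaining
    cycle_limit -- maximum cycles a process may run before it is preempted
    switch_cost -- cycles charged per process switch (~20 in the slides)

    Returns (names in completion order, total cycles including switches).
    """
    queue = deque(processes.items())
    total = 0
    finished = []
    while queue:
        name, remaining = queue.popleft()
        used = min(remaining, cycle_limit)
        total += used
        remaining -= used
        if remaining:
            queue.append((name, remaining))  # preempted: back of the queue
        else:
            finished.append(name)
        # Assumption: a switch costs cycles only if another process runs next.
        if queue and queue[0][0] != name:
            total += switch_cost
    return finished, total
```

With a limit of 100 cycles, `run({"a": 150, "b": 50}, cycle_limit=100)` preempts `a` once, lets `b` finish first, and charges two switches, which shows how the cycle limit keeps a long-running process from starving a short one.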
Run-Time Functionality • Switch, Spawn, Send, Message-queue handling, Catch/Throw, Time-out • External io, Atom-handling, Registered processes • Implemented in machine code • Built-ins (e.g., element) • Standard Libraries (e.g. lists, ETS)
Prototyping • HW model of the processor (developed in Erlang) • VHDL implementation & test bench • FPGA based demonstrator (VHDL-code)
Prototyping II • PCI Board with Xilinx 40150 FPGA and 4 banks of 2 MB SRAM each • Board has PCI bridge (slows down communication)
Prototyping III • Using a PC (NT) to host the board. • Board driver routines only available for Win NT • Messaging between the Erlang host and the Erlang board is accomplished through a dynamically loadable driver (DLL). • 7 µs / message on average • The external Erlang format is used for communication between board and host. • I/O processes are running on both board and host.
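To give a feel for what travels between board and host, here is a minimal Python encoder for a small subset of the Erlang external term format. It is a sketch under assumptions: the slides do not say which term types the prototype exchanged, the `Atom` marker class is a hypothetical helper, and the older Latin-1 `ATOM_EXT` tag is used here where current Erlang releases emit UTF-8 atom tags.

```python
import struct

# Tag bytes from the Erlang external term format.
VERSION = 131
SMALL_INTEGER_EXT = 97   # 1-byte unsigned integer
INTEGER_EXT = 98         # 4-byte signed big-endian integer
ATOM_EXT = 100           # 2-byte length + Latin-1 name (older atom tag)
SMALL_TUPLE_EXT = 104    # 1-byte arity + elements
NIL_EXT = 106            # empty list
LIST_EXT = 108           # 4-byte length + elements + tail


class Atom(str):
    """Hypothetical marker type to tell atoms apart from other strings."""


def _encode(term):
    if isinstance(term, Atom):
        name = term.encode("latin-1")
        return struct.pack(">BH", ATOM_EXT, len(name)) + name
    if isinstance(term, bool):
        raise TypeError("use Atom('true') / Atom('false') for booleans")
    if isinstance(term, int):
        if 0 <= term <= 255:
            return bytes([SMALL_INTEGER_EXT, term])
        if -2**31 <= term < 2**31:
            return struct.pack(">Bi", INTEGER_EXT, term)
        raise ValueError("big integers are outside this sketch")
    if isinstance(term, tuple):
        if len(term) > 255:
            raise ValueError("large tuples are outside this sketch")
        return bytes([SMALL_TUPLE_EXT, len(term)]) + b"".join(map(_encode, term))
    if isinstance(term, list):
        if not term:
            return bytes([NIL_EXT])
        # Proper list: elements followed by a NIL tail. (Erlang itself may
        # pick the more compact STRING_EXT for lists of bytes.)
        body = b"".join(map(_encode, term))
        return struct.pack(">BI", LIST_EXT, len(term)) + body + bytes([NIL_EXT])
    raise TypeError(f"unsupported term: {term!r}")


def term_to_binary(term):
    """Encode a term with the leading version byte."""
    return bytes([VERSION]) + _encode(term)
```

Encoding `(Atom("ok"), 1)` gives ten bytes: the version byte, a tuple header with arity 2, the atom `ok`, and the small integer 1. Each value carries its own type tag on the wire, so both sides can decode a message without a shared schema.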
Performance • About 3-4 lines of machine code per Erlang line • An approximate speed-up of a factor of 30 can be seen • measured in clock cycles used • Tested a larger example • Call Control. 16 KLOC. (714 k dump) • Increasing performance while decreasing power by more than an order of magnitude
Near Future Activities • Compiler Improvements • Product integrations • Distributed control node, e.g., multi-processor execution. • Full-Scale Version. • Multi-threaded Processor. • Prepare for Silicon Implementation.