Advanced Processor Technologies group overview
APT mission “To explore novel architectures and techniques that will enable the effective exploitation of the billion transistor chips of the near-future”
APT group • Focus: • Moore’s Law will soon deliver billion transistor chips • how do we make best use of a billion transistors? • parallel processing • systems-on-chip • novel architectures • …?
Strategy/Vision • Industry shift to multicore processors • directly addressed by our CMP work • Power/heat is performance-limiting • asynchronous and low-power design have growing importance • Timing closure is a critical problem • acceptance of mixed timing and GALS • Design automation is vital • async automation must be competitive
Strategy/Vision • Can university groups design state-of-the-art digital silicon? • probably not in conventional processors • few academic groups still fab digital chips • Is trying to take designs through to fabrication still a good idea? • we believe so, because ‘reality’ matters! • but the game is very tough indeed
Many-core Architecture and Software Mikel Lujan
Buying a single-core processor is difficult! Multi-cores bring fundamental changes for Computer Science [applications, programming languages, compilers, runtime systems (OS), computer architecture]
Active projects • Managed Runtime Environments and Low-Power Many-core Architectures • DOME Delaying and Overcoming Microprocessor Errors • Teraflux • On the search for a “good” parallel computational model • AXLE • Accelerating Analytics of Big Data
Managed Runtime Environments • Java and .NET are examples of managed runtime environments (JVM, CLR) • Key elements: JIT compilation and control of memory allocation • Research opportunities: • Scaling MREs for many-core architectures (GPUs) • Hardware acceleration of MREs • Use MREs for low-power computing • Use MREs for dealing with faults and transistor wearout -> DOME
TeraFlux Project • Major focus of current ‘General Purpose’ Many-Core research. • Three major goals • To define the hardware architecture of a highly extensible, general-purpose multi-core system • To develop a simple-to-use parallel programming approach based on side-effect-free computations + transactions • How do we simulate/prototype many-core architectures?
Starting Assumptions • Requiring strongly consistent shared memory is a major impediment to extensibility • The efficient scheduling of control-flow based threads is hard • The major complexity in parallel programming is the handling of shared state (locks etc.)
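The programming approach named in the TeraFlux goals (side-effect-free computations plus transactions) can be illustrated with a small sketch. This is not the TeraFlux programming model or API; `VersionedCell`, `atomically_add` and `sum_of_squares` are hypothetical names, and the "transaction" here is a simple optimistic read/compute/commit loop that retries on conflict.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def square(x: int) -> int:
    """Side-effect-free: the result depends only on the argument."""
    return x * x


class VersionedCell:
    """Hypothetical shared cell that is updated by optimistic transactions."""

    def __init__(self, value: int = 0) -> None:
        self._value = value
        self._version = 0
        self._lock = threading.Lock()  # guards only the short read/commit steps

    def read(self):
        """Return a consistent (value, version) snapshot."""
        with self._lock:
            return self._value, self._version

    def try_commit(self, new_value: int, expected_version: int) -> bool:
        """Commit only if nobody else committed since the snapshot was taken."""
        with self._lock:
            if self._version != expected_version:
                return False
            self._value = new_value
            self._version += 1
            return True


def atomically_add(cell: VersionedCell, amount: int) -> None:
    """Transaction: snapshot, compute, commit; retry if a conflict is detected."""
    while True:
        value, version = cell.read()
        if cell.try_commit(value + amount, version):
            return


def sum_of_squares(values) -> int:
    """Run pure computations in parallel; touch shared state only transactionally."""
    total = VersionedCell()

    def task(x: int) -> None:
        atomically_add(total, square(x))

    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(task, values))  # consume the iterator so errors surface
    return total.read()[0]
```

The pure function needs no coordination at all; only the final update to shared state goes through the commit step, which is the separation the starting assumptions argue for.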
Simulate/Prototype many-core architectures • Designing a chip is expensive and time consuming • Computer architects build software models to simulate new architectures • Simulation can be slow (months to run one application) • How can we accelerate this process? Research opportunities • New modelling techniques • FPGA prototyping
AXLE & Big Data • Collaboration with Dr. Gavin Brown (MLO group) • Amount of data generated in scientific experiments or social web keeps growing! • Graph-based data -> complex computation • How can we make sense of this data deluge? • New Learning techniques capable of working at scale • Redesign architectures (clusters/data centres) and software for low power analytics • Accelerate software (JIT adaptation) for data processing • Hardware acceleration for low-power learning algorithms
For more background info • "Future Multi-core Computing" (COMP6062b) • Learn by directed reading and group discussions of research papers • Practice parallel programming in the labs • Watch out for the organised ARM & Intel school seminars in Nov and Dec
Communication Architectures Javier Navaridas
Interconnection Networks • On-chip networks • Tile-based systems • Heterogeneous systems • High performance computing networks • Massively Parallel Processing systems • Compute Clusters • Datacentres
Topics • Topologies • Routing • Wiring • Fault resilience • Deadlock avoidance • Router microarchitecture • Congestion control • Quality of Service • Fault tolerance • Scheduling and resource management • Task placement • System and workload modelling • Analytical modelling • Simulation
Alasdair Rawsthorne Virtualization
Unifying System and Process Virtualization • Potential benefits: performance, power, design time, security • Impacts design of future compilers, OS, CPU and runtimes • Contact: alasdair.rawsthorne@manchester.ac.uk [Figure: four software stacks, each with Application, Operating System and CPU layers. Unvirtualized; System Virtualization (e.g. Xen, VMware, VirtualBox), adding a Hypervisor/VMM; Process Virtualization (e.g. JVM, Rosetta, DynamoRIO, Valgrind), adding a Dynamic Runtime; Unified Virtualization, adding an Optimizing VMM.]
Neural Systems Engineering Steve Furber, Jim Garside, Dave Lester
The SpiNNaker project • Multi-core CPU node • 18 ARM968 processors • to model large-scale systems of spiking neurons in biological real time • Scalable up to systems with 10,000s of nodes • over a million processors • >10^8 MIPS total
Current status… • Full 18-core chip: arrived 20 May 2011 • Test card: 4 chips, 72 processors • Cards can be linked together • Neuron models: LIF, Izhikevich, MLP • Synapse models: STDP, NMDA • Networks: PyNN -> SpiNNaker, various small tools to build router tables, etc. • 48-chip 10^3 machine …and the next steps: • 500-chip 10^4 machine (Q4 2012), 5,000-chip 10^5 machine (H1 2013), 50,000-chip 10^6 machine (H2 2013).
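The machine names appear to track the order of magnitude of the processor count: 48 chips at 18 cores each give 864 processors, roughly 10^3. A quick check of the figures on this slide, assuming every chip contributes all 18 cores (the dictionary keys are just labels chosen for this sketch):

```python
# Cores per SpiNNaker chip, as stated on the slides.
CORES_PER_CHIP = 18

# Chip counts taken from the status and roadmap bullets.
machines = {
    "test card": 4,
    "10^3 machine": 48,
    "10^4 machine": 500,
    "10^5 machine": 5_000,
    "10^6 machine": 50_000,
}


def total_cores(chips: int) -> int:
    """Processor count for a machine built from `chips` chips."""
    return chips * CORES_PER_CHIP


for name, chips in machines.items():
    print(f"{name}: {chips} chips -> {total_cores(chips)} cores")
```

The test card works out to the 72 processors quoted above, and the largest planned machine to 900,000, which is of the order of the "over a million processors" target.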
PhD projects • Recent: • SpiNNaker monitoring • PyNN -> SpiNNaker • Real-time neural learning algorithms • Modelling the rat barrel cortex • Technology scaling on SpiNNaker • Error correction with CRC
Technology Scaling • 90nm SpiNNaker CPU node • SP library is faster • requires 128k DTCM • LL library better overall? • (work by Eustace Painkras, UoM PhD)
PyNN -> SpiNN • LIF • Izhikevich
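For readers unfamiliar with the two neuron models named here, the sketch below integrates each with a simple forward-Euler step. It is a textbook-style illustration, not the SpiNNaker or PyNN implementation. The LIF parameters are assumed values; the Izhikevich parameters are the commonly quoted "regular spiking" set. Units are mV and ms.

```python
def simulate_lif(current_na, duration_ms=200.0, dt_ms=0.1,
                 tau_ms=20.0, r_mohm=10.0,
                 v_rest=-65.0, v_thresh=-50.0, v_reset=-65.0):
    """Leaky integrate-and-fire: tau * dv/dt = -(v - v_rest) + R * I.

    Returns the list of spike times in ms. All parameters are assumed values.
    """
    v = v_rest
    spikes = []
    for step in range(int(duration_ms / dt_ms)):
        v += dt_ms / tau_ms * (-(v - v_rest) + r_mohm * current_na)
        if v >= v_thresh:          # threshold crossed: emit a spike and reset
            spikes.append(step * dt_ms)
            v = v_reset
    return spikes


def simulate_izhikevich(current, duration_ms=500.0, dt_ms=0.5,
                        a=0.02, b=0.2, c=-65.0, d=8.0):
    """Izhikevich model:
        dv/dt = 0.04 v^2 + 5 v + 140 - u + I
        du/dt = a (b v - u)
        if v >= 30 mV: v <- c, u <- u + d
    Returns the list of spike times in ms.
    """
    v = c
    u = b * v
    spikes = []
    for step in range(int(duration_ms / dt_ms)):
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current
        du = a * (b * v - u)
        v += dt_ms * dv
        u += dt_ms * du
        if v >= 30.0:              # spike peak reached: record and reset
            spikes.append(step * dt_ms)
            v = c
            u += d
    return spikes
```

With the assumed LIF parameters, an input of 2 nA drives the steady-state voltage above threshold and produces spikes, while 1 nA settles below threshold and produces none.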
PhD projects • Future: • System software • run-time fault-tolerance, scaling, … • SpiNNaker2 architecture exploration • Neural network models • learning algorithms, rewiring • Robotics using SpiNNaker • Non-neural algorithms • graphics, physics modelling, …
Emerging Technologies for Integrated Circuits and Systems Let’s do some hard(ware) work Vasilis Pavlidis www.cs.man.ac.uk/~pavlidiv
3-D Integration Opportunities • The same total area for the two circuits • R_TSV = 170 mΩ, C_TSV = 2 fF • RC values for 65 nm*; delay improvement: 54% • Integrate disparate technologies/components [Figure: 3-D global wire of 12 mm compared with 2-D global wire of 20 mm] * “ASU Predictive Technology Model.” [Online]. Available: http://www.eas.asu.edu/~ptm/
Three-Dimensional (3-D) Integrated Circuits and Systems • Develop design methodologies for 3-D ICs • New models are required to consider the third physical dimension • Diverse technologies • SiP, interposer, TSVs • Many challenges exist down the road! • Be the first to address them • Opportunities to tape out do exist! • CMP/Tezzaron - cmp.imag.fr • Cadence PDK - 3-D Encounter [Figure: Xilinx FPGA Virtex 7]
A New Circuit Design Paradigm (Safe Projects) • (Re-)Design and assess SpiNNaker-based 3-D architectures • Power, area, performance, cost/yield • Interposer and TSV technologies • Research methodology • Use available resources • Differentiate only where required • Other topics • Can resonance improve energy efficiency of GALS-based architectures? • Design for manufacturability for GALS systems 2-D/3-D • Considering process, voltage, and temperature (PVT) variations • PVT behavior is substantially different in 3-D systems • Develop/extend CAD tools for the physical design of 3-D systems • Special focus on interposer technologies
3-D Integration as a System Integration Approach (High-Return Projects) • Heterogeneous 3-D integration • Preached a lot but not explored (at all)! • Memory on logic is a single application • Develop techniques and methods for “Mix-and-Match” systems • How do you model…? • How do you evaluate…? • How do you integrate…? • How do you manufacture…? • The physical proximity of diverse systems may not come for free! • Interdisciplinary research is a prerequisite for such systems • Rather application driven
PhD Guidelines • A PhD is NOT an end in itself but a means to an end! • Persistence, Persistence, Persistence! • Manage rejection • Be there early! • Citations are worth more than publications • Presentation and writing skills
[Doug Edwards,] Jim Garside, Steve Furber, Alasdair Rawsthorne Asynchronous Logic Design Tools
Previous Projects • Balsa • world-leading public asynchronous synthesis tool • used for complete microprocessors • SEDATE • delay-insensitive datapath synthesis • GALSA • framework for heterogeneous GALS • ...
GAELS • Globally Asynchronous Elastic Logic Synthesis • modern SoCs comprise numerous, semi-autonomous subsystems • shrinking transistors have hard-to-predict variations • Address using Elastic Logic • new, delay tolerant paradigm • new project!
Jim Garside Reconfigurable Processing
Current Computing • Energy use is a problem • Software • offers processing flexibility • highly inefficient – big overheads • Hardware • limited programmability • greater efficiency • expensive to develop
A Solution? • Compile an algorithm into a mixture of hardware and software • how to partition the 'code'? • dynamic adaptation • Existing solutions tend towards static partitioning • require wide skills from developers • sacrifice potential flexibility • intolerant of differing hardware
Dynamic Reconfiguration • Keep algorithm in common 'object' format • Identify, 'compile' and run repeating sections in available hardware • Adapt to facilities of any given chip • allow for future portability
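The "identify repeating sections" step can be sketched as counting backward jumps in an execution trace. This is an illustration only, not the group's tool flow: the trace format (a list of instruction addresses) and the function name `find_hot_loops` are assumptions, and a real system would also have to distinguish loops from function returns, which appear as backward jumps too.

```python
from collections import Counter


def find_hot_loops(trace, threshold):
    """Return loop-head addresses reached by a backward jump at least
    `threshold` times, in ascending order.

    `trace` is a hypothetical list of executed instruction addresses.
    """
    back_edges = Counter()
    for prev, curr in zip(trace, trace[1:]):
        if curr < prev:            # control moved backwards: treat curr as a loop head
            back_edges[curr] += 1
    return sorted(addr for addr, count in back_edges.items() if count >= threshold)


# Example trace: a loop over addresses 1..3 repeats, then a shorter loop over 4..5.
example_trace = [0, 1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 5, 4, 5, 6]
candidates = find_hot_loops(example_trace, threshold=2)
```

Loop heads that pass the threshold would then be the candidates for compilation to whatever reconfigurable hardware the given chip offers.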
To date ... • Can identify critical loops and recompile them to hardware • using pre-existing code • Developing tool flow • Have reasonable reconfigurable hardware architecture Results • Promising – not 'earth shattering'
Future • Want: • Means of expressing algorithms allowing easy compilation into software or hardware • Extract/exploit sensible parallelism • 'fine grain' for hardware • 'coarse grain' (?) for software • Get (some of) the available speed/power efficiency
Nick Filer with help from Barry Cheetham Mobile Systems Architecture
Nick Filer • Interests: • Wireless networks of all types. Mainly: • Ad-hoc, • Voice over IP, • Sensors (data collection), • Pocket networks (e.g. mobile phones, PDAs), • Information dissemination. • Supported by: • Simulation, analysis, software generation tools. • eLearning tools for science.
Current Interest - 1 • Pocket Networks • Based on clusters of mobile users. • Person to person transport. • Which applications are useful and will work, and when and how will they work? • Voice? • Video? • Delay tolerant text messages?
Current Interest - 2 • Low power Wireless Sensor Networks • Algorithms for reduced power usage, mainly getting it low by design. • Intelligent transport/routing protocols driving low power packet routing. • Smart dust: • Current cost $100+, needs to be cheaper. • Ultra-low power (NEW): processor, memory, design. • Nano scale. E.g. for use down oil wells!
Current Interest – 3 • Hand-over in mobile wireless networks. • Pretty much a solved problem (even if not always ideal) for mobile phones. • Close to solutions for WiFi, WiMAX, Bluetooth, Zigbee etc. Still lots to learn though. • Currently a 3-layer hierarchy: infrastructure, Wide Area, Personal Area. • What happens with more layers? • Macro scale to nano scale? • Fixed infrastructure interacting with mobile autonomous agents? • Just how inefficient are these mechanisms currently?
Current Interest - 4 • Information dissemination in mobile ad-hoc networks. • P2P technologies. • P2P optimization for task, availability, handover, low energy, access latency… • P2P to aid DNS-like queries (information retrieval) in mobile, changing topology networks. • Delay tolerant P2P. Opportunistic communications e.g. send 100,000 sensors down an oil well, get 1 back, what does it know? Own data, others' data?
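The oil-well question (one sensor comes back; what does it know?) can be explored with a toy simulation of opportunistic exchange. The sketch assumes the simplest possible policy, in which two sensors that meet copy everything they hold to each other; the random contact pattern, the function name `simulate_contacts` and all numbers are hypothetical.

```python
import random


def simulate_contacts(num_sensors, num_contacts, seed=0):
    """Simulate pairwise opportunistic contacts between sensors.

    Each sensor starts knowing only its own reading (represented by its id).
    On every contact the two sensors merge what they know.
    Returns a list of sets: knowledge[i] is what sensor i holds at the end.
    """
    rng = random.Random(seed)      # seeded so runs are repeatable
    knowledge = [{i} for i in range(num_sensors)]
    for _ in range(num_contacts):
        a, b = rng.sample(range(num_sensors), 2)   # two distinct sensors meet
        merged = knowledge[a] | knowledge[b]
        knowledge[a] = merged
        knowledge[b] = set(merged)                 # separate copy per sensor
    return knowledge


# Suppose only sensor 0 is recovered: how many readings does it carry?
recovered = simulate_contacts(num_sensors=100, num_contacts=500, seed=1)[0]
coverage = len(recovered) / 100
```

A recovered sensor always holds its own data; how much of the others' data it holds depends on contact frequency, storage limits and energy budget, which are exactly the trade-offs the P2P optimization bullets refer to.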
Joint with Barry Cheetham Current Interest - 5 • Real time distributed systems (sound and video) • Internet choir • Very tight audio constraints (max 50ms) • Demands of latency & bandwidth • Singing together • Less constrained internet choir but synchronization very difficult. • Broadcast simulcasts • Mixed video and sound from various locations. • Broadcast over multiple media types with different delay etc. characteristics. • Major Obstacles: • Media types and standards, protocols, congestion, error handling, signal processing, links to hand-over problems ....
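To see why the 50 ms limit is tight, a rough latency budget helps. The sketch below treats the limit as a one-way budget (the slide does not say whether it is one-way or round-trip) and adds up three contributions: audio buffering at capture and playback, propagation in optical fibre at roughly 200 km per millisecond, and an assumed queuing delay. All example numbers are hypothetical.

```python
FIBRE_KM_PER_MS = 200.0   # light in fibre travels at roughly 2/3 of c
BUDGET_MS = 50.0          # limit quoted on the slide, assumed here to be one-way


def one_way_latency_ms(distance_km, frame_samples, sample_rate_hz, queuing_ms):
    """Estimate one-way audio latency in milliseconds.

    Assumes one audio frame is buffered at capture and one at playback.
    """
    buffering_ms = 2 * frame_samples / sample_rate_hz * 1000.0
    propagation_ms = distance_km / FIBRE_KM_PER_MS
    return buffering_ms + propagation_ms + queuing_ms


def fits_budget(latency_ms, budget_ms=BUDGET_MS):
    """True if the estimated latency is within the budget."""
    return latency_ms <= budget_ms


# Hypothetical link: 3000 km apart, 128-sample frames at 48 kHz, 20 ms of queuing.
estimate = one_way_latency_ms(3000.0, 128, 48_000, 20.0)
ok = fits_budget(estimate)
```

Propagation alone consumes a fixed share of the budget that no protocol can recover, which leaves buffering and congestion as the parts that design choices can actually reduce.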
Current Interest - 6 • Support for adaptable network stacks • Writing or changing software is time consuming, error prone, … • Models can capture semantics of software: Purpose, usage, transformation knowledge ... • Hence: Use models to generate implementations. • Use in teaching/learning, simulation, network stack implementation.
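A very small instance of "use models to generate implementations": the sketch below takes a declarative description of a packet header and derives an encoder and decoder from it, so changing the model changes the implementation without hand-editing parsing code. The header fields and the name `make_codec` are hypothetical; only Python's standard `struct` module is used.

```python
import struct

# Hypothetical header model: (field name, struct format code).
# "H" = unsigned 16-bit, "I" = unsigned 32-bit, "B" = unsigned 8-bit.
HEADER_MODEL = [
    ("src_port", "H"),
    ("dst_port", "H"),
    ("seq", "I"),
    ("flags", "B"),
]


def make_codec(model):
    """Generate (encode, decode) functions from a header model."""
    fmt = "!" + "".join(code for _, code in model)   # "!" = network byte order, no padding
    names = [name for name, _ in model]

    def encode(**fields):
        """Pack keyword arguments into bytes in the order given by the model."""
        return struct.pack(fmt, *(fields[name] for name in names))

    def decode(data):
        """Unpack bytes into a dict keyed by the model's field names."""
        return dict(zip(names, struct.unpack(fmt, data)))

    return encode, decode


encode, decode = make_codec(HEADER_MODEL)
packet = encode(src_port=1234, dst_port=80, seq=7, flags=1)
fields = decode(packet)
```

The same model could in principle also drive a simulator or teaching tool, which is the reuse the bullets above point to.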
Joint with Barry Cheetham Current Interest – 7 • eLearning for Complex Systems • Most eLearning tools you have seen are not much more than Content Management Systems. • There is currently little or no evidence they improve student grades! • We have on-going work looking at improving understanding of wireless systems. • Also interested in science teaching for awkward adolescents.