
Multicore Applications Team

Multicore Applications Team. KeyStone C66x Multicore SoC Overview. Topics: KeyStone Architecture; CorePac & Memory Subsystem; Internal Communications and Transport; External Interfaces; Coprocessors and Accelerators; Debug; Miscellaneous; Application- and Device-specific.

wilton

Presentation Transcript


  1. Multicore Applications Team KeyStone C66x Multicore SoC Overview

  2. KeyStone Overview
     • KeyStone Architecture
     • CorePac & Memory Subsystem
     • Internal Communications and Transport
     • External Interfaces
     • Coprocessors and Accelerators
     • Debug
     • Miscellaneous
     • Application- and Device-specific

  3. Enhanced DSP Core
     • C66x ISA: performance improvement
       • 100% upward object code compatible
       • 4x performance improvement for multiply operation
       • 32 16-bit MACs
       • Improved support for complex arithmetic and matrix computation
     • C674x
       • 100% upward object code compatible with C64x, C64x+, C67x and C67x+
       • Best of fixed-point and floating-point architecture for better system performance and faster time-to-market
     • C67x+
       • 2x registers
       • Enhanced floating-point add capabilities
     • C64x+
       • SPLOOP and 16-bit instructions for smaller code size
       • Flexible level one memory architecture
       • iDMA for rapid data transfers between local memories
     • C67x (floating-point value)
       • Native instructions for IEEE 754, SP & DP
       • Advanced VLIW architecture
     • C64x (fixed-point value)
       • Advanced fixed-point instructions
       • Four 16-bit or eight 8-bit MACs
       • Two-level cache
     Preliminary Information under NDA - subject to change

  4. KeyStone Device Architecture
     [Block diagram: Application-Specific Coprocessors, Memory Subsystem, C66x™ CorePac, Miscellaneous, TeraNet, HyperLink, Multicore Navigator, External Interfaces, Network Coprocessor]

  5. CorePac
     • 1 to 8 C66x CorePac DSP Cores operating at up to 1.25 GHz
       • Fixed- and floating-point operations
       • Code compatible with other C64x+ and C67x+ devices
     • L1 Memory
       • Can be partitioned as cache and/or RAM
       • 32KB L1P per core
       • 32KB L1D per core
       • Error detection for L1P
       • Memory protection
     • Dedicated L2 Memory
       • Can be partitioned as cache and/or RAM
       • 512 KB to 1 MB Local L2 per core
       • Error detection and correction for all L2 memory
     • Direct connection to memory subsystem
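As a rough illustration of what the core count and clock rate imply, the following sketch multiplies the "32 16-bit MACs" figure from the Enhanced DSP Core slide by the "up to 1.25 GHz" clock quoted here. It assumes the MAC figure is per core per cycle and that every cycle issues the full complement, so the result is a theoretical ceiling and not a measured number. The function and constant names are illustrative only.

```python
# Back-of-the-envelope peak MAC rate from figures quoted in the slides.
# Assumption: "32 16-bit MACs" is a per-core, per-cycle figure.
MACS_PER_CYCLE = 32      # 16-bit MACs (Enhanced DSP Core slide)
MAX_CLOCK_HZ = 1.25e9    # "up to 1.25 GHz" (CorePac slide)


def peak_gmacs(cores: int, clock_hz: float = MAX_CLOCK_HZ) -> float:
    """Theoretical peak 16-bit MAC rate, in GMAC/s, for `cores` CorePacs."""
    if not 1 <= cores <= 8:
        raise ValueError("this overview describes devices with 1 to 8 cores")
    return cores * MACS_PER_CYCLE * clock_hz / 1e9


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(f"{n} core(s): {peak_gmacs(n):.0f} GMAC/s peak")
```

Under these assumptions a single core tops out at 40 GMAC/s and an eight-core device at 320 GMAC/s; sustained throughput depends on memory behavior and instruction scheduling.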

  6. Memory Subsystem
     • Multicore Shared Memory (MSM SRAM)
       • 1 to 4 MB
       • Available to all cores
       • Can contain program and data
       • All devices except C6654
     • Multicore Shared Memory Controller (MSMC)
       • Arbitrates access of CorePac and SoC masters to shared memory
       • Provides a connection to the DDR3 EMIF
       • Provides CorePac access to coprocessors and IO peripherals
       • Provides error detection and correction for all shared memory
       • Memory protection and address extension to 64 GB (36 bits)
       • Provides multi-stream pre-fetching capability
     • DDR3 External Memory Interface (EMIF)
       • Support for 16-bit, 32-bit, and (for C667x devices) 64-bit modes
       • Specified at up to 1600 MT/s
       • Supports power down of unused pins when using 16-bit or 32-bit width
       • Support for an 8 GB memory address space
       • Error detection and correction
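The two numeric claims on this slide can be checked with simple arithmetic. The sketch below confirms that a 36-bit address reaches 64 GB and estimates the theoretical peak DDR3 transfer rate at 1600 MT/s for each supported bus width. The helper names are made up for illustration, and the bandwidth figures ignore refresh, bank conflicts, and ECC overhead.

```python
# Arithmetic behind the address-extension and DDR3 figures on this slide.
GIB = 1 << 30  # bytes in one binary gigabyte


def addressable_bytes(address_bits: int) -> int:
    """Number of bytes reachable with an address of the given width."""
    return 1 << address_bits


def ddr3_peak_bytes_per_s(transfers_per_s: int, bus_width_bits: int) -> int:
    """Theoretical peak: one bus-width word moved per transfer."""
    return transfers_per_s * bus_width_bits // 8


if __name__ == "__main__":
    print(addressable_bytes(36) // GIB, "GB reachable with 36 address bits")
    for width in (16, 32, 64):
        rate = ddr3_peak_bytes_per_s(1_600_000_000, width)
        print(f"{width}-bit bus at 1600 MT/s: {rate / 1e9:.1f} GB/s peak")
```

The same helper also shows that the 8 GB figure quoted for the DDR3 interface corresponds to 33 address bits.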

  7. Multicore Navigator
     • Provides seamless inter-core communications (messages and data exchanges) between cores, IP, and peripherals ("fire and forget")
     • Low-overhead processing and routing of packet traffic to and from peripherals and cores
     • Supports dynamic load optimization
     • Data transfer architecture designed to minimize host interaction while maximizing memory and bus efficiency
     • Consists of a Queue Manager Subsystem (QMSS) and multiple, dedicated Packet DMA engines
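The "fire and forget" idea can be pictured with a toy software model: a sender pushes a descriptor onto a transmit queue and moves on, and a separate engine later drains that queue and delivers each descriptor to its destination queue. The sketch below is only a conceptual model written for this transcript. `Descriptor`, `ToyQueueManager`, and `toy_packet_dma` are hypothetical names, not TI APIs, and the real QMSS and Packet DMA are hardware blocks with their own descriptor formats.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Descriptor:
    """Hypothetical descriptor: a payload plus the queue it should land on."""
    payload: bytes
    dest_queue: int


class ToyQueueManager:
    """Toy stand-in for a queue manager: a fixed set of FIFO queues."""

    def __init__(self, num_queues: int) -> None:
        self._queues = [deque() for _ in range(num_queues)]

    def push(self, queue_id: int, desc: Descriptor) -> None:
        # The sender returns immediately after the push ("fire and forget").
        self._queues[queue_id].append(desc)

    def pop(self, queue_id: int) -> Optional[Descriptor]:
        queue = self._queues[queue_id]
        return queue.popleft() if queue else None

    def depth(self, queue_id: int) -> int:
        return len(self._queues[queue_id])


def toy_packet_dma(qm: ToyQueueManager, tx_queue: int) -> int:
    """Move everything currently on tx_queue to each descriptor's destination."""
    moved = 0
    for _ in range(qm.depth(tx_queue)):  # bounded by the depth at entry
        desc = qm.pop(tx_queue)
        qm.push(desc.dest_queue, desc)
        moved += 1
    return moved
```

In this model the sending code never waits for the receiver: it calls `push` and continues, while `toy_packet_dma` plays the role of the engine that moves data without further involvement from the sender.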

  8. Multicore Navigator Architecture

  9. Network Coprocessor (C667x)
     • Provides hardware accelerators to perform L2, L3, and L4 processing and encryption that was previously done in software
     • Packet Accelerator (PA)
       • 8K multiple-in, multiple-out HW queues
       • Single IP address option
       • UDP (and TCP) checksum and selected CRCs
       • L2/L3/L4 support
       • Quality of Service (QoS)
       • Multicast to multiple queues
       • Timestamps
     • Security Accelerator (SA)
       • Hardware encryption, decryption, and authentication
       • Supports IPsec ESP, IPsec AH, SRTP, and 3GPP protocols
     [Block diagram: Network Coprocessor containing the Security Accelerator, Packet Accelerator, Ethernet Switch, and 2x SGMII]
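To make the checksum offload concrete, the sketch below shows the 16-bit one's-complement Internet checksum (RFC 1071) that UDP and TCP are built on, which is the kind of work software would otherwise perform per packet. It is a plain software reference, not a description of how the Packet Accelerator implements it, and it covers only the core sum: a real UDP or TCP checksum is also computed over a pseudo-header.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over `data` (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte

    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit words

    # Fold any carries above bit 15 back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    return ~total & 0xFFFF


if __name__ == "__main__":
    sample = bytes.fromhex("0001f203f4f5f6f7")  # example words from RFC 1071
    print(hex(internet_checksum(sample)))
```

A receiver can validate a packet by running the same function over the data with the checksum appended; a result of zero indicates the sum is consistent.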

  10. External Interfaces
     • 2x SGMII ports support 10/100/1000 Ethernet
     • 4x high-bandwidth Serial RapidIO (SRIO) lanes for inter-DSP applications
     • SPI for boot operations
     • UART for development/testing
     • 2x PCIe at 5 Gbps
     • I2C for EPROM at 400 Kbps
     • GPIO
     • Device-specific Interfaces
       • Wireless Applications
       • General Purpose Applications

  11. TeraNet Switch Fabric
     • A non-blocking switch fabric that enables fast and contention-free internal data movement
     • Provides a configured way, within hardware, to manage traffic queues and ensure priority jobs are getting accomplished while minimizing the involvement of the CorePac cores
     • Facilitates high-bandwidth communications between CorePac cores, subsystems, peripherals, and memory

  12. TeraNet Data Connections
     • Facilitates high-bandwidth communication links between DSP cores, subsystems, peripherals, and memories.
     • Supports parallel orthogonal communication links
     [Diagram: two switch fabrics, a 256-bit TeraNet clocked at CPUCLK/2 and a 128-bit TeraNet clocked at CPUCLK/3, connecting masters (M) and slaves (S). Labeled endpoints: DebugSS, HyperLink, MSMC, DDR3, Shared L2, XMC, Cores, L2 0-3, SRIO, Network Coprocessor, EDMA_0 (TPCC, 16ch QDMA), EDMA_1,2 (TPCC, 64ch QDMA), TC0 to TC9, TCP3e_W/R, TCP3d, TAC_FE, TAC_BE, RAC_FE, RAC_BE0,1, FFTC / PktDMA, VCP2 (x4), AIF / PktDMA, QMSS, PCIe]
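The two fabric labels in the diagram (256-bit at CPUCLK/2 and 128-bit at CPUCLK/3) translate into a theoretical per-connection transfer rate once a CPU clock is chosen. The sketch below does that arithmetic for an assumed 1.0 GHz CPU clock; the clock value and the function name are illustrative assumptions, and the result ignores arbitration and protocol overhead.

```python
def teranet_peak_bytes_per_s(cpu_clock_hz: float, divider: int,
                             width_bits: int) -> float:
    """Theoretical peak of one connection: bus width moved once per fabric clock."""
    fabric_clock_hz = cpu_clock_hz / divider
    return fabric_clock_hz * width_bits / 8


if __name__ == "__main__":
    cpu_clock_hz = 1.0e9  # assumed CPU clock, chosen for illustration
    fast = teranet_peak_bytes_per_s(cpu_clock_hz, 2, 256)
    slow = teranet_peak_bytes_per_s(cpu_clock_hz, 3, 128)
    print(f"CPUCLK/2, 256-bit: {fast / 1e9:.2f} GB/s")
    print(f"CPUCLK/3, 128-bit: {slow / 1e9:.2f} GB/s")
```

At the assumed clock this works out to 16 GB/s for the 256-bit fabric and about 5.33 GB/s for the 128-bit fabric, per connection.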

  13. Diagnostic Enhancements
     • Embedded Trace Buffers (ETB) enhance the diagnostic capabilities of the CorePac.
     • CP Monitor enables diagnostic capabilities on data traffic through the TeraNet switch fabric.
       • Automatic statistics collection and exporting (non-intrusive)
       • Monitor individual events for better debugging
       • Monitor transactions to both memory end point and Memory-Mapped Registers (MMR)
       • Configurable monitor filtering capability based on address and transaction type

  14. HyperLink Bus
     • Provides the capability to expand the device to include hardware acceleration or other auxiliary processors
     • Supports four lanes with up to 12.5 Gbaud per lane
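The lane count and per-lane rate give the raw aggregate signaling rate directly. The sketch below computes it and, separately, shows how a line-encoding efficiency would reduce the usable payload rate. The slide does not state the encoding, so `encoding_efficiency` is a placeholder parameter and the 0.8 used in the demo is purely an example value.

```python
def hyperlink_raw_gbaud(lanes: int = 4, gbaud_per_lane: float = 12.5) -> float:
    """Aggregate raw signaling rate across all lanes, in Gbaud."""
    return lanes * gbaud_per_lane


def payload_gbps(raw_gbaud: float, encoding_efficiency: float) -> float:
    """Usable bit rate after line-encoding overhead.

    encoding_efficiency is an assumption supplied by the caller
    (payload bits divided by transmitted symbols); it is not given on the slide.
    """
    if not 0.0 < encoding_efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return raw_gbaud * encoding_efficiency


if __name__ == "__main__":
    raw = hyperlink_raw_gbaud()
    print(f"raw aggregate: {raw} Gbaud")
    print(f"payload at an example efficiency of 0.8: {payload_gbps(raw, 0.8)} Gbit/s")
```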

  15. Miscellaneous Elements
     • Boot ROM
     • Semaphore module provides atomic access to shared chip-level resources.
     • Power Management
     • Three on-chip PLLs:
       • PLL1 for CorePacs, except
       • PLL2 for DDR3
       • PLL3 for Packet Acceleration
     • Three EDMA controllers
     • Eight 64-bit timers
     • Inter-Processor Communication (IPC) Registers

  16. Device-Specific: C6670 for Wireless Apps
     • Device-specific Coprocessors:
       • 2x FFT Coprocessor (FFTC)
       • Turbo Decoder/Encoder Coprocessor (TCP3d/3e)
       • 4x Viterbi Coprocessor (VCP2)
       • Bit-rate Coprocessor (BCP)
       • 2x Rake Search Accelerator (RSA)
     • Device-specific Interfaces:
       • 6x Antenna Interface 2 (AIF2)
     [Block diagram, C6670: 4 cores @ 1.0 GHz / 1.2 GHz; 32KB L1P and 32KB L1D cache/RAM; 1024KB L2 cache/RAM; 2MB MSM SRAM; 64-bit DDR3 EMIF; coprocessors RSA x2, VCP2 x4, TCP3d x2, TCP3e, FFTC x2, BCP]

  17. Device-Specific: C667x General Purpose
     • Device-specific Interfaces:
       • 2x Telecommunications Serial Port (TSIP)
       • Asynchronous Memory Interface (EMIF16):
         • Connects memory up to 256 MB
         • Three modes:
           • Synchronized SRAM
           • NAND flash
           • NOR flash
     [Block diagram, C6671/C6672/C6674/C6678: 1 to 8 cores @ up to 1.25 GHz; 32KB L1P and 32KB L1D cache/RAM; 512KB L2 cache/RAM; 4MB MSM SRAM; 64-bit DDR3 EMIF]

  18. Device-Specific: C665x General Purpose
     • Device-specific Coprocessors:
       • Turbo Decoder Coprocessor (TCP3d)
       • 2x Viterbi Coprocessor (VCP2)
     • Device-specific Interfaces:
       • Asynchronous Memory Interface (EMIF16)
       • Universal Parallel Port (UPP)
       • 2x Multichannel Buffered Serial Ports (McBSP)
     • Device-specific Memory:
       • 1 MB Multicore Shared Memory (MSM SRAM)
       • 32-bit DDR3 Interface
     [Block diagram, C6655/57: 1 or 2 cores @ up to 1.25 GHz (2nd core on C6657 only); 32KB L1P and 32KB L1D cache/RAM; 1024KB L2 cache; 1MB MSM SRAM; 32-bit DDR3 EMIF; TCP3d; VCP2 x2]

  19. Device-Specific: C665x Power Optimized
     • Device-specific Interfaces:
       • Asynchronous Memory Interface (EMIF16)
       • Universal Parallel Port (UPP)
       • 2x Multichannel Buffered Serial Ports (McBSP)
     • Device-specific Memory:
       • 32-bit DDR3 Interface
     [Block diagram, C6654: 1 core @ 850 MHz; 32KB L1P and 32KB L1D cache/RAM; 1024KB L2 cache; 32-bit DDR3 EMIF]

  20. KeyStone C665x: Key HW Variations

  21. For More Information
     • For more information, refer to the C66x Getting Started page to locate the data manual for your KeyStone device.
     • View the complete C66x Multicore SOC Online Training for KeyStone Devices, including details on the individual modules.
     • For questions regarding topics covered in this training, visit the support forums at the TI E2E Community website.
