2. The Technology Catalyst
4. Parallel Programming Challenge: Ease of Use and Flexibility
We must now look at the applications that support HPC and ensure they take advantage of the technology designed into Nehalem (NHM). Is the code parallelized? Is it optimized for NHM?
For many years, applications could rely on rising clock frequency to improve performance. Now we are offering more cores to gain performance, so ISVs are taking their serial code and parallelizing it. This is a challenge Intel is trying to make as simple as possible.
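As a rough illustration of what "taking serial code and parallelizing it" involves, the sketch below splits one serial loop into chunks that independent workers reduce and then combine. The function names and the choice of a sum of squares are hypothetical, chosen only for the example. Python threads show the decomposition pattern but do not speed up CPU-bound work; production HPC code would more likely use OpenMP, Threading Building Blocks, or MPI from C, C++, or Fortran.

```python
from concurrent.futures import ThreadPoolExecutor


def serial_sum_of_squares(values):
    """Original serial version: one loop running on one core."""
    total = 0
    for v in values:
        total += v * v
    return total


def chunked(values, n_chunks):
    """Split values into at most n_chunks contiguous slices."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def parallel_sum_of_squares(values, workers=4):
    """Parallel version: each worker reduces one chunk, then the
    partial sums are combined into the final result."""
    chunks = chunked(values, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(serial_sum_of_squares, chunks)
        return sum(partials)


if __name__ == "__main__":
    data = list(range(1000))
    # The parallel result must match the serial one.
    print(serial_sum_of_squares(data) == parallel_sum_of_squares(data))
```

The key design point is that the chunks share no state, so the only coordination needed is the final combination of partial sums.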
5. The Technical Computing Architecture
Communicate how HPC and workstations work together.
Technical computing is a combination of workstations and high-performance computing (HPC) clusters. The technical computing industry is driven to deliver results fast. Workstations are required to create, and HPC clusters are needed to simulate and analyze. After you analyze the data, you can visualize the results to enable faster innovation and discovery.
6. Insatiable Demand for Performance, Density, and Efficiency
The insatiable desire for performance will continue through the foreseeable future. The graph on the left takes the #1 systems on the Top500 list and projects their performance through 2029. From weather modeling to understanding how detergent flows through a dishwasher, the need for compute performance isn’t going away any time soon.
To deliver the added performance, we need to be aware of the power it requires. Intel strives to deliver more performance with each generation at similar or lower power than previous-generation platforms. To meet the performance needs of 2029, Intel will continue to explore ways to deliver greater performance within power envelopes similar to or lower than today’s.
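A projection of this kind can be sketched as a simple exponential extrapolation. The doubling period and baseline used below are placeholder assumptions for illustration, not figures from the slide or from the Top500 list. The second helper shows why power matters: if the power envelope stays flat, performance per watt has to grow by the same factor as performance.

```python
def project_performance(base_perf, base_year, target_year, doubling_years):
    """Extrapolate performance, assuming it doubles every `doubling_years` years."""
    elapsed = target_year - base_year
    return base_perf * 2 ** (elapsed / doubling_years)


def required_efficiency_gain(perf_multiple, power_multiple=1.0):
    """Factor by which performance per watt must improve to reach
    `perf_multiple` times the performance at `power_multiple` times the power."""
    return perf_multiple / power_multiple


if __name__ == "__main__":
    # Hypothetical inputs: normalized baseline of 1.0 and an assumed
    # doubling period of one year.
    multiple = project_performance(1.0, 2009, 2029, doubling_years=1.0)
    print(multiple, required_efficiency_gain(multiple))
```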
7. Data Center Convergence
Speaker notes
Data centers continue to face new demands from their customers: scalability on demand, pressure to deliver “IT as a service”, and the lowest possible TCO.
This creates challenges in manageability, security, flexibility, and affordability, so IT managers are looking for simplification.
Fortunately, a major technology transition is underway that will simplify IT managers’ lives: the convergence of compute, networking, and storage.
Compute: Intel’s heritage is compute performance, but with the Nehalem microarchitecture we brought intelligence to adapt to workloads with dynamic features like Turbo.
Storage: Standard building blocks are transforming the storage market, which puts more emphasis on the responsiveness of the storage compute engine.
In storage, we expect Intel architecture to drive 7 out of every 10 external storage systems shipped by the end of 2010.
Industry leaders like EMC have chosen Xeon as the architecture of choice for storage solutions with their recent announcement of EMC Symmetrix solutions.
Networking: The industry continues to drive bandwidth up and latency down via Ethernet, and the world is moving to converged fabrics.
Many people don’t know that Intel is the world’s leading supplier of LAN connections: we shipped over half a billion connections over the past 10 years.
Intel is committed to leading the transition to converged networks with our leadership LAN products and technologies.
Close: So Xeon is the cornerstone of next-generation Intelligent Data Centers.
9. Intel Processor Product Launch Roadmap
Principle:
Baseline - Sustaining $10-12M
Media to figure out how to use the rest with Burst
Content deal funded by BTL
Launch Bursts - Leverage WSJ Opinion Leader; JMP; BTL
50% in SEM and Contextual for hot topics, imperatives, Ravi projects
25% on client burst
25% on server burst
Cloud and Security via Global Engagement
Topic: Cloud Mar-client aware, Security (McAfee)
12. Sandy Bridge Server Platform Summary
New microarchitecture on the 32nm process technology
1 Lower platform power claim based on a Xeon® 5600 CPU and a Sandy Bridge-EP CPU with the same TDP specification and comparable platform configurations. The platform power reduction is primarily attributed to the TDP reduction from a two-chip solution based on the Intel 5520 chipset and ICH-10R down to a one-chip south bridge solution (Patsburg chip) on the Sandy Bridge platform.
13. Xeon® E5 Platform Roadmap
*For a full list of technologies, see WW45 NDA Data Center Group Roadmap on SMCR.Intel.com
14. Xeon® 2S Platform Comparison
15. Romley EP (Socket R) vs. Romley EN (Socket B2)
16. Intel® Advanced Vector Extensions (Intel® AVX)
Extension to the 128-bit SSE instruction set
Support for 256-bit wide vectors and SIMD register set
Targets floating point operations
Benefits these applications:
Engineering
Visual processing/recognition
Data-mining
Physics
Cryptography
VADDPS instruction (adds packed single-precision floating-point values), allowing you to align data; ymm1 and ymm2 are AVX registers.
SSE: Streaming SIMD Extensions
SIMD: Single Instruction, Multiple Data
Legacy SSE was 128 bits wide. The new AVX instructions have been widened to 256 bits, are targeted at floating-point-intensive operations, and can double performance on such code.
Improves performance via wider vectors
This results in better management of data and benefits general-purpose applications such as image and audio/video processing, scientific simulations, financial analytics, and 3D modeling and analysis.
Non-destructive source operands: previously a register copy was needed first. This means less code and makes vectorization and optimization easier for the compiler. AVX needs Linux 2.6.30 or later, Windows 7 SP1 or later, or Windows Server 2008 R2 SP1 or later.
Compile with the right switch, or rewrite the assembly or intrinsics.
XMM registers
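The two ideas in these notes, wider vectors and non-destructive source operands, can be modeled in a few lines. This is only a conceptual model in Python, not real SIMD code. The function names are hypothetical, and an actual implementation would use compiler auto-vectorization, intrinsics, or assembly. It shows that a 256-bit register holds twice as many 32-bit floats as a 128-bit one, so half as many add instructions are needed, and that a three-operand add leaves both inputs intact.

```python
import math

SSE_BITS = 128      # legacy SSE (XMM) register width
AVX_BITS = 256      # AVX (YMM) register width
FLOAT32_BITS = 32   # single-precision element size


def lanes(register_bits, element_bits=FLOAT32_BITS):
    """Number of elements that fit in one vector register."""
    return register_bits // element_bits


def vector_adds_needed(n_elements, register_bits):
    """Vector add instructions needed to add two arrays of n_elements floats."""
    return math.ceil(n_elements / lanes(register_bits))


def sse_style_add(dest, src):
    """Two-operand form: dest = dest + src, so dest is overwritten."""
    for i in range(len(dest)):
        dest[i] += src[i]


def avx_style_add(src1, src2):
    """Three-operand form: returns a new destination and leaves
    both sources unchanged (non-destructive)."""
    return [a + b for a, b in zip(src1, src2)]


if __name__ == "__main__":
    print(lanes(SSE_BITS), lanes(AVX_BITS))
    print(vector_adds_needed(1024, SSE_BITS), vector_adds_needed(1024, AVX_BITS))
```

With the two-operand form, code that still needs the original value of `dest` must copy it first, which is the extra register copy the notes refer to.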
17. Intel Technology is Changing HPC: TCO, Performance, Reliability
SSDs
Extreme Performance - >100x IOPS performance gains vs. 15k HDD
Power Efficient - >5x lower power vs. 15k HDD
Increased Reliability - 2.0M hours MTBF vs. 1.2M hours MTBF for 7.2K WD RE2
Reduce System Cost - Replace HDD and memory with SSDs
10GbE
Extreme Performance - iWARP provides low latency over 10GbE, with low overhead and high bandwidth
Increased Reliability - Over 25 years delivering leading Ethernet products; broad OS support; designed for multi-core
Power Efficient - Low-power design, <3.5W
Lower TCO - Consolidated fabric through industry-standardized technology
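To put the MTBF figures quoted for the SSD and the 7.2K HDD in more familiar terms, they can be converted to an approximate annualized failure rate. The sketch below assumes a constant failure rate (an exponential model) and a device powered on all year. Both are simplifying assumptions that do not come from the slide.

```python
import math

HOURS_PER_YEAR = 8760  # 24 hours * 365 days, assuming continuous operation


def annualized_failure_rate(mtbf_hours):
    """Probability that a device fails within one year, assuming a
    constant failure rate of 1 / MTBF (exponential model)."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


if __name__ == "__main__":
    for label, mtbf in (("SSD, 2.0M hours", 2.0e6), ("HDD, 1.2M hours", 1.2e6)):
        print(f"{label}: {annualized_failure_rate(mtbf):.2%}")
```

Under these assumptions the two MTBF values correspond to roughly 0.44% and 0.73% per year.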
18. Scaling Performance Forward
One Development Environment: Multi- to Many-core
Debugging and tuning become equally important to carry forward to many-core. This is the heterogeneous tool set now, as many-core applications scale to terascale on clients, and these terascale nodes make up clusters of petascale machines.
Better performance, multi-core advancements, and support for Intel® Core™ i7 processors. New versions of the software tools were released in Nov. 08.
The first step in the cycle is to gain insight into your code by analyzing it with tools such as Intel® VTune Performance Analyzer and/or Thread Checker.
Next, you parallelize your code with Intel tools such as Intel® Threading Building Blocks, compilers, and performance libraries.
After you parallelize your code, you review the results for correctness and confidence. If you do not achieve the results you expect, you can begin the cycle again with the insight step. Once you have achieved the desired results, you can then perform a final optimization to ensure peak performance with Intel® VTune Performance Analyzer and Thread Profiler.
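The cycle described in these notes (measure, parallelize, check correctness, repeat) can be sketched in miniature as follows. Standard-library timing stands in for a real profiler such as VTune, and all function names are hypothetical. Python threads will not show a real speedup on CPU-bound work, so the value of the sketch lies in the verify-before-tuning structure, not in the timings.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed(fn, *args):
    """Insight step: run fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def serial_version(values):
    """Reference implementation whose results are trusted."""
    return [v * v for v in values]


def parallel_version(values, workers=4):
    """Parallelize step: the same computation spread over worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda v: v * v, values))


def tuning_step(values):
    """Verify step: compare the parallel results with the reference,
    then report timings to guide the next pass through the cycle."""
    expected, t_serial = timed(serial_version, values)
    actual, t_parallel = timed(parallel_version, values)
    speedup = t_serial / t_parallel if t_parallel > 0 else float("inf")
    return {"correct": actual == expected, "speedup": speedup}


if __name__ == "__main__":
    print(tuning_step(list(range(10_000))))
```

If `correct` is false, the cycle restarts at the insight step before any further optimization is attempted.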
19. Solving Your HPC Challenges
Intelligent performance helps deliver a lower TCO as well as ~3x the performance of previous-generation processors.
Intel Software tools enable users to easily optimize their software to maximize performance on current and future generation IA hardware
Intel Cluster Ready makes deploying a cluster easy