CS 3501 - Chapter 9
Dr. Clincy, Professor of CS
• We cover Lecture 21
• Irma Online Lecture Fragment (Lecture 22) was posted this past Tuesday
• Exam 4 on Tuesday (Nov 14th), covering Ch 4, Sections 5.1 and 5.2, only the topics I will cover in Ch 9, and only the topics covered in the slides regarding hardware security; open book
Chapter 9
We have so far studied only the simplest models of computer systems: classical single-processor von Neumann systems. This chapter presents a number of different approaches to computer organization and architecture. Some of these approaches are in place in today’s commercial systems. Others may form the basis for the computers of the future.
ARCHITECTURE APPROACHES
• Complex Instruction Set Computer (CISC)
  • Large number of instructions, of variable length
  • Instructions have complex layouts
  • A single instruction can be complex and perform multiple operations
  • ISSUE: a small subset of CISC instructions slows the system down
  • EXAMPLE: Intel x86 architectures
• Reduced Instruction Set Computer (RISC)
  • Because of CISC’s issue, designers returned to a less complicated architecture and hardwired a small but complete instruction set
  • Hardwired instructions are much faster
  • Compiler is responsible for producing efficient code for the Instruction Set Architecture (ISA)
  • Simple instructions that execute faster
  • Each instruction performs only one operation
  • All instructions are the SAME size
  • EXAMPLE: MIPS family of CPUs (Microprocessor without Interlocked Pipeline Stages). The Pentium family implements the x86 ISA, so it is CISC at the instruction-set level, although from the Pentium Pro onward x86 instructions are translated internally into RISC-like micro-operations.
RISC vs CISC Analogy
ENGLISH: "Today is your birthday" (19 symbols)
CHINESE: 今天是你的生日 (7 symbols)
RISC Vs CISC Machines
• The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
• RISC systems access memory only with explicit load and store instructions.
• In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC Vs CISC Machines
The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
• RISC systems shorten execution time by reducing the clock cycles per instruction.
• CISC systems improve performance by reducing the number of instructions per program.
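The performance equation (execution time = instructions per program × cycles per instruction × time per cycle) can be tried out with a short Python sketch. The instruction counts, CPI values, and clock period below are hypothetical numbers chosen only to show the trade-off; they do not describe any real processor.

```python
def cpu_time(instructions, cycles_per_instruction, seconds_per_cycle):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * seconds_per_cycle

# Hypothetical figures (assumptions for illustration only).
CLOCK_PERIOD = 1e-9  # 1 ns per cycle, i.e. a 1 GHz clock

# CISC-style: fewer instructions, more cycles per instruction.
cisc_time = cpu_time(instructions=1_000_000, cycles_per_instruction=6.0,
                     seconds_per_cycle=CLOCK_PERIOD)

# RISC-style: more instructions, fewer cycles per instruction.
risc_time = cpu_time(instructions=2_500_000, cycles_per_instruction=1.2,
                     seconds_per_cycle=CLOCK_PERIOD)

print(f"CISC: {cisc_time * 1e3:.1f} ms, RISC: {risc_time * 1e3:.1f} ms")
```

With these made-up inputs the RISC-style program runs 2.5 times as many instructions yet still finishes sooner, because each factor of the equation can be traded against the others.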
RISC Vs CISC Machines
• The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
• RISC depends heavily on the compiler to generate efficient machine code.
• With fixed-length instructions, RISC lends itself to pipelining and speculative execution (execution is more predictable).
• The more complex and variable-length instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time, and execution is less predictable.
RISC Vs CISC Machines
Consider the program fragments:

    CISC                RISC
    mov ax, 10          mov ax, 0
    mov bx, 5           mov bx, 10
    mul bx, ax          mov cx, 5
                 Begin: add ax, bx
                        loop Begin

The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
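The cycle counts on this slide can be checked with a small Python sketch that steps through both fragments. The per-instruction costs (1 cycle for mov, add, and loop; 30 cycles for mul) are the slide's own example figures; the function names are hypothetical.

```python
# Per-instruction cycle costs, taken from the slide's example.
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}

def run_cisc():
    """mov ax,10 / mov bx,5 / mul bx,ax. Returns (product, total cycles)."""
    cycles = 0
    ax = 10
    cycles += CYCLES["mov"]
    bx = 5
    cycles += CYCLES["mov"]
    bx = bx * ax                 # one complex multiply instruction
    cycles += CYCLES["mul"]
    return bx, cycles

def run_risc():
    """Multiply by repeated addition. Returns (product, total cycles)."""
    cycles = 0
    ax = 0
    cycles += CYCLES["mov"]
    bx = 10
    cycles += CYCLES["mov"]
    cx = 5
    cycles += CYCLES["mov"]
    while True:
        ax += bx                 # add ax, bx
        cycles += CYCLES["add"]
        cx -= 1                  # loop: decrement cx, branch while cx != 0
        cycles += CYCLES["loop"]
        if cx == 0:
            break
    return ax, cycles

cisc_result, cisc_cycles = run_cisc()
risc_result, risc_cycles = run_risc()
print(cisc_result, cisc_cycles)  # 50 32
print(risc_result, risc_cycles)  # 50 13
```

Both fragments compute 10 × 5 = 50; only the number of instructions and the cycles each one costs differ.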
RISC Machines
• By reducing instruction complexity, simpler chips are needed. As a result, transistors heavily used by CISC can be used in more innovative ways, such as pipelines, cache, and registers.
• CISC procedure calls and parameter passing involve considerable effort and resources:
  • Saving a return address
  • Preserving register values
  • Passing parameters by pushing them on a stack or using registers
  • Branching to the subroutine
  • Executing the subroutine
  • Saving parameter value modifications once control returns to the calling program
  • Restoring previous register values
• Freeing up registers from the above tasks gives RISC machines hundreds of registers with which to create register environments (fewer shared registers, more dedicated registers).
• Because of their load-store ISAs, RISC architectures require a large number of CPU registers. These registers provide fast access to data during sequential program execution.
• Registers can also be employed to reduce the overhead typically caused by passing parameters to subprograms. Instead of pulling parameters off a stack, the subprogram is directed to use a subset of registers.
RISC Vs CISC Continued
RISC | CISC
Multiple register sets | Single register set
Three operands per instruction | One or two register operands per instruction
Parameter passing through register windows | Parameter passing through memory
Single-cycle instructions | Multiple-cycle instructions
Hardwired control | Microprogrammed control
Highly pipelined | Less pipelined
RISC Vs CISC
RISC | CISC
Simple instructions, few in number | Many complex instructions
Fixed-length instructions | Variable-length instructions
Complexity in compiler | Complexity in microcode
Only LOAD/STORE instructions access memory | Many instructions can access memory
Few addressing modes | Many addressing modes
RISC Vs CISC Machines Today
• It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures today.
• Some RISC systems provide more extravagant instruction sets than some CISC systems.
• Some systems combine both approaches.
• With the rise of embedded systems and mobile computing, the terms RISC and CISC have lost their significance.
• The RISC Vs CISC debate started when chip area and processor design complexity were the issues; now energy and power are the issues.
• The two top competitors today are ARM and Intel. Intel focuses on performance and ARM focuses on efficiency (ARM is a British company; the name stands for Advanced RISC Machines).
Flynn’s Taxonomy
• Many attempts have been made to come up with a way to categorize computer architectures.
• Flynn’s Taxonomy has been the most enduring of these, despite having some limitations.
• Flynn’s Taxonomy takes into consideration the number of processors and the number of data streams that flow into the processor.
• A machine can have one or many processors that operate on one or many data streams.
• In Flynn’s Taxonomy, the architecture is driven by the instructions’ characteristics: all processor activities are determined by a sequence of program code, and the program code acts on the data.
• In data-driven (dataflow) architectures, the sequence of processor events is based on the data’s characteristics, not the instructions’ characteristics.
Flynn’s Taxonomy
The four combinations of multiple processors and multiple data paths are described by Flynn as:
• SISD: Single instruction stream, single data stream. These are classic uniprocessor systems.
• SIMD: Single instruction stream, multiple data streams. Executes a single instruction using multiple computations at the same time (in parallel), making use of data-level parallelism. All processors execute the same instruction simultaneously.
• MIMD: Multiple instruction streams, multiple data streams. These are today’s parallel architectures. Multiple processors function independently and asynchronously, executing different instructions on different data.
• MISD: Multiple instruction streams operating on a single data stream. Many processors perform different operations on the same data stream.
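Because the four categories depend only on whether there is one or more than one instruction stream and data stream, the classification can be written as a tiny Python function. This is just a sketch of the naming rule; the function name `flynn_class` is hypothetical.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn category for the given numbers of streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    i = "S" if instruction_streams == 1 else "M"   # single or multiple instruction
    d = "S" if data_streams == 1 else "M"          # single or multiple data
    return f"{i}I{d}D"

print(flynn_class(1, 1))   # SISD: classic uniprocessor
print(flynn_class(1, 8))   # SIMD: one instruction applied to eight data streams
print(flynn_class(4, 4))   # MIMD: independent processors, independent data
print(flynn_class(4, 1))   # MISD: several operations on one data stream
```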
Flynn’s Taxonomy
Flynn’s Taxonomy falls short in a number of ways:
• First, there appears to be no need for MISD machines.
• Second, it assumes that parallelism is homogeneous. This assumption ignores the contribution of specialized processors.
• Third, it provides no straightforward way to distinguish architectures within the MIMD category. It doesn’t take into consideration how the processors are connected or how they interface with memory.
• One idea is to divide MIMD systems into those that share memory and those that don’t (distributed memory), as well as by whether the interconnections are bus-based or switch-based.
Flynn’s Taxonomy - MIMD
• Symmetric multiprocessors (SMP) and massively parallel processors (MPP) are MIMD architectures that differ in how they use memory.
• SMP systems share the same memory and MPP systems do not.
• An easy way to distinguish SMP from MPP is:
  MPP = many processors + distributed memory + communication via network
  SMP = fewer processors + shared memory + communication via memory
Flynn’s Taxonomy - MIMD
Other examples of MIMD architectures are found in distributed computing, where processing takes place collaboratively among networked computers.
• A network of workstations (NOW) uses otherwise idle systems to solve a problem.
• A collection of workstations (COW) is a NOW where one workstation coordinates the actions of the others.
• A dedicated cluster parallel computer (DCPC) is a group of workstations brought together to solve a specific problem.
• A pile of PCs (POPC) is a cluster of (usually) heterogeneous systems that form a dedicated parallel system.
Another name for these approaches is “Cluster Computing”.
Flynn’s Taxonomy - Recent Expansion - SPMD
• Flynn’s Taxonomy has been expanded to include SPMD (single program, multiple data) architectures.
• Recall that SIMD executes a single instruction using multiple computations at the same time (data parallelism).
• For SPMD, multiple independent processors execute different tasks of the same program at the same time (task parallelism).
• Supercomputers use this approach.
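The SPMD idea, one program whose copies each carry out a different task, can be sketched in Python. This is only an illustration under stated assumptions: threads stand in for independent processors (Python threads do not give true hardware parallelism), and the names `program`, `run_spmd`, and the rank-to-task mapping are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def program(rank, data):
    """The single program; the rank decides which task this copy performs."""
    if rank == 0:
        return "sum", sum(data)
    if rank == 1:
        return "max", max(data)
    return "min", min(data)

def run_spmd(data, workers=3):
    """Launch `workers` copies of the same program, one per rank."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Every worker runs program(); only the rank argument differs.
        return dict(pool.map(lambda rank: program(rank, data), range(workers)))

results = run_spmd([4, 9, 1, 7])
print(results)  # {'sum': 21, 'max': 9, 'min': 1}
```

Contrast this with SIMD, where every processing element would apply the same single instruction to its own data element in lockstep.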
Hardware security
You only need to know Hardware Security at the level it is covered in the slides. The slides will not be available for the exam, though.
Hardware security
What are Roots of Trust (RoT)?
• A RoT is a set of functions in the trusted computing module that is always trusted by the computer’s operating system (OS).
• RoTs serve as a separate compute engine controlling the cryptographic processor on a PC.
• Typically, RoTs are implemented in hardware rather than software because of hardware's immutability (it cannot be changed), smaller attack surface, and reliable behavior.
Reverse engineering (RE) of machine code for malware analysis
• Disassembly allows us to analyze malware or viruses without the source code.
• RE traces the program’s flow in order to understand the program's behavior.
• RE uncovers the virus and malware signatures used in antivirus programs (an antivirus program determines a signature of a program and compares that signature to a list of known bad signatures).
• Obfuscated code (code intentionally made hard for humans to understand) is hard to reverse engineer.
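The signature comparison described above can be sketched in Python. This is a simplified illustration, assuming the signature is a SHA-256 hash of the whole file; real antivirus products typically use byte patterns and heuristics as well. The sample bytes and the `KNOWN_BAD` list are hypothetical placeholders, not real malware data.

```python
import hashlib

def signature(program_bytes: bytes) -> str:
    """Assumed signature scheme: SHA-256 digest of the program's bytes."""
    return hashlib.sha256(program_bytes).hexdigest()

# Hypothetical list of known bad signatures (placeholder bytes, not real malware).
KNOWN_BAD = {signature(b"\x90\x90\xeb\xfe")}

def is_known_malware(program_bytes: bytes, known_bad=KNOWN_BAD) -> bool:
    """Compute the program's signature and look it up in the known-bad list."""
    return signature(program_bytes) in known_bad

print(is_known_malware(b"\x90\x90\xeb\xfe"))      # True: exact match
print(is_known_malware(b"\x90\x90\xeb\xfe\x90"))  # False: one extra byte changes the hash
```

The second call hints at why obfuscation is effective: even a trivial change to the code defeats an exact-match signature.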
Intel CHIPSEC
• CHIPSEC is a framework developed by Intel for analyzing the security of PC platforms, including hardware, system firmware (BIOS), and the configuration of platform components.
• It lets you create security test suites and security assessment tools for various low-level components and interfaces, and it provides forensic capabilities for firmware.
Hardware/Firmware worms
What is firmware?
• Firmware is a piece of software stored in read-only memory (ROM) or flash memory that comes with hardware.
CIH virus
• CIH is a computer virus written in 1998 by Chen Ing-hau, a Taiwanese college student. The virus overwrites the first megabyte of a hard drive as well as the PC BIOS firmware. It causes machines to hang or show the blue screen of death. Zeroing out the first megabyte of the hard drive destroys the partition table and master boot record (MBR), so the computer cannot boot.
• It hides itself in Portable Executable (PE) files under Windows 95, 98, and ME. It does not spread on Windows NT-based operating systems such as Windows XP, 7, 8, and 10.
Hardware/Firmware worms
Thunderstrike 2 worm
• Thunderstrike 2 is a firmware worm created by Xeno Kovah et al. to prove that Macs can be attacked via an Apple Thunderbolt Ethernet adapter.
• The worm hides in the Option ROM (firmware that is called by the system BIOS) of the Thunderbolt Ethernet adapter, which is loaded by the Mac’s firmware when the adapter is connected.
• An attacker could compromise the boot firmware on MacBooks via a phishing email or malicious web site. The compromised MacBook then spreads the worm to any devices connected to it. When the infected devices are plugged into other computers, those computers load the Option ROM, which triggers flashing of their boot firmware with the worm.
Why are firmware viruses hard to detect and remove?
• Most anti-virus software does not have the privilege to scan the firmware, simply because its own operation relies on the firmware.
• Moreover, infected firmware may disguise itself by reporting normal responses to any requests made by upper-level applications. This makes it difficult to detect.
• Also, the firmware is basically part of the hardware. Unless you explicitly flash (clean) and restore the firmware, re-installing the OS will not remove a worm sitting in firmware.
Latest hardware attack development
• Analog malicious hardware: exploiting the analog properties of circuits (e.g., replace digital gates with analog components that siphon charge from a target wire every time it toggles and store that charge in a capacitor; when the capacitor’s voltage exceeds some threshold, it deploys a payload). U of Michigan is doing extensive research.
• A “fabrication-time” attacker can leverage analog circuits to create a hardware attack that is small (i.e., requires as little as one gate) and stealthy (i.e., requires an unlikely trigger sequence before affecting a chip’s functionality). Third-party companies are typically the culprits.
Countermeasures:
• Fingerprinting: fabrication causes microscopic variations in chips that are unpredictable
• On-chip sensors: sensors can be used for monitoring
• Eliminate unused space: minimize or eliminate the space where hackers could place malicious code in firmware
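The capacitor-based trigger can be illustrated with a rough Python simulation. This is a toy model, not the circuit from the research: it assumes each toggle of the target wire adds a fixed amount of charge and that the capacitor leaks a fixed amount every clock cycle (the leakage is an assumption added here to show why the trigger sequence is unlikely in normal operation). All names and parameter values are hypothetical.

```python
def trigger_fires(toggles, charge_per_toggle=1.0, leak_per_cycle=0.25, threshold=4.0):
    """Return True if the stored charge ever reaches the threshold.

    toggles: iterable of booleans, one per clock cycle (True = target wire toggled).
    """
    charge = 0.0
    for toggled in toggles:
        if toggled:
            charge += charge_per_toggle              # siphon charge on each toggle
        charge = max(0.0, charge - leak_per_cycle)   # assumed leakage every cycle
        if charge >= threshold:
            return True                              # payload would deploy here
    return False

# Normal operation (assumed): the wire toggles only occasionally, so charge leaks away.
print(trigger_fires([True, False, False, False] * 20))  # False

# Attack sequence (assumed): the wire toggles on every cycle, so charge builds up.
print(trigger_fires([True] * 10))                        # True
```

In this model an attacker must force many toggles in quick succession, which is the kind of unlikely trigger sequence that keeps the attack hidden during ordinary testing.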