
CMPT 250 Computer Architecture

CMPT 250 Computer Architecture. Instructor: Yuzhuang Hu yhu1@cs.sfu.ca. Another Design Example: PIG (Chapter 7-10). PIG is a single-die game. Two players take turns rolling the die. When a 1 is rolled, the current turn total becomes 0. The first player to reach or exceed 100 wins.


Presentation Transcript


  1. CMPT 250 Computer Architecture Instructor: Yuzhuang Hu yhu1@cs.sfu.ca

  2. Another Design Example: PIG (Chapter 7-10) • PIG is a single-die game. Two players take turns rolling the die. When a 1 is rolled, the current turn total becomes 0. The first player to reach or exceed 100 wins. [Figure: game interface labeled Turn Total, Player 1, Player 2, ROLL, HOLD, NEW GAME, RESET.]
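The rules above can be sketched in a few lines of Python. This is only an illustrative model, not the hardware design from the slides: the function name `play_turn` and the idea of passing the rolls in as a list are hypothetical, and it assumes that holding adds the turn total to the player's score and that rolling a 1 ends the turn with nothing banked.

```python
def play_turn(rolls, score):
    """Play one turn of PIG and return the player's score afterwards.

    rolls: die values (1..6) rolled this turn, in order. The player is
           assumed to HOLD after the last value (hypothetical model).
    score: the player's banked total before the turn.
    """
    turn_total = 0
    for die in rolls:
        if die == 1:
            return score        # a 1 wipes out the turn total
        turn_total += die       # any other value accumulates
    return score + turn_total   # HOLD: bank the turn total


def has_won(score, goal=100):
    """A player wins on reaching or exceeding the goal."""
    return score >= goal


print(play_turn([3, 4, 5], 10))   # 22
print(play_turn([6, 6, 1], 50))   # 50
```

Passing the rolls in explicitly keeps the sketch deterministic; a real game would draw them from a random source.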

  3. Inputs, Outputs, and Registers of PIG

  4. State-Machine Diagram for PIG [Figure: state-machine diagram. States: INIT, BEGIN, ROL, ONE, ROH, TEST, WIN. Transition inputs: RESET, ROLL, HOLD, NEWGAME, DIE=1, DIE<>1. Register transfers and outputs shown: DIE<-000, FP<-0; TR1<-0, TR2<-0, CP<-FP; SUR<-0; if (DIE=110) DIE<-001 else DIE<-DIE+1; SUR<-SUR+DIE; TR1<-TR1+SUR or TR2<-TR2+SUR, selected by CP; updates to CP and FP; TR1 and TR2 compared against 1100100 (decimal 100), selected by CP; P1 or P2 set to BLINK in WIN, selected by CP. Defaults: P1 and P2 are driven from CP. Complement overbars on CP and FP are not recoverable from this transcript.]

  5. Algorithmic State Machine (ASM) • An ASM chart is like a state diagram but less formal, and thus easier to understand. An ASM chart consists of a set of blocks. Each block can be viewed as a directed graph with three types of nodes. • State Box (node). • Binary Decision Box (node). • Conditional Action Box (node).

  6. ASM contd. • State box: represented by a labeled rectangle. It may contain several register transfer statements or variables. • Binary decision box: represented by a hexagon. It indicates that a condition needs to be tested. It is similar to the input condition defined for State-Machine Diagrams. • Conditional output action box: represented by an oval box. It contains several register transfer statements or variables. It is similar to the output condition defined for State-Machine Diagrams.

  7. Boxes in ASM Charts [Figure: (a) State box: state name and register transfer statements (Moore type). (b) Binary decision box: condition expression with exits 0 (False) and 1 (True). (c) Conditional output action box: conditional outputs or actions (Mealy type).]

  8. A Design Example using ASM • Problem: find the sum of N numbers.
     algorithm sum_n(S)
     input: a list S consisting of N numbers.
     output: the sum of the N numbers in S.
     sum = 0;
     N = get_input();
     while ( N > 0 )
         sum = sum + get_input();
         N = N - 1;
     endwhile;
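A runnable version of this algorithm might look like the following Python sketch. It assumes, as the pseudocode suggests, that the first input value is N and the next N values are the numbers to add; modeling `get_input()` as `next()` on an iterator is a choice made here for illustration.

```python
from typing import Iterator


def sum_n(inputs: Iterator[int]) -> int:
    """Read N, then read and add N numbers from the input stream."""
    total = 0                   # sum = 0
    n = next(inputs)            # N = get_input()
    while n > 0:
        total += next(inputs)   # sum = sum + get_input()
        n -= 1                  # N = N - 1
    return total


print(sum_n(iter([3, 10, 20, 30])))  # 60
```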

  9. Interface of the Sum Machine

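  10. ASM Diagram for the Sum Machine [Figure: ASM chart. States: S0, S1, S2. Signals shown: rdy, data, ack. Decision boxes with 0 and 1 exits, including a test of N=0. Register transfers shown: Sum <- 0, N <- in_bus, N <- N-1, Sum <- Sum+in_bus.]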

  11. Digital System Design • In most digital system designs, we partition the system into two types of modules: a datapath and a control unit. [Figure: block diagram of a control unit and a datapath, labeled with control inputs, control outputs, control signals, status signals, data inputs, and data outputs.]

  12. Design from ASM [Figure: block diagram with blocks labeled processor, CTRL PTS, SELECTOR, and SEQ, and signals labeled Data in, Data out, status, control pts, Clock, and External control inputs.]

  13. Datapath of the Sum Machine [Figure: datapath diagram with labels in_bus, out_bus, SUM, N, FA, ls, ln, cs, dn, N=0, eq0, and overflow.]

  14. ASM Design Guidelines • Write an algorithm for the problem. • Translate the algorithm to a sequence of register transfer statements. • Group adjacent independent register transfer statements. • Draw the ASM diagram, and introduce control signals.

  15. Datapath Definition (Chapter 9) • The datapath is defined by three basic components: • A set of registers. • The micro-operations performed on data stored in the registers. • The control interface.

  16. A Generic Datapath • Four parallel-load registers • Two mux-based register selectors • Register destination decoder • Mux B for external constant input • Buses A and B with external address and data outputs • ALU and Shifter with Mux F for output select • Mux D for external data input • Logic for generating status bits V, C, N, Z

  17. Datapath Examples • What to do for R1 <- R2 + R3? • A select, choose R2. • B select, choose R3. • G select, choose A+B. • MF select, choose the ALU output. • MD select, choose the Mux F output. • Destination select, choose R1. • Load enable, to enable R1.

  18. Other Micro-operation Alternatives • MF=1: shift operation. • MB=1: using a constant. • Load enable=0: no register loading, e.g. when providing an address out or data out. • MD=1: read from memory.
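The select signals from the last two slides can be modeled as one function call per micro-operation. The sketch below is an assumption-laden illustration rather than the slides' circuit: the 8-bit width, the function name `datapath_step`, the fixed ALU operation (A+B only), and the shifter behaviour (shift Bus B left by one) are all hypothetical choices.

```python
MASK = 0xFF  # assumed 8-bit registers; the slides do not give a width


def datapath_step(regs, a_sel, b_sel, dest, load,
                  mb=0, const_in=0, mf=0, md=0, data_in=0):
    """Apply one micro-operation to a four-register file (a list)."""
    bus_a = regs[a_sel]                          # A select
    bus_b = const_in if mb else regs[b_sel]      # MB=1: use the constant
    alu_out = (bus_a + bus_b) & MASK             # G select fixed to A+B here
    shift_out = (bus_b << 1) & MASK              # hypothetical shifter operation
    f = shift_out if mf else alu_out             # MF=1: take the shifter output
    d = data_in if md else f                     # MD=1: take external data in
    if load:                                     # load enable
        regs[dest] = d
    return bus_a, bus_b                          # address out, data out


regs = [0, 0, 5, 7]
datapath_step(regs, a_sel=2, b_sel=3, dest=1, load=1)   # R1 <- R2 + R3
print(regs)  # [0, 12, 5, 7]
```

With `load=0` the register file is left untouched and only the two bus values are returned, which corresponds to the address-out or data-out case listed above.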

  19. The Arithmetic/Logic Unit • ALU performs arithmetic/logic micro-operations.

  20. The Arithmetic Circuit • The arithmetic circuit consists of a parallel n-bit adder and selection logic on its B input.

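  21. Function Table for Arithmetic Circuit • It is easy to see that Yi = Bi·S0 + Bi'·S1, where Bi' is the complement of Bi. That is, S1S0 = 00, 01, 10, 11 select Yi = 0, Bi, Bi', 1 respectively.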
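A small Python sketch shows how this selection logic turns one adder into several arithmetic operations. It assumes the second term of the Yi equation uses the complement of Bi (the usual textbook form of this circuit), and that the adder computes A + Y + Cin; the function name and the default 4-bit width are hypothetical.

```python
def arithmetic_circuit(a, b, s1, s0, cin, n=4):
    """Return (G, carry_out) where G = A + Y + Cin on n bits."""
    mask = (1 << n) - 1
    # Y_i = B_i*S0 + (not B_i)*S1, applied to all bits at once.
    y = ((b if s0 else 0) | ((~b & mask) if s1 else 0)) & mask
    total = a + y + cin
    return total & mask, total >> n


# With A = 5 and B = 3 on 4 bits:
print(arithmetic_circuit(5, 3, 0, 0, 0))  # (5, 0)  transfer A
print(arithmetic_circuit(5, 3, 0, 0, 1))  # (6, 0)  increment A
print(arithmetic_circuit(5, 3, 0, 1, 0))  # (8, 0)  A + B
print(arithmetic_circuit(5, 3, 1, 0, 1))  # (2, 1)  A - B (two's complement)
print(arithmetic_circuit(5, 3, 1, 1, 0))  # (4, 1)  decrement A
```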

  22. Function Table for ALU

  23. More on ALU • The ALU has a fairly high number of logic levels and contributes to propagation delay in the circuit. In particular, a simple ripple-carry adder can incur a large propagation delay, because the carry into each bit depends on all lower bit positions.

  24. Carry Look-Ahead • Carry look-ahead is designed to reduce the carry propagation delay in the ALU. • For a single-bit full adder: • Generate a carry out when x=y=1: g=x·y. • Propagate the carry in to the carry out when exactly one of x and y is 1: p=x xor y. • In terms of p and g, the carry out is co=g+p·ci.

  25. Full Adder With Ports p and g

  26. A Slight Optimization • Redefine p to be x+y. • We can do this for the following reason: x+y and x xor y differ only when x=y=1, so that is the only case to check. When x=y=1, g=1, so co=g+p·ci is 1 whether p is 0 or 1.
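The argument can also be checked exhaustively, since a full adder has only eight input combinations. The following Python sketch (function names are hypothetical) compares co = g + p·ci under both definitions of p against the carry of an ordinary one-bit addition.

```python
from itertools import product


def carry_out(x, y, ci, p):
    """co = g + p*ci with g = x*y and a caller-supplied propagate bit."""
    g = x & y
    return g | (p & ci)


def definitions_agree():
    """Check p = x xor y and p = x or y against a real full-adder carry."""
    for x, y, ci in product((0, 1), repeat=3):
        reference = (x + y + ci) >> 1             # carry of a 1-bit addition
        co_xor = carry_out(x, y, ci, x ^ y)
        co_or = carry_out(x, y, ci, x | y)
        if not (co_xor == co_or == reference):
            return False
    return True


print(definitions_agree())  # True
```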

  27. Computing the Carry In for Each Bit • ci(1)=co(0)=g(0)+p(0)·ci(0). • ci(2)=co(1)=g(1)+p(1)·g(0)+p(1)·p(0)·ci(0). • ci(3)=co(2)=g(2)+p(2)·g(1)+p(2)·p(1)·g(0)+p(2)·p(1)·p(0)·ci(0). • ci(4)=co(3)=g(3)+p(3)·g(2)+p(3)·p(2)·g(1)+p(3)·p(2)·p(1)·g(0)+p(3)·p(2)·p(1)·p(0)·ci(0).
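These expanded carry equations translate directly into code. The sketch below is a behavioural Python model, not a gate-level design: `cla4` is a hypothetical name, p is taken as x OR y, and the sum bit is formed as x xor y xor ci (p cannot be reused for the sum once it is defined as OR).

```python
def cla4(x, y, ci0=0):
    """Add two 4-bit values with look-ahead carries; return (sum, carry_out)."""
    g = [(x >> i) & (y >> i) & 1 for i in range(4)]      # generate bits
    p = [((x >> i) | (y >> i)) & 1 for i in range(4)]    # propagate bits (OR form)
    c = [ci0, 0, 0, 0, 0]
    # Every carry depends only on p, g and ci(0), never on another carry.
    c[1] = g[0] | (p[0] & c[0])
    c[2] = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c[0])
    c[3] = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
            | (p[2] & p[1] & p[0] & c[0]))
    c[4] = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
            | (p[3] & p[2] & p[1] & g[0])
            | (p[3] & p[2] & p[1] & p[0] & c[0]))
    s = 0
    for i in range(4):
        s |= (((x >> i) ^ (y >> i) ^ c[i]) & 1) << i     # sum bit i
    return s, c[4]


print(cla4(9, 8))      # (1, 1)  because 9 + 8 = 17 = 16 + 1
print(cla4(7, 0, 1))   # (8, 0)
```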

  28. Faster Four-bit Addition • p(3:0) and g(3:0) are available after 1 gate delay. • co(3:0) are available after 2 more gate delays. • s(3:0) are available after 1 more gate delay. • In total 1+2+1=4 gate delays.

  29. A 4-Bit CLA Adder

  30. Explanation of P and G • Consider the MSB position of a bit vector (3:0). Under what condition will a carry be generated out of that position? Under what condition will a carry be propagated through that position? • Define • G=g(3)+p(3)·g(2)+p(3)·p(2)·g(1)+p(3)·p(2)·p(1)·g(0) • P=p(3)·p(2)·p(1)·p(0)

  31. A 16-bit CLA Adder • Use the 4-bit CLA adder as a building block and design a second-level CLA logic to build a 16-bit CLA adder.
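One way to picture the two-level structure is the Python sketch below. It is a behavioural model under stated assumptions, not the slides' circuit: all function names are hypothetical, p is taken as x OR y, and the same look-ahead equations are reused at both levels, once on bit-level (p, g) pairs and once on the group (P, G) pairs.

```python
def bit_pg(x, y):
    """Bit-level propagate (OR form) and generate lists for a 4-bit slice."""
    g = [(x >> i) & (y >> i) & 1 for i in range(4)]
    p = [((x >> i) | (y >> i)) & 1 for i in range(4)]
    return p, g


def lookahead(p, g, c0):
    """Carries c1..c4 from four (p, g) pairs and the incoming carry c0."""
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    return [c1, c2, c3, c4]


def group_pg(p, g):
    """Group propagate P and group generate G of one 4-bit box."""
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    P = p[3] & p[2] & p[1] & p[0]
    return P, G


def cla16(x, y, c0=0):
    """Add two 16-bit values with two levels of look-ahead."""
    slices = [((x >> (4 * k)) & 0xF, (y >> (4 * k)) & 0xF) for k in range(4)]
    pg = [bit_pg(a, b) for a, b in slices]
    groups = [group_pg(p, g) for p, g in pg]
    big_p = [grp[0] for grp in groups]
    big_g = [grp[1] for grp in groups]
    # Second level: carries into bits 4, 8, 12 and the final carry out.
    box_carry = [c0] + lookahead(big_p, big_g, c0)
    total = 0
    for k, ((a, b), (p, g)) in enumerate(zip(slices, pg)):
        # First level: remaining carries inside each 4-bit box.
        c = [box_carry[k]] + lookahead(p, g, box_carry[k])[:3]
        for i in range(4):
            total |= (((a >> i) ^ (b >> i) ^ c[i]) & 1) << (4 * k + i)
    return total, box_carry[4]


print(cla16(0xFFFF, 0x0001))  # (0, 1)
```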

  32. Delay of the 16-bit CLA Adder • p(15:0) and g(15:0) are available after one gate delay. • It takes 2 more gate delays to produce the P and G signals for each of the 4-bit boxes. • It takes 2 more gate delays for the second layer to produce ci(12), ci(8) and ci(4). • It takes 2 more gate delays for the first layer to produce the remaining carry-in values. • It takes one more gate delay for the sum. • In total 1+2+2+2+1=8 gate delays. • In general, for an n-bit adder built from 4-bit boxes, the total delay is 1+2+4(⌈(log2 n)/2⌉-1)+1 gate delays, which gives 8 for n=16.
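The step-by-step count above can be written as a short Python function. This is a sketch under assumptions: it takes 4-bit boxes, 2 gate delays for each look-ahead stage, and extends the same counting to more levels (for example 64 bits), which the slides do not work through. The name `cla_delay` is hypothetical.

```python
def cla_delay(n, group=4):
    """Gate delays of an n-bit multi-level CLA adder built from 4-bit boxes."""
    levels = 1
    width = group
    while width < n:             # each extra level covers 4x more bits
        width *= group
        levels += 1
    delay = 1                    # bit-level p and g
    delay += 2 * (levels - 1)    # group P and G, computed going up
    delay += 2 * levels          # carries, computed coming back down
    delay += 1                   # sum bits
    return delay


print(cla_delay(4))    # 4
print(cla_delay(16))   # 8
print(cla_delay(64))   # 12
```

Under these assumptions each additional level of look-ahead costs 4 gate delays, so the delay grows with the logarithm of n rather than linearly as in a ripple-carry adder.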

  33. Thanks!
