1 / 47

Branch instructions (1)

31. 28. 27. 25. 24. 23. 0. Cond 1 0 1 L Offset . Link bit 0 = Branch 1 = Branch with link. Condition field. Branch instructions (1). Branch : B{<cond>} label

Download Presentation

Branch instructions (1)

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. 31 28 27 25 24 23 0 Cond 1 0 1 L Offset Link bit 0 = Branch 1 = Branch with link Condition field Branch instructions (1) • Branch : B{<cond>} label • Branch with Link : BL{<cond>} sub_routine_label • The offset for branch instructions is calculated by the assembler: • By taking the difference between the branch instruction and the target address minus 8 (to allow for the pipeline). • This gives a 26 bit offset which is right shifted 2 bits (as the bottom two bits are always zero as instructions are word – aligned) and stored into the instruction encoding. • This gives a range of ± 32 Mbytes.

  2. Branch instructions (2) • When executing the instruction, the processor: • shifts the offset left two bits, sign extends it to 32 bits, and adds it to PC. • Execution then continues from the new PC, once the pipeline has been refilled. • The "Branch with link" instruction implements a subroutine call by writing PC-4 into the LR of the current bank. • i.e. the address of the next instruction following the branch with link (allowing for the pipeline). • To return from subroutine, simply need to restore the PC from the LR: • MOV pc, lr • Again, pipeline has to refill before execution continues. • The "Branch" instruction does not affect LR. • Note: Architecture 4T offers a further ARM branch instruction, BX • See Thumb Instruction Set Module for details.

  3. Data processing Instruction Format

  4. Data processing Instructions • Largest family of ARM instructions, all sharing the same instruction format. • Contains: • Arithmetic operations • Comparisons (no results - just set condition codes) • Logical operations • Data movement between registers • Remember, this is a load / store architecture • These instruction only work on registers, NOT memory. • They each perform a specific operation on one or two operands. • First operand always a register - Rn • Second operand sent to the ALU via barrel shifter. • We will examine the barrel shifter shortly.

  5. Arithmetic Operations • Operations are: • ADD operand1 + operand2 • ADC operand1 + operand2 + carry • SUB operand1 - operand2 • SBC operand1 - operand2 + carry -1 • RSB operand2 - operand1 • RSC operand2 - operand1 + carry - 1 • Syntax: • <Operation>{<cond>}{S} Rd, Rn, Operand2 • Examples • ADD r0, r1, r2 • SUBGT r3, r3, #1 • RSBLES r4, r5, #5

  6. Comparisons • The only effect of the comparisons is to • UPDATE THE CONDITION FLAGS. Thus no need to set S bit. • Operations are: • CMP operand1 - operand2, but result not written • CMN operand1 + operand2, but result not written • TST operand1 AND operand2, but result not written • TEQ operand1 EOR operand2, but result not written • Syntax: • <Operation>{<cond>} Rn, Operand2 • Examples: • CMP r0, r1 • TSTEQ r2, #5

  7. Logical Operations • Operations are: • AND operand1 AND operand2 • EOR operand1 EOR operand2 • ORR operand1 OR operand2 • BIC operand1 AND NOT operand2 [ie bit clear] • Syntax: • <Operation>{<cond>}{S} Rd, Rn, Operand2 • Examples: • AND r0, r1, r2 • BICEQ r2, r3, #7 • EORS r1,r3,r0

  8. Data Movement • Operations are: • MOV operand2 • MVN NOT operand2 Note that these make no use of operand1. • Syntax: • <Operation>{<cond>}{S} Rd, Operand2 • Examples: • MOV r0, r1 • MOVS r2, #10 • MVNEQ r1,#0

  9. Conditional Execution • Most instruction sets only allow branches to be executed conditionally. • However by reusing the condition evaluation hardware, ARM effectively increases number of instructions. • All instructions contain a condition field which determines whether the CPU will execute them. • Non-executed instructions soak up 1 cycle. • Still have to complete cycle so as to allow fetching and decoding of following instructions. • This removes the need for many branches, which stall the pipeline (3 cycles to refill). • Allows very dense in-line code, without branches. • The Time penalty of not executing several conditional instructions is frequently less than overhead of the branch or subroutine call that would otherwise be needed.

  10. The Condition Field 1001 = LS - C clear or Z (set unsigned lower or same) 1010 = GE - N set and V set, or N clear and V clear (>or =) 1011 = LT - N set and V clear, or N clear and V set (>) 1100 = GT - Z clear, and either N set and V set, or N clear and V set (>) 1101 = LE - Z set, or N set and V clear,or N clear and V set (<, or =) 1110 = AL - always 1111 = NV - reserved. 28 24 20 16 4 0 31 12 8 Cond 0000 = EQ - Z set (equal) 0001 = NE - Z clear (not equal) 0010 = HS / CS - C set (unsigned higher or same) 0011 = LO / CC - C clear (unsigned lower) 0100 = MI -N set (negative) 0101 = PL - N clear (positive or zero) 0110 = VS - V set (overflow) 0111 = VC - V clear (no overflow) 1000 = HI - C set and Z clear (unsigned higher)

  11. Using and updating the Condition Field • To execute an instruction conditionally, simply postfix it with the appropriate condition: • For example an add instruction takes the form: • ADD r0,r1,r2 ; r0 = r1 + r2 (ADDAL) • To execute this only if the zero flag is set: • ADDEQ r0,r1,r2 ; If zero flag set then… ; ... r0 = r1 + r2 • By default, data processing operations do not affect the condition flags (apart from the comparisons where this is the only effect). To cause the condition flags to be updated, the S bit of the instruction needs to be set by postfixing the instruction (and any condition code) with an “S”. • For example to add two numbers and set the condition flags: • ADDS r0,r1,r2; r0 = r1 + r2 ; ... and set flags

  12. Conditional Execution cont. • Check the conditional field of CPSR and the conditional field of current instruction. • If the condition matches, current instruction is executed; otherwise, current instruction execution is aborted.

  13. Conditional Execution cont. • Reducing the number of branches • MOVS r0, r1, LSR #1 ; C(flag) := r1[0] • MOVCC r0, #10 ; if C=0, then r0 := 10 • MOVCS r0, #11 ; if C=1, then r0 := 11 • MOVS r0, r4 ; if r4==0 then r0 := 0 • MOVNE r0, #1 ; else r0 := 1

  14. ARM instruction set • ARM versions. • ARM assembly language. • ARM programming model. • ARM memory organization. • ARM data operations. • ARM flow of control.

  15. ARM versions • ARM architecture has been extended over several versions. • We will concentrate on ARM7.

  16. ARM assembly language • Fairly standard assembly language: LDR r0,[r8] ; a comment label ADD r4,r0,r1

  17. N Z C V ARM programming model r0 r8 r1 r9 0 31 r2 r10 CPSR r3 r11 r4 r12 r5 r13 r6 r14 r7 r15 (PC)

  18. Endianness • Relationship between bit and byte/word ordering defines endianness: bit 31 bit 0 bit 0 bit 31 byte 3 byte 2 byte 1 byte 0 byte 0 byte 1 byte 2 byte 3 little-endian big-endian

  19. ARM data types • Word is 32 bits long. • Word can be divided into four 8-bit bytes. • ARM addresses cam be 32 bits long. • Address refers to byte. • Address 4 starts at byte 4. • Can be configured at power-up as either little- or bit-endian mode.

  20. ARM status bits • Every arithmetic, logical, or shifting operation sets CPSR bits: • N (negative), Z (zero), C (carry), V (overflow). • Examples: • -1 + 1 = 0: NZCV = 0110. • 231-1+1 = -231: NZCV = 0101.

  21. ARM data instructions • Basic format: ADD r0,r1,r2 • Computes r1+r2, stores in r0. • Immediate operand: ADD r0,r1,#2 • Computes r1+2, stores in r0.

  22. ADD, ADC : add (w. carry) SUB, SBC : subtract (w. carry) RSB, RSC : reverse subtract (w. carry) MUL, MLA : multiply (and accumulate) AND, ORR, EOR BIC : bit clear LSL, LSR : logical shift left/right ASL, ASR : arithmetic shift left/right ROR : rotate right RRX : rotate right extended with C ARM data instructions

  23. Data operation varieties • Logical shift: • fills with zeroes. • Arithmetic shift: • fills with ones. • RRX performs 33-bit rotate, including C bit from CPSR above sign bit.

  24. ARM comparison instructions • CMP : compare • CMN : negated compare • TST : bit-wise test • TEQ : bit-wise negated test • These instructions set only the NZCV bits of CPSR.

  25. ARM move instructions • MOV, MVN : move (negated) MOV r0, r1 ; sets r0 to r1

  26. ARM load/store instructions • LDR, LDRH, LDRB : load (half-word, byte) • STR, STRH, STRB : store (half-word, byte) • Addressing modes: • register indirect : LDR r0,[r1] • with second register : LDR r0,[r1,-r2] • with constant : LDR r0,[r1,#4]

  27. ARM ADR pseudo-op • Cannot refer to an address directly in an instruction. • Generate value by performing arithmetic on PC. • ADR pseudo-op generates instruction required to calculate address: ADR r1,FOO

  28. Example: C assignments • C: x = (a + b) - c; • Assembler: ADR r4,a ; get address for a LDR r0,[r4] ; get value of a ADR r4,b ; get address for b, reusing r4 LDR r1,[r4] ; get value of b ADD r3,r0,r1 ; compute a+b ADR r4,c ; get address for c LDR r2[r4] ; get value of c

  29. C assignment, cont’d. SUB r3,r3,r2 ; complete computation of x ADR r4,x ; get address for x STR r3[r4] ; store value of x

  30. Example: C assignment • C: y = a*(b+c); • Assembler: ADR r4,b ; get address for b LDR r0,[r4] ; get value of b ADR r4,c ; get address for c LDR r1,[r4] ; get value of c ADD r2,r0,r1 ; compute partial result ADR r4,a ; get address for a LDR r0,[r4] ; get value of a

  31. C assignment, cont’d. MUL r2,r2,r0 ; compute final value for y ADR r4,y ; get address for y STR r2,[r4] ; store y

  32. Example: C assignment • C: z = (a << 2) | (b & 15); • Assembler: ADR r4,a ; get address for a LDR r0,[r4] ; get value of a MOV r0,r0,LSL 2 ; perform shift ADR r4,b ; get address for b LDR r1,[r4] ; get value of b AND r1,r1,#15 ; perform AND ORR r1,r0,r1 ; perform OR

  33. C assignment, cont’d. ADR r4,z ; get address for z STR r1,[r4] ; store value for z

  34. Additional addressing modes • Base-plus-offset addressing: LDR r0,[r1,#16] • Loads from location r1+16 • Auto-indexing increments base register: LDR r0,[r1,#16]! • Post-indexing fetches, then does offset: LDR r0,[r1],#16 • Loads r0 from r1, then adds 16 to r1.

  35. ARM flow of control • All operations can be performed conditionally, testing CPSR: • EQ, NE, CS, CC, MI, PL, VS, VC, HI, LS, GE, LT, GT, LE, AL, NV • Branch operation: B #100 • Can be performed conditionally.

  36. Example: if statement • C: if (a > b) { x = 5; y = c + d; } else x = c - d; • Assembler: ; compute and test condition ADR r4,a ; get address for a LDR r0,[r4] ; get value of a ADR r4,b ; get address for b LDR r1,[r4] ; get value for b CMP r0,r1 ; compare a < b BGE fblock ; if a >= b, branch to false block

  37. If statement, cont’d. ; true block MOV r0,#5 ; generate value for x ADR r4,x ; get address for x STR r0,[r4] ; store x ADR r4,c ; get address for c LDR r0,[r4] ; get value of c ADR r4,d ; get address for d LDR r1,[r4] ; get value of d ADD r0,r0,r1 ; compute y ADR r4,y ; get address for y STR r0,[r4] ; store y B after ; branch around false block

  38. If statement, cont’d. ; false block fblock ADR r4,c ; get address for c LDR r0,[r4] ; get value of c ADR r4,d ; get address for d LDR r1,[r4] ; get value for d SUB r0,r0,r1 ; compute a-b ADR r4,x ; get address for x STR r0,[r4] ; store value of x after ...

  39. Conditional Instruction Use ; Compute and test the condition ADR r4, a ; get address for a LDR r0, [r4] ; get value of a ADR r4, b ; get address for b LDR r1, [r4] ; get value of b CMP r0, r1 ; compare a < b ; Notice we don't need a branch here

  40. Example: Conditional instruction implementation ; true block MOVLT r0,#5 ; generate value for x ADRLT r4,x ; get address for x STRLT r0,[r4] ; store x ADRLT r4,c ; get address for c LDRLT r0,[r4] ; get value of c ADRLT r4,d ; get address for d LDRLT r1,[r4] ; get value of d ADDLT r0,r0,r1 ; compute y ADRLT r4,y ; get address for y STRLT r0,[r4] ; store y

  41. Conditional instruction implementation, cont’d. ; false block ADRGE r4,c ; get address for c LDRGE r0,[r4] ; get value of c ADRGE r4,d ; get address for d LDRGE r1,[r4] ; get value for d SUBGE r0,r0,r1 ; compute a-b ADRGE r4,x ; get address for x STRGE r0,[r4] ; store value of x

  42. Example: switch statement • C: switch (test) { case 0: … break; case 1: … } • Assembler: ADR r2,test ; get address for test LDR r0,[r2] ; load value for test ADR r1,SwitchTable ; load address for switch table LDR r1,[r1,r0,LSL #2] ; index switch table SwitchTable DCD case0 DCD case1 ; DCD directive instructs the assembler ; to reserve a word of store and ; initialize to the right side value ...

  43. Example: FIR filter • C: for (i=0, f=0; i<N; i++) f = f + c[i]*x[i]; • Assembler ; loop initiation code MOV r0,#0 ; use r0 for I MOV r8,#0 ; use separate index for arrays ADR r2,N ; get address for N LDR r1,[r2] ; get value of N MOV r2,#0 ; use r2 for f

  44. FIR filter, cont’.d ADR r3,c ; load r3 with base of c ADR r5,x ; load r5 with base of x ; loop body loop LDR r4,[r3,r8] ; get c[i] LDR r6,[r5,r8] ; get x[i] MUL r4,r4,r6 ; compute c[i]*x[i] ADD r2,r2,r4 ; add into running sum ADD r8,r8,#4 ; add one word offset to array index ADD r0,r0,#1 ; add 1 to i CMP r0,r1 ; exit? BLT loop ; if i < N, continue

  45. ARM subroutine linkage • Branch and link instruction: BL foo • Copies current PC to r14. • To return from subroutine: MOV r15,r14 Build a stack for nested calls and to pass parameters.

  46. Nested subroutine calls • void f1 (int a) { f2(a) } • Nesting/recursion requires coding convention: f1 LDR r0,[r13] ; load arg into r0 from stack ; call f2() STR r13!,[r14] ; store f1’s return adrs STR r13!,[r0] ; store arg to f2 on stack BL f2 ; branch and link to f2 ; return from f1() SUB r13,#4 ; pop f2’s arg off stack LDR r13!,r15 ; restore register and return

  47. Summary • Load/store architecture • Most instructions are RISCy, operate in single cycle. • Some multi-register operations take longer. • All instructions can be executed conditionally.

More Related