Lesson 4: Data manipulation instructions and machine models

"Everything that can be invented has been invented." (Charles H. Duell, Commissioner, U.S. Office of Patents, 1899)

Von Neumann machine: physical view
[Figure: a bus with address, data, control and clock lines connects the CPU (control unit, cache, ALU, registers), the memory (RAM cells) and the input/output devices.]

Overview
• Arithmetic instructions
• Logical instructions
• Floating-point instructions
• MMX instructions
• SSE/SSE2-4 instructions
• Miscellaneous
• Machine models

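[Figure: the operator unit takes two operands O1 and O2 plus control signals, and produces a result R and a status S made up of the c-bit, s-bit, o-bit and z-bit.]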

Status bits for a 32-bit addition R = O1 + O2
O1: 00111000101001010101010100001100
O2: 01001110001010010101010101100110
R:  10000110110011101010101001110010
s-bit = most significant bit of R (here 1)
c-bit = carry out of the most significant bit (here 0)
z-bit = 1 if the result = 0 (here 0)

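Overflow
3-bit two's complement values: 100 = -4, 101 = -3, 110 = -2, 111 = -1, 000 = 0, 001 = 1, 010 = 2, 011 = 3

  010 (2)  + 011 (3)  =  101   expected 5,  read back as -3: overflow
  010 (2)  + 001 (1)  =  011   3: correct
  110 (-2) + 101 (-3) = 1011   expected -5, low 3 bits read back as 3: overflow
  110 (-2) + 111 (-1) = 1101   low 3 bits 101 = -3: correct

o-bit = carry(31) xor carry(30)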
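The status bits described above can be modelled in a few lines. This is only a sketch: the function name `add32` and the dictionary of flags are illustrative choices, not part of the slides. The o-bit is computed exactly as the slide states, as the xor of the carry out of bit 31 and the carry out of bit 30.

```python
MASK32 = 0xFFFFFFFF

def add32(o1, o2):
    """Model a 32-bit add; return (result, flags). Names are illustrative."""
    o1 &= MASK32
    o2 &= MASK32
    total = o1 + o2
    r = total & MASK32
    carry31 = total >> 32                                    # carry out of bit 31
    carry30 = ((o1 & 0x7FFFFFFF) + (o2 & 0x7FFFFFFF)) >> 31  # carry out of bit 30
    flags = {
        "s": r >> 31,            # sign: most significant bit of the result
        "o": carry31 ^ carry30,  # overflow, as defined on the slide
        "c": carry31,            # carry
        "z": int(r == 0),        # zero
    }
    return r, flags

# The operands from the status-bit slide.
r, flags = add32(0x38A5550C, 0x4E295566)
print(f"{r:032b}", flags)
```

Two positive operands produce a result with the sign bit set, so the o-bit is 1 while the c-bit stays 0.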

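Number circle
[Figure: the eight 3-bit patterns 000, 001, 010, 011, 100, 101, 110, 111 placed on a circle with the values 0, 1, 2, 3, -4, -3, -2, -1. Adding moves around the circle in one direction, subtracting in the other.]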

Saturation arithmetic
Overflow can also be handled by clamping the values at the extremes.
3-bit two's complement values: 100 = -4, 101 = -3, 110 = -2, 111 = -1, 000 = 0, 001 = 1, 010 = 2, 011 = 3

  010 (2)  + 010 (2)  =  100 (-4)   saturated result: 011 (3, max)
  110 (-2) + 101 (-3) = 1011 (3)    saturated result: 100 (-4, min)

Saturation arithmetic
[Figure: on overflow, saturation clamps the result at the extreme value, whereas modulo arithmetic wraps around.]
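The difference between wrap-around (modulo) and saturating addition can be sketched as follows. The function names and the default width of 3 bits are assumptions chosen to match the 3-bit examples on the slides.

```python
def add_modulo(a, b, bits=3):
    """Two's complement addition that wraps around on overflow."""
    mask = (1 << bits) - 1
    r = (a + b) & mask
    # Reinterpret the bit pattern as a signed value.
    return r - (1 << bits) if r >> (bits - 1) else r

def add_saturating(a, b, bits=3):
    """Addition that clamps the result at the smallest/largest value."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, a + b))

print(add_modulo(2, 2), add_saturating(2, 2))      # wraps vs. clamps at max
print(add_modulo(-2, -3), add_saturating(-2, -3))  # wraps vs. clamps at min
```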

Result widths of the basic operations
• Sum: n bit + n bit → (n+1) bit
• Difference: n bit - n bit → (n+1) bit
• Product: n bit * n bit → 2n bit
• Quotient: n bit / n bit → n bit

Addition (32 bit)
add dest, src

Memory (address: contents): 00: 00B00300, 04: 80000000, 08: 00330020, 0C: 0003300A, 10: ….
ebx = 00004000

  instruction     eax        s o c z
  (initial)       00920000   0 0 0 0
  add eax, ebx    00924000   0 0 0 0
  add eax, 10h    00924010   0 0 0 0
  add eax, [4]    80924010   1 0 0 0

All values are hexadecimal!

Subtraction (32 bit)
sub dest, src

Memory (address: contents): 00: 00B00300, 04: 80000000, 08: 00330020, 0C: 0003300A, 10: ….
ebx = 00004000

  instruction     eax        s o c z
  (initial)       00920000   0 0 0 0
  sub eax, ebx    0091C000   0 0 0 0
  sub eax, 10h    0091BFF0   0 0 0 0
  sub eax, [4]    8091BFF0   1 1 1 0

All values are hexadecimal!

Integer basic operations
  add d,s    d = d + s
  adc d,s    d = d + s + c
  sub d,s    d = d - s
  sbb d,s    d = d - s - c
  mul s      multiplication (unsigned)
  imul s     multiplication (signed)
  div s      division (unsigned)
  idiv s     division (signed)
  neg d      d = -d
  inc d      d = d + 1
  dec d      d = d - 1

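64-bit addition
The 64-bit operands are stored at 10h (low half) / 14h (high half) and 18h (low half) / 1ch (high half). The sum is written to 20h (low half) / 24h (high half). The register pair ebx:eax holds the intermediate value.

  mov eax,[10h]    ; low half of the first operand
  mov ebx,[14h]    ; high half of the first operand
  add eax,[18h]    ; add the low halves, sets the c-bit
  adc ebx,[1ch]    ; add the high halves plus the carry
  mov [20h],eax    ; store the low half of the sum
  mov [24h],ebx    ; store the high half of the sum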
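The add/adc pair above can be mimicked on two 32-bit halves. This is a minimal sketch in which the function name and argument order are invented for the illustration: the carry out of the low addition is fed into the high addition, which is what adc does.

```python
MASK32 = 0xFFFFFFFF

def add64_via_32(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values given as (low, high) 32-bit halves."""
    lo = a_lo + b_lo                     # add eax,[18h]
    carry = lo >> 32                     # the c-bit produced by add
    lo &= MASK32
    hi = (a_hi + b_hi + carry) & MASK32  # adc ebx,[1ch]
    return lo, hi

# 0x00000001_FFFFFFFF + 0x00000000_00000001 = 0x00000002_00000000
lo, hi = add64_via_32(0xFFFFFFFF, 0x00000001, 0x00000001, 0x00000000)
print(f"{hi:08X}:{lo:08X}")
```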

mul & div
  mul src     edx:eax = src * eax          (unsigned binary)
  imul src    edx:eax = src * eax          (two's complement)
  div src     eax = edx:eax / src          (unsigned binary)
              edx = edx:eax % src
Division satisfies: dividend = quotient * divisor + remainder, and remainder * dividend ≥ 0.
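The rule "remainder * dividend ≥ 0" means the quotient is truncated toward zero, which differs from Python's own `//` and `%` (these round toward minus infinity). The sketch below models that rule and the edx:eax split of a 32-bit product. The helper names `mul32` and `idiv` are chosen for illustration, and register widths and the overflow trap of the real instructions are deliberately left out.

```python
MASK32 = 0xFFFFFFFF

def mul32(eax, src):
    """Unsigned 32 x 32 -> 64 bit product, returned as (edx, eax)."""
    product = (eax & MASK32) * (src & MASK32)
    return product >> 32, product & MASK32

def idiv(dividend, divisor):
    """Signed division with the quotient truncated toward zero."""
    q = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        q = -q
    r = dividend - q * divisor  # the remainder takes the sign of the dividend
    return q, r

print(mul32(0xFFFFFFFF, 2))
print(idiv(-7, 2), divmod(-7, 2))  # truncating vs. Python's flooring division
```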

imul: 3 variants
  imul src                edx:eax = src * eax
  imul d,src              d = d * src
  imul d,src1,src2        d = src1 * src2

Product in halves, e.g. Alpha (64 bit)
• mulq a,b,c     reg[c] := (reg[a]*reg[b])<63:0>
• umulh a,b,c    reg[c] := (reg[a]*reg[b])<127:64>
<n1:n2> denotes bits n1 down to n2 (bits are numbered downward).

Division
A division by a constant can also be carried out as a multiplication, e.g. division by 15:
  X/15 = X * (1/15)
  1/15 = 0.0001 0001 0001 0001… (binary) = 0.1111… (hexadecimal)
  50/15 = 0032 (hex) * 0.1111 (hex) = 0003.5552 (hex)
  45/15 = 002D (hex) * 0.1111 (hex) = 0002.FFFD (hex)
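The two products above can be reproduced with 16-bit fixed-point arithmetic. Note that the second result, 0002.FFFD, truncates to 2 rather than 3, because 0.1111 (hex) is slightly smaller than 1/15. The sketch below also shows one common remedy, rounding the reciprocal up to 0x1112. That constant and both function names are additions for this illustration, not something taken from the slides.

```python
def div15_truncated_reciprocal(x):
    """x * 0.1111 (hex) in 16.16 fixed point, keeping the integer part."""
    return (x * 0x1111) >> 16

def div15_rounded_reciprocal(x):
    """Same idea with the reciprocal rounded up (assumed remedy, small x only)."""
    return (x * 0x1112) >> 16

print(hex(50 * 0x1111), hex(45 * 0x1111))  # the raw fixed-point products
print(div15_truncated_reciprocal(45), div15_rounded_reciprocal(45))
```

The rounded-up constant is only exact for a limited range of x. A wider range needs more fraction bits.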

Comparisons
  cmp d,s     d - s → flags
  test d,s    d 'and' s → flags
cmp: comparing values: >, =, <
test: testing bits: on, off

Conditions for natural numbers (binary)
  a    nbe   (c or z) == 0   above
  ae   nb    c == 0          above or equal
  b    nae   c == 1          below
  be   na    (c or z) == 1   below or equal

3-bit natural numbers: 000 = 0, 001 = 1, 010 = 2, 011 = 3, 100 = 4, 101 = 5, 110 = 6, 111 = 7

  010 (2) - 011 (3) = 111 (?)   c = 1
  011 (3) - 010 (2) = 001 (1)   c = 0

Conditions for integers (2's complement)
  g    nle   ((s xor o) or z) == 0   greater
  ge   nl    (s xor o) == 0          greater or equal
  l    nge   (s xor o) == 1          less
  le   ng    ((s xor o) or z) == 1   less or equal

3-bit integers: 100 = -4, 101 = -3, 110 = -2, 111 = -1, 000 = 0, 001 = 1, 010 = 2, 011 = 3

  010 (2)  - 011 (3)  = 111 (-1)   s == 1, o == 0
  011 (3)  - 010 (2)  = 001 (1)    s == 0, o == 0
  010 (2)  - 101 (-3) = 101 (?)    s == 1, o == 1
  101 (-3) - 010 (2)  = 011 (?)    s == 0, o == 1

Condition codes
  code  alias  condition               meaning            encoding
  z            z == 1                  zero               0100
  c            c == 1                  carry              0010
  o            o == 1                  overflow           0000
  p            p == 1                  parity             1010
  s            s == 1                  sign               1000
  nz           z == 0                  no zero            0101
  nc           c == 0                  no carry           0011
  no           o == 0                  no overflow        0001
  np           p == 0                  no parity          1011
  ns           s == 0                  no sign            1001
2's complement:
  g     nle    ((s xor o) or z) == 0   greater            1111
  ge    nl     (s xor o) == 0          greater or equal   1101
  l     nge    (s xor o) == 1          less               1100
  le    ng     ((s xor o) or z) == 1   less or equal      1110
  e            z == 1                  equal              0100
  ne           z == 0                  not equal          0101
Binary:
  a     nbe    (c or z) == 0           above              0111
  ae    nb     c == 0                  above or equal     0011
  b     nae    c == 1                  below              0010
  be    na     (c or z) == 1           below or equal     0110
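To see how the same flags answer both the unsigned (above/below) and the signed (greater/less) question, the sketch below computes the flags of a 3-bit `cmp d,s` and evaluates four of the conditions from the table. The function names and the 3-bit default width are assumptions chosen to match the small examples on the slides.

```python
def cmp_flags(d, s, bits=3):
    """Flags of d - s on bit patterns of the given width (model of cmp)."""
    mask = (1 << bits) - 1
    d &= mask
    s &= mask
    r = (d - s) & mask
    top = bits - 1
    sd, ss, sr = d >> top, s >> top, r >> top
    return {
        "s": sr,
        "o": int(sd != ss and sr != sd),  # signed result does not fit
        "c": int(d < s),                  # a borrow was needed
        "z": int(r == 0),
    }

def above(f):   return (f["c"] | f["z"]) == 0
def below(f):   return f["c"] == 1
def greater(f): return ((f["s"] ^ f["o"]) | f["z"]) == 0
def less(f):    return (f["s"] ^ f["o"]) == 1

f = cmp_flags(0b010, 0b101)  # unsigned: 2 vs 5, signed: 2 vs -3
print(f, "below" if below(f) else "not below", "greater" if greater(f) else "not greater")
```

The same bit patterns compare as "below" when read as natural numbers and as "greater" when read as 2's complement integers.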

Logical operations
  and d,s    d = d 'and' s
  or d,s     d = d 'or' s
  xor d,s    d = d 'xor' s
  not d      d = 'not' d

Shifts
  shl d,n       d = d << n
  shr d,n       d = d >> n
  sal d,n       arithmetic d = d << n
  sar d,n       arithmetic d = d >> n
  shld d,s,n    d = (d:s << n)<63:32>
  shrd d,s,n    d = (s:d >> n)<31:0>
  rol d,n       bit rotation to the left
  ror d,n       bit rotation to the right
  rcl d,n       extended bit rotation to the left (through the c-bit)
  rcr d,n       extended bit rotation to the right (through the c-bit)

Shift operations
  original:                  110100010010101010101010011
  shift 1 position left:     101000100101010101010100110   C=1
  shift 5 positions left:    001001010101010101001100000   C=0

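Shift operations
[Figure: SHL and SAL shift a 0 in on the right and move the bit shifted out on the left into C. SHR shifts a 0 in on the left and moves the bit shifted out on the right into C. SAR replicates the sign bit on the left and moves the bit shifted out on the right into C.]

Shift operations
[Figure: SHLD shifts D to the left, filling it from S, with the last bit shifted out going to C. SHRD shifts D to the right, filling it from S, with the last bit shifted out going to C.]

Rotation operations
[Figure: ROL and ROR rotate the bits of the operand and copy the bit that wraps around into C. RCL and RCR rotate through C, so that C acts as an extra bit of the operand.]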

Shift operations: logical
  00011 (3)  → left 1 (x 2)  → 00110 (6) → left 1 (x 2)  → 01100 (12)
  01100 (12) → right 1 (÷ 2) → 00110 (6) → right 1 (÷ 2) → 00011 (3)
left n → x 2^n, right n → ÷ 2^n

Shift operations: arithmetic
  11101 (-3)  → left 1 (x 2)  → 11010 (-6) → left 1 (x 2) → 10100 (-12)
  10100 (-12) → right 1 (÷ 2) → 11010 (-6) → right 1 (÷ 2) → 11101 (-3) → right 1 (÷ 2) → 11110 (-2) → right 1 (÷ 2) → 11111 (-1)

Exercise
  input                   000011   001011   101011
  shl 2                   001100   101100   101100
  shr 2 of that result    000011   001011   001011

  input                   000011   001011   101011
  sal 2                   001100   101100   101100
  sar 2 of that result    000011   111011   111011
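The exercise can be checked with a small model of the shift instructions on 6-bit values. The function names mirror the mnemonics, but the fixed width parameter and the omission of the c-bit are simplifications made for this sketch.

```python
def shl(x, n, bits=6):
    """Logical (and arithmetic) shift left: zeros enter on the right."""
    return (x << n) & ((1 << bits) - 1)

def shr(x, n, bits=6):
    """Logical shift right: zeros enter on the left."""
    return (x & ((1 << bits) - 1)) >> n

def sar(x, n, bits=6):
    """Arithmetic shift right: the sign bit is replicated on the left."""
    mask = (1 << bits) - 1
    x &= mask
    r = x >> n
    if x >> (bits - 1):            # negative: fill the top n bits with ones
        r |= mask & ~(mask >> n)
    return r

for value in (0b000011, 0b001011, 0b101011):
    left = shl(value, 2)
    print(f"{value:06b} {left:06b} {shr(left, 2):06b} {sar(left, 2):06b}")
```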

Example: j := i*40+10
  mov eax, [10]    ; reg[eax] := mem32[i]
  mov ebx, eax     ; reg[ebx] := 'i'
  shl ebx, 3       ; reg[ebx] := 'i'*8
  shl eax, 5       ; reg[eax] := 'i'*32
  add eax, ebx     ; reg[eax] := 'i'*8+'i'*32
  add eax, 10      ; reg[eax] := 'i'*8+'i'*32+10
  mov [14], eax    ; mem32[j] := reg[eax]
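The instruction sequence can be traced step by step in a few lines. In this sketch the variables `eax` and `ebx` stand in for the registers, and the function name is made up for the illustration. Each step is masked to 32 bits, as the hardware would do.

```python
MASK32 = 0xFFFFFFFF

def times40_plus10(i):
    """Trace of the example: i*40+10 computed with two shifts and two adds."""
    eax = i & MASK32             # mov eax, [10]
    ebx = eax                    # mov ebx, eax
    ebx = (ebx << 3) & MASK32    # shl ebx, 3   -> i*8
    eax = (eax << 5) & MASK32    # shl eax, 5   -> i*32
    eax = (eax + ebx) & MASK32   # add eax, ebx -> i*40
    eax = (eax + 10) & MASK32    # add eax, 10  -> i*40+10
    return eax                   # mov [14], eax

print(times40_plus10(7))
```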

Product by shifts
  factor   binary   -factor (2's compl.)   shifts
  x 3      011      101                    x2+x1 or x4-x1
  x 4      0100     1100                   x4
  x 5      0101     1011                   x4+x1
  x 6      0110     1010                   x4+x2 or x8-x2
  x 7      0111     1001                   x8-x1
  x 8      01000    1000                   x8
  x 9      01001    10111                  x8+x1
  x 10     01010    10110                  x8+x2
  x 11     01011    10101                  x8+x2+x1 or x16-x4-x1
  x 12     01100    10100                  x8+x4 or x16-x4

Shifts
Sometimes a combination of the two (additions and subtractions) is used:
  x 57 = x64 - x8 + x1
  x 113 = x128 - x16 + x1
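Both decompositions are easy to check numerically. The helper names below are illustrative only. Each multiplication by a power of two is written as a left shift, so every product costs two shifts, one subtraction and one addition.

```python
def times57(x):
    """x*57 as x*64 - x*8 + x."""
    return (x << 6) - (x << 3) + x

def times113(x):
    """x*113 as x*128 - x*16 + x."""
    return (x << 7) - (x << 4) + x

print(times57(3), times113(3))
```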

Bit test instructions
  bt d, o     c = d<o:o>
  bts d, o    c = d<o:o>; d<o:o> = 1
  btr d, o    c = d<o:o>; d<o:o> = 0
  btc d, o    c = d<o:o>; d<o:o> = 'not' d<o:o>
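As a sketch of what these four instructions do, each function below returns the new value of d together with the c-bit. btc is modelled as complementing the tested bit, which is what the x86 instruction does. Returning a tuple is a choice made for this illustration only.

```python
def bt(d, o):
    """Bit test: return (d unchanged, c = bit o of d)."""
    return d, (d >> o) & 1

def bts(d, o):
    """Bit test and set."""
    return d | (1 << o), (d >> o) & 1

def btr(d, o):
    """Bit test and reset."""
    return d & ~(1 << o), (d >> o) & 1

def btc(d, o):
    """Bit test and complement."""
    return d ^ (1 << o), (d >> o) & 1

print(bts(0b0100, 0), btr(0b0100, 2), btc(0b0100, 2))
```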

Bit scan instructions
  bsf d, s    d = index of the least significant 1-bit of s
  bsr d, s    d = index of the most significant 1-bit of s
d = index of the bit, with bits numbered downward.
If all bits are 0: d = undefined, z=1.
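A minimal model of the two scans, assuming bit 0 is the least significant bit. The functions return the index together with the z-bit, and `None` stands in for the undefined destination when the source is zero. Both conventions are choices made for this sketch.

```python
def bsf(s):
    """Bit scan forward: index of the least significant 1-bit, plus z."""
    if s == 0:
        return None, 1                   # destination undefined, z = 1
    return (s & -s).bit_length() - 1, 0  # s & -s isolates the lowest 1-bit

def bsr(s):
    """Bit scan reverse: index of the most significant 1-bit, plus z."""
    if s == 0:
        return None, 1
    return s.bit_length() - 1, 0

print(bsf(0b101000), bsr(0b101000), bsf(0))
```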

Set byte on condition
  set<cc> d          d = condition code
  setge al           reg[al] = ge ? 1 : 0
  seto ah            reg[ah] = o ? 1 : 0
  setno al; dec al   reg[al] = o ? 0ffh : 00h
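The last line combines two instructions to turn a condition into an all-ones or all-zeros byte mask. A short trace makes the trick visible: setno stores the inverted condition as 0 or 1, and the 8-bit decrement then maps 0 to 0ffh and 1 to 00h. The function name is invented for this sketch.

```python
def setno_then_dec(o):
    """Model of 'setno al' followed by 'dec al' for overflow bit o."""
    al = 0 if o else 1    # setno al: al = 1 when there is no overflow
    al = (al - 1) & 0xFF  # dec al: 8-bit decrement with wrap-around
    return al

print(hex(setno_then_dec(1)), hex(setno_then_dec(0)))
```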

Floating-point register set
Eight registers: st, st(1), st(2), st(3), st(4), st(5), st(6), st(7).
Each register is 80 bits wide: bit 79 = sign (s), bits 78-64 = exponent (exp), bits 63-0 = mantissa.

Data types
• FP single precision (32 bit)
• FP double precision (64 bit)
• FP double extended precision (80 bit)
• Integer word (16 bit)
• Integer doubleword (32 bit)
• Integer quadword (64 bit)
• BCD, 18 nibbles + sign byte (80 bit)

Addressing modes
Pure stack addressing
  fadd               st(1) = st+st(1); pop
Register addressing
  fadd st(i)         st = st+st(i)
  fadd st(i),st      st(i) = st+st(i)
  fadd st,st(i)      st = st(i)+st
  faddp st(i),st     st(i) = st+st(i); pop
Memory addressing
  fadd <ae>          st = st+mem[ae]

Floating-point operations
  fadd     FP addition
  fsub     difference
  fdiv     division
  fprem    remainder after division
  fmul     multiplication
  fabs     absolute value: st = abs(st)
  fchs     negation: st = - st
  fcomi    FP compare and set eflags

Mathematical functions
  fsqrt      square root
  fsin       sine
  fcos       cosine
  fptan      tangent
  fpatan     arctangent
  fyl2x      logarithm: st(1) = st(1)*log2(st); pop
  fyl2xp1    logarithm: st(1) = st(1)*log2(st+1); pop
  f2xm1      exponent: st = 2^st - 1

Load-store operations
  fbld     Load/push BCD number (ld = load)
  fbstp    Store BCD and pop
  fild     Load/push integer, 16/32/64 bit
  fist     Store integer, 16/32/64 bit
  fld      Load/push floating-point value, 32/64/80 bit
  fst      Store floating-point value, 32/64/80 bit

Load constant
  fld1      push 1.0
  fldl2t    push log2(10)
  fldl2e    push log2(e)
  fldpi     push π
  fldlg2    push log10(2)
  fldln2    push loge(2)
  fldz      push +0.0
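The load constants and fyl2x are designed to work together: pushing loge(2), then x, and executing fyl2x leaves loge(2)*log2(x) = loge(x) on top of the stack. The toy stack machine below illustrates this under simplifying assumptions. The class `X87Stack` and its methods are hypothetical, it uses Python floats instead of 80-bit registers, and it ignores the eight-register limit.

```python
import math

class X87Stack:
    """Toy model of the FP register stack; st is the last list element."""

    def __init__(self):
        self.regs = []

    def fld(self, value):
        """Load/push a floating-point value."""
        self.regs.append(float(value))

    def fldln2(self):
        """Push the constant loge(2)."""
        self.regs.append(math.log(2.0))

    def fyl2x(self):
        """st(1) = st(1) * log2(st); pop."""
        st = self.regs.pop()
        self.regs[-1] = self.regs[-1] * math.log2(st)

fpu = X87Stack()
fpu.fldln2()    # st = loge(2)
fpu.fld(10.0)   # st = 10.0, st(1) = loge(2)
fpu.fyl2x()     # st = loge(2) * log2(10) = loge(10)
ln10 = fpu.regs[-1]
print(ln10)
```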

Motivation
• Multimedia applications and the web
• Image (10 KiB)
• Sound (100 KiB)
• Video (MiB)
• Rise of 64-bit processors
• Small data elements (8 or 16 bit)
• Very many data elements
• Independent data elements

64 bit, but:
[Figure: 64-bit words that each hold one small data element, so that the remaining bytes are all 00.]
Solution:

Multimedia extensions
• Sparc: Visual Instruction Set
• PA-RISC: MAX-2
• X64: MMX, SSE
• PowerPC: Altivec
