Computation for Physics 計算物理概論

Computation for Physics: Digital Data Representation

Numeral systems

Bit = binary digit • A bit takes the value 0 or 1 • Symbol: b or bit • Byte = eight bits • Symbol: B • K = kilo = 1000 = 10^3 or 1024 = 2^10 • M = mega = 10^6 or 1,048,576 = 2^20 • G = giga = 10^9 or 2^30 • T = tera = 10^12 or 2^40 • KB = kilobyte, MB = megabyte, GB = gigabyte

Integer

Unsigned integer • Map N bits to the integers 0, 1, …, 2^N−1, giving 2^N possible numbers • Unsigned byte = 8 bits • 0, …, 255 • Unsigned short integer = 16 bits • 0, …, 65,535 • Unsigned integer = 32 bits • 0, …, 4,294,967,295 • Unsigned long integer = 64 bits • 0, …, 18,446,744,073,709,551,615
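The ranges listed above follow directly from the 2^N rule. A minimal Python sketch that reproduces them (the helper name `unsigned_range` is made up for this illustration):

```python
def unsigned_range(n_bits):
    """Return (smallest, largest) value of an n_bits-wide unsigned integer."""
    return 0, 2**n_bits - 1


for n in (8, 16, 32, 64):
    lo, hi = unsigned_range(n)
    print(f"{n:2d} bits: {lo} .. {hi:,}")
```

Python integers have arbitrary precision, so the 64-bit limit is computed exactly here; in a fixed-width language the same expression would itself overflow.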

Most and least significant bit of an 8-bit number • Most significant bit = msb (leftmost, weight 2^7) • Least significant bit = lsb (rightmost, weight 2^0)

Signed integer • How can negative numbers be represented in N bits without a “−” sign? • Sign-and-magnitude method • Use the msb as the sign and the remaining N−1 bits as the magnitude • Two representations for zero • One’s complement • Use the bitwise complement as the arithmetic negative • Two representations for zero • Two’s complement • Represent −x by the two’s complement of |x| (bitwise complement plus one) • One representation for zero
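To compare the three schemes side by side, the sketch below encodes the same small integer in each of them. The function names and the default width of 8 bits are assumptions made for the example; inputs are assumed to fit in the chosen width.

```python
def sign_magnitude(x, n=8):
    """msb holds the sign, the remaining n-1 bits hold |x|."""
    sign = 1 if x < 0 else 0
    return (sign << (n - 1)) | abs(x)


def ones_complement(x, n=8):
    """Negative numbers are the bitwise complement of |x|."""
    mask = (1 << n) - 1
    return x if x >= 0 else (~abs(x)) & mask


def twos_complement(x, n=8):
    """Negative numbers are the bitwise complement of |x|, plus one."""
    return x & ((1 << n) - 1)


for encode in (sign_magnitude, ones_complement, twos_complement):
    print(f"{encode.__name__:16s} +5 = {encode(5):08b}   -5 = {encode(-5):08b}")
```

For −5 this gives 10000101, 11111010 and 11111011 respectively, while +5 is 00000101 in all three.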

Sign-and-magnitude [Figure: bit layout with a sign field and a magnitude field, followed by four worked examples]

Sign-and-magnitude • Addition of x and y • If x and y have the same sign → sgn(x)·(|x|+|y|) • If x and y have different signs • If |x|>|y| → sgn(x)·(|x|−|y|) • If |y|>|x| → sgn(y)·(|y|−|x|) • Subtraction of y from x = addition of x and (−y) • If the result overflows, return an error message
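The addition rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a hardware description: numbers are passed as hypothetical `(sign, magnitude)` tuples with sign 0 for + and 1 for −, and the word width `n` defaults to 8 bits.

```python
def sm_add(x, y, n=8):
    """Add two sign-and-magnitude numbers given as (sign, magnitude) tuples."""
    limit = 1 << (n - 1)          # magnitudes must stay below 2**(n-1)
    (sx, mx), (sy, my) = x, y
    if sx == sy:                  # same sign: add magnitudes, keep the sign
        m = mx + my
        if m >= limit:
            raise OverflowError("magnitude does not fit in n-1 bits")
        return (sx, m)
    if mx >= my:                  # different signs: subtract the smaller magnitude
        return (sx, mx - my)
    return (sy, my - mx)


print(sm_add((0, 5), (1, 3)))     # +5 + (-3)
print(sm_add((1, 5), (0, 3)))     # -5 + (+3)
```

When the magnitudes are equal and the signs differ, this sketch returns a zero that carries the sign of the first operand, which shows how the second representation of zero can arise.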

One’s complement [Figure: four worked examples]

One’s complement

One’s complement: addition

One’s complement: end-around carry

One’s complement: negative zero
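A minimal sketch of one's complement addition with the end-around carry, operating on raw n-bit patterns. The function name `oc_add` and the 8-bit default are assumptions for the example.

```python
def oc_add(a, b, n=8):
    """Add two n-bit one's complement bit patterns."""
    mask = (1 << n) - 1
    s = a + b
    if s > mask:                  # a carry left the msb ...
        s = (s & mask) + 1        # ... so add it back at the lsb (end-around carry)
    return s & mask


plus5, minus3, minus5 = 0b00000101, 0b11111100, 0b11111010
print(f"{oc_add(plus5, minus3):08b}")   # +5 + (-3)
print(f"{oc_add(plus5, minus5):08b}")   # +5 + (-5)
```

The first sum needs the end-around carry and yields 00000010 (+2). The second yields 11111111, the all-ones pattern that one's complement interprets as negative zero.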

Two’s complement [Figure: five worked examples]

Two’s complement

Two’s complement: addition

Two’s complement: addition (ignore the carry out of the msb)

Two’s complement: addition

Two’s complement: addition (ignore the carry out of the msb)

Two’s complement: addition • Positive + positive = negative! • Overflow → error message
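The two behaviors named on these slides, ignoring the carry out of the msb and flagging overflow when two operands of the same sign give a result of the opposite sign, can be sketched as follows. The names `tc_add` and `to_int` and the 8-bit width are assumptions for the example.

```python
def tc_add(a, b, n=8):
    """Add two n-bit two's complement bit patterns, raising on overflow."""
    mask = (1 << n) - 1
    msb = 1 << (n - 1)
    s = (a + b) & mask            # the carry out of the msb is simply ignored
    # Overflow: both operands share a sign, but the result has the other sign.
    if (a & msb) == (b & msb) and (s & msb) != (a & msb):
        raise OverflowError("result does not fit in n bits")
    return s


def to_int(bits, n=8):
    """Interpret an n-bit pattern as a signed two's complement integer."""
    return bits - (1 << n) if bits & (1 << (n - 1)) else bits


print(to_int(tc_add(0b00000101, 0b11111101)))   # +5 + (-3)
print(to_int(tc_add(0b11111011, 0b11111101)))   # -5 + (-3)
```

Calling `tc_add(100, 100)` raises `OverflowError`, because 200 is outside the 8-bit signed range of −128 to 127 and the raw sum would read as a negative number.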

Logic gates

AND Gate

OR Gate

NOT Gate

NAND Gate

NOR Gate

XOR Gate

XNOR Gate

Adder • Half adder • Full adder
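The gates and adders named on these slides can be modeled on single bits (0 or 1). The sketch below uses the standard textbook construction of a full adder from two half adders and an OR gate; the Python function names are chosen for this illustration.

```python
def AND(a, b):  return a & b
def OR(a, b):   return a | b
def NOT(a):     return 1 - a
def NAND(a, b): return NOT(AND(a, b))
def NOR(a, b):  return NOT(OR(a, b))
def XOR(a, b):  return a ^ b
def XNOR(a, b): return NOT(XOR(a, b))


def half_adder(a, b):
    """Add two bits; return (sum, carry)."""
    return XOR(a, b), AND(a, b)


def full_adder(a, b, carry_in):
    """Add two bits plus an incoming carry; return (sum, carry_out)."""
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, OR(c1, c2)


# Print the full adder truth table.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            print(a, b, c, "->", full_adder(a, b, c))
```

Chaining n full adders, each passing its carry to the next, gives an n-bit ripple-carry adder.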

Real number

Radix point • Base-10 notation → decimal point • Base-2 notation → binary point

Scientific notation

Floating-point representation • x = ± significand × base^exponent • exponent • significand (mantissa)
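To see the significand and exponent of an actual binary floating-point number, Python's `math.frexp` can be used. It returns a significand in [0.5, 1), so the sketch below rescales it to the [1, 2) range used by normalized binary scientific notation. The helper name `normalized` is an assumption, and the input is assumed to be nonzero and finite.

```python
import math


def normalized(x):
    """Return (significand, exponent) with x = significand * 2**exponent
    and 1 <= |significand| < 2 (x assumed nonzero and finite)."""
    m, e = math.frexp(x)          # x = m * 2**e with 0.5 <= |m| < 1
    return m * 2, e - 1


sig, exp = normalized(6.5)
print(f"6.5 = {sig} x 2^{exp}")
```

Here 6.5 is 110.1 in binary, that is 1.101 × 2^2, so the significand is 1.625 and the exponent is 2.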

IEEE 754 • IEEE • Institute of Electrical and Electronics Engineers • IEEE 754 • IEEE Standard for Floating-Point Arithmetic • Arithmetic formats: sets of binary and decimal floating-point data, which consist of finite numbers (including signed zeros and subnormal numbers), infinities, and special "not a number" values (NaNs) • Interchange formats: encodings (bit strings) that may be used to exchange floating-point data in an efficient and compact form • Rounding rules: properties to be satisfied when rounding numbers during arithmetic and conversions • Operations: arithmetic and other operations on arithmetic formats • Exception handling: indications of exceptional conditions (such as division by zero, overflow, etc.)

IEEE 754 binary16: half precision • Sign bit: 1 bit • Exponent width: 5 bits • Significand precision: 11 bits (10 explicitly stored) • Exponent encoding • Offset = 15, Emin = −14, Emax = 15 • Minimum positive normal value = 2^-14 ≈ 6.10×10^-5 • Maximum positive value = (2−2^-10)×2^15 = 65504 • Minimum positive subnormal value = 2^-24 ≈ 5.96×10^-8

IEEE 754 binary32: single precision • Sign bit: 1 bit • Exponent width: 8 bits • Significand precision: 24 bits (23 explicitly stored) • Exponent encoding • Offset = 127, Emin = −126, Emax = 127 • Minimum positive normal value = 2^-126 ≈ 1.18×10^-38 • Maximum positive value = (2−2^-23)×2^127 ≈ 3.4×10^38 • Minimum positive subnormal value = 2^-149 ≈ 1.4×10^-45

IEEE 754 binary64: double precision • Sign bit: 1 bit • Exponent width: 11 bits • Significand precision: 53 bits (52 explicitly stored) • Exponent encoding • Offset = 1023, Emin = −1022, Emax = 1023 • Minimum positive normal value = 2^-1022 ≈ 2.2×10^-308 • Maximum positive value = (2−2^-52)×2^1023 ≈ 1.8×10^308 • Minimum positive subnormal value = 2^-1074 ≈ 4.9×10^-324
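The field layout of these formats can be inspected with Python's standard `struct` module. The sketch below splits a binary32 value into its 1-bit sign, 8-bit exponent and 23-bit stored fraction; the function name `float32_fields` is an assumption for the example.

```python
import struct


def float32_fields(x):
    """Return (sign, biased_exponent, fraction) of x stored as IEEE 754 binary32."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))  # reinterpret the 4 bytes
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction


sign, exponent, fraction = float32_fields(-0.15625)
# For a normal number: value = (-1)^sign * (1 + fraction/2^23) * 2^(exponent - 127)
value = (-1) ** sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)
print(sign, exponent, fraction, value)
```

Since −0.15625 = −1.25 × 2^-3, the biased exponent is 127 − 3 = 124 and the stored fraction encodes the 0.25 after the implicit leading 1.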

Representation error

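Rounding errors • In the base-10 system • 1/2 = 0.5 (terminating) • 1/3 = 0.333… (non-terminating) • In the base-2 system • A fraction terminates iff its denominator, in lowest terms, is a power of 2 (e.g. 1/2, 3/16) • 1/10 does not terminate in base 2, so 0.1 is stored with a representation error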
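The terminating-expansion rule above is easy to test, and its consequence for binary floating point is visible directly in Python. In this sketch the helper `terminates_in_base2` is a made-up name; `Fraction` and `Decimal` are from the standard library.

```python
from decimal import Decimal
from fractions import Fraction


def terminates_in_base2(fr):
    """True if the fraction has a finite binary expansion,
    i.e. its reduced denominator is a power of 2."""
    d = fr.denominator            # Fraction always stores lowest terms
    return (d & (d - 1)) == 0


for fr in (Fraction(1, 2), Fraction(3, 16), Fraction(1, 3), Fraction(1, 10)):
    print(fr, terminates_in_base2(fr))

print(Decimal(0.1))               # the exact value actually stored for 0.1
print(0.1 + 0.2 == 0.3)           # False: each operand carries its own error
```

`Fraction(0.5)` equals 1/2 exactly, while `Fraction(0.1)` is not 1/10, which is the representation error in concrete form.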

IEEE rounding modes • Truncation (round toward zero): • Keep the desired number of digits and remove all less-significant digits. • 0.142857 ≈ 0.142 (all digits after the third removed). • Round to nearest, ties to even (default): • Round to the nearest valid representation. Break ties by rounding to an even digit. • +23.5 → +24, +24.5 → +24; −23.5 → −24, −24.5 → −24 (symmetric between + and − numbers) • Round to nearest, ties away from zero: • Round to the nearest valid representation. Break ties by rounding away from zero. • +23.5 → +24, +24.5 → +25; −23.5 → −24, −24.5 → −25 (symmetric between + and − numbers) • Round toward −∞: • Round to a value less than or equal to the original number. If the original number is positive, this is equivalent to truncation. • Round toward +∞: • Round to a value greater than or equal to the original number. If the original number is negative, this is equivalent to truncation.
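The five rules can be tried out with Python's `decimal` module, whose rounding constants correspond to them. Note the assumption: `decimal` works in base 10, so this illustrates the rounding rules themselves, not the binary hardware behavior. The helper `round_to_int` and the `MODES` table are names invented for the example.

```python
from decimal import (Decimal, ROUND_CEILING, ROUND_DOWN, ROUND_FLOOR,
                     ROUND_HALF_EVEN, ROUND_HALF_UP)

MODES = {
    "toward zero (truncation)": ROUND_DOWN,
    "nearest, ties to even": ROUND_HALF_EVEN,
    "nearest, ties away from zero": ROUND_HALF_UP,
    "toward -infinity": ROUND_FLOOR,
    "toward +infinity": ROUND_CEILING,
}


def round_to_int(text, mode):
    """Round the decimal string `text` to an integer using the given mode."""
    return int(Decimal(text).quantize(Decimal("1"), rounding=mode))


for name, mode in MODES.items():
    results = [round_to_int(t, mode) for t in ("23.5", "24.5", "-23.5", "-24.5")]
    print(f"{name:30s} {results}")
```

Only the two round-to-nearest modes treat positive and negative inputs symmetrically; the directed modes toward −∞ and +∞ do not.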

IEEE 754 special values • Positive infinity • Negative infinity • (Positive zero = ordinary zero) • Negative zero: −0 • NaNs: “not a number” values • Subnormal numbers

Signed zero • Sign=0+ or 1- • Exponent=0 • Significand(Mantissa)=0 • Arithmetic • , , • , • , • NaN, NaN

Signed infinity • Sign=0+ or 1- • Exponent=maximum value • 111112 • FFH • 7FF16 • Significand(Mantissa)=0

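NaNs • Sign bit = 0 or 1 • Two kinds: quiet and signaling • Exponent field = maximum value (all ones) • 11111₂ (binary16) • FF₁₆ (binary32) • 7FF₁₆ (binary64) • Significand (mantissa) field ≠ 0 • Creation • 0/0 = NaN, ∞/∞ = NaN, 0×∞ = NaN • ∞+(−∞) = NaN, ∞−∞ = NaN • 0^0, 1^∞, ∞^0 = 1 or NaN (depending on the power function) • Square root or logarithm of a negative number • Inverse sine or cosine of a number with absolute value greater than 1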
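Some of the ways a NaN arises can be reproduced with Python floats. This is a sketch of Python's behavior specifically: plain arithmetic on infinities returns NaN, while `math.sqrt` and `math.asin` raise `ValueError` on invalid arguments instead of returning NaN, and the `**` operator returns 1 for the three indeterminate powers.

```python
import math

inf = math.inf

nan_from_sub = inf - inf          # infinity minus infinity
nan_from_mul = inf * 0.0          # infinity times zero
nan_from_div = inf / inf          # infinity divided by infinity

print(math.isnan(nan_from_sub), math.isnan(nan_from_mul), math.isnan(nan_from_div))
print(nan_from_sub == nan_from_sub)          # False: a NaN is not equal to itself

# Indeterminate powers: Python's float power returns 1.0 for all three.
powers = (0.0 ** 0.0, 1.0 ** inf, inf ** 0.0)
print(powers)

try:
    math.sqrt(-1.0)
except ValueError as err:
    print("math.sqrt(-1.0) raised:", err)
```

Because a NaN compares unequal to everything including itself, `math.isnan` is the reliable way to test for one.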

Subnormal numbers • Sign=0+ or 1- • Exponent=0 • 000002 • 00H • 00016 • Significand(Mantissa)≠0

Floating-point addition (keeping 7 significant digits) • x = 123456.7 = 1.234567 × 10^5 • y = 101.7654 = 1.017654 × 10^2 = 0.001017654 × 10^5 • x+y = (1.234567+0.001017654) × 10^5 • x+y = 1.235584654 × 10^5 • x+y ≈ 1.235585 × 10^5 • Round-off error!

Floating-point addition of a small number (keeping 7 significant digits) • x = 1.234567 × 10^5 • y = 9.876543 × 10^-3 = 0.00000009876543 × 10^5 • x+y = (1.234567+0.00000009876543) × 10^5 • x+y = 1.23456709876543 × 10^5 • x+y ≈ 1.234567 × 10^5 = x • Round-off error: y is lost entirely!
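Both worked additions can be reproduced with a 7-significant-digit decimal arithmetic, which is what the slides assume. The sketch uses Python's `decimal.Context` with `prec=7`; the variable names are chosen for this illustration. The last lines show the same absorption effect in ordinary binary64 floats.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

seven_digits = Context(prec=7, rounding=ROUND_HALF_EVEN)

x = Decimal("123456.7")
y = Decimal("101.7654")
tiny = Decimal("0.009876543")

exact_sum = x + y                          # default context keeps 28 digits
rounded_sum = seven_digits.add(x, y)       # result rounded to 7 significant digits
absorbed = seven_digits.add(x, tiny)       # the small addend is rounded away

print(exact_sum, rounded_sum)
print(absorbed, absorbed == x)

# The same effect with binary64 floats: 1.0 is below half the spacing at 1e16.
print(1e16 + 1.0 == 1e16)
```

When many small terms are added to a large one, this loss accumulates, which is why summation order matters in numerical work.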