Data Representation 3 • This week • Recap on floating-point numbers • ASCII • Unicode
Floating-point numbers • Often we want to represent very small or very large numbers, or numbers with fractional parts, for example 33550000 or 0.00000001451. • One way of doing this is scientific notation, where a number is split into two parts: a number with a decimal point within it (called the mantissa) and a power of 10 (called the exponent).
Fixed notation    Scientific notation    Mantissa    Exponent
33550000          0.3355 x 10^8          0.3355      8
0.00000001451     0.1451 x 10^-7         0.1451      -7
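The split shown in the table can be reproduced with a short Python sketch. The function name `split_decimal` is made up for this illustration. It assumes a non-zero number given as text, follows the slide's convention of a mantissa between 0.1 and 1, and uses the standard `decimal` module so that the decimal digits are handled exactly.

```python
from decimal import Decimal


def split_decimal(text):
    """Split a non-zero decimal number into (mantissa, exponent) so that
    number == mantissa * 10**exponent and 0.1 <= |mantissa| < 1."""
    number = Decimal(text)
    if number == 0:
        raise ValueError("zero has no normalised form")
    # adjusted() is the power of ten of the leading digit; adding 1 puts
    # that digit just after the decimal point, as in the table.
    exponent = number.adjusted() + 1
    mantissa = number.scaleb(-exponent).normalize()
    return mantissa, exponent


for text in ("33550000", "0.00000001451"):
    mantissa, exponent = split_decimal(text)
    print(f"{text} = {mantissa} x 10^{exponent}")
```

Passing the numbers as strings avoids any rounding before the split is made.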
The decimal number 5.625 can be represented in binary as 101.101. Using the mantissa and exponent idea, it can also be written as 1.01101 x 2^2 (normalised), where the exponent shows the final position of the binary point relative to its current position. • Because the binary point moves depending on the magnitude of the exponent, this is often referred to as a floating-point representation.
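The 5.625 example can be checked with a small Python sketch. The helper names `to_binary` and `normalise` are hypothetical and only handle non-negative (for `normalise`, non-zero) values. `math.frexp` returns a mantissa between 0.5 and 1, so the sketch rescales it to the 1.xxx form used above.

```python
import math


def to_binary(x, max_frac_bits=16):
    """Write a non-negative number in binary, e.g. 5.625 -> '101.101'."""
    if x < 0:
        raise ValueError("this sketch only handles non-negative numbers")
    whole = int(x)
    frac = x - whole
    bits = []
    # Repeatedly double the fraction; the whole part that appears each
    # time is the next binary digit after the point.
    while frac and len(bits) < max_frac_bits:
        frac *= 2
        bit = int(frac)
        bits.append(str(bit))
        frac -= bit
    return bin(whole)[2:] + ("." + "".join(bits) if bits else "")


def normalise(x):
    """Return (mantissa, exponent) with x == mantissa * 2**exponent
    and 1 <= mantissa < 2 (for non-zero x)."""
    m, e = math.frexp(x)  # x == m * 2**e with 0.5 <= |m| < 1
    return m * 2, e - 1


mantissa, exponent = normalise(5.625)
print(to_binary(5.625))                          # binary form of 5.625
print(f"{to_binary(mantissa)} x 2^{exponent}")   # normalised form
```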
Features of floating-point representation • Gives a wide range of numbers • It is not precise: many values can only be stored approximately • Precision and range can be improved by using more bits (64 bits in double precision): • Bit 63 for the sign • Bits 52-62 for the exponent • Bits 0-51 for the mantissa
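The 64-bit layout listed above can be inspected with a short Python sketch. The function name `double_fields` is hypothetical. The exponent bias of 1023 is not stated on the slide; it is taken here from the IEEE 754 double-precision format. The last line illustrates the lack of precision.

```python
import struct


def double_fields(x):
    """Return the (sign, exponent, mantissa) bit fields of a 64-bit double."""
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                   # bit 63
    exponent = (bits >> 52) & 0x7FF     # bits 52-62 (11 bits, stored with a bias)
    mantissa = bits & ((1 << 52) - 1)   # bits 0-51 (52 bits after the binary point)
    return sign, exponent, mantissa


sign, exponent, mantissa = double_fields(5.625)
# Subtract the bias (1023, assumed from IEEE 754) to get the true exponent.
print(sign, exponent - 1023, format(mantissa, "052b"))

# Not every decimal fraction can be stored exactly, so this prints False.
print(0.1 + 0.2 == 0.3)
```

For 5.625 the stored mantissa bits begin 01101, the digits after the binary point in 1.01101; the leading 1 is not stored.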
Summary on floating-point numbers • Use of mantissa and exponent in floating-point numbers. • Advantages and disadvantages of floating-point numbers. • IEEE standard (IEEE 754)
Text Representation • Up to now we have looked at storing numbers; can we store text? • If we could not, then this text would not be on the screen. • So we need a way of storing letters and other characters.
ASCII • The most common text representation. • Each character has a code. • Special characters such as space, return, etc. also have codes. • American Standard Code for Information Interchange. • Alternative: EBCDIC, which is not widely used.
ASCII (columns: high hexadecimal digit; rows: low hexadecimal digit)
     0     1     2     3     4     5     6     7
0    NUL   DLE   SP    0     @     P     `     p
1    SOH   DC1   !     1     A     Q     a     q
2    STX   DC2   "     2     B     R     b     r
3    ETX   DC3   #     3     C     S     c     s
4    EOT   DC4   $     4     D     T     d     t
5    ENQ   NAK   %     5     E     U     e     u
6    ACK   SYN   &     6     F     V     f     v
7    BEL   ETB   '     7     G     W     g     w
8    BS    CAN   (     8     H     X     h     x
9    HT    EM    )     9     I     Y     i     y
A    LF    SUB   *     :     J     Z     j     z
B    VT    ESC   +     ;     K     [     k     {
C    FF    FS    ,     <     L     \     l     |
D    CR    GS    -     =     M     ]     m     }
E    SO    RS    .     >     N     ^     n     ~
F    SI    US    /     ?     O     _     o     DEL
ASCII • So what is the code for A? • Go to the table and find A: it is in the column marked 4 and the row marked 1. • This can be used to give a hexadecimal number. • The column gives the high hexadecimal digit. • The row gives the low hexadecimal digit. • Therefore A is 41 in hexadecimal (41₁₆). • What is this code as a decimal number: 41₁₀ or 65₁₀?
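The table lookup can be sketched in Python. The function name `table_position` is made up for this example. It splits a character's code into the column (high hexadecimal digit) and row (low hexadecimal digit) described above, and the last line converts the hexadecimal code to decimal.

```python
def table_position(ch):
    """Return (column, row) of a 7-bit ASCII character in the table."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    column = code >> 4    # high hexadecimal digit
    row = code & 0xF      # low hexadecimal digit
    return column, row


column, row = table_position("A")
print(f"column {column:X}, row {row:X}")

# The same code written in hexadecimal and in decimal.
print(hex(ord("A")), int("41", 16))
```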
UNICODE • ASCII uses 7 bits (the 8th bit is often used to help check that the data was transferred correctly). • Therefore, it is limited to a small character set.
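One common way of using the 8th bit as a check is a parity bit. The slide does not say which scheme is meant, so the even-parity scheme and the function names below are assumptions made purely for illustration.

```python
def with_even_parity(ch):
    """Return the 8-bit value for a 7-bit ASCII character, with the top
    bit set so that the total number of 1 bits is even (assumed scheme)."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    parity = bin(code).count("1") % 2   # 1 if the 7-bit code has an odd number of 1s
    return code | (parity << 7)


def parity_ok(byte):
    """Check that a received byte still has an even number of 1 bits."""
    return bin(byte).count("1") % 2 == 0


sent = with_even_parity("C")
print(format(sent, "08b"), parity_ok(sent))

# Flipping any single bit in transit makes the check fail.
corrupted = sent ^ 0b00000100
print(format(corrupted, "08b"), parity_ok(corrupted))
```

A single parity bit only detects an odd number of flipped bits; it cannot say which bit changed.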
UNICODE • UNICODE was designed as a 16-bit system to meet the requirements of modern systems, which need different character sets for different languages and symbols (such as accented characters). • Java and Windows XP support UNICODE.
UNICODE • Every character has a unique 16-bit value called a code point. The first 128 code points (0-127) map on to ASCII. • No characters are made up of multiple characters, such as \n. • This makes programming easier: all characters take the same amount of memory. • A 16-bit system has 65 536 different code points, but all the world's languages together use about 200 000 characters, so 16 bits cannot cover every symbol.
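A few of these points can be checked in Python, whose built-in `ord` returns a character's code point. The helper name `code_point` is hypothetical, and the use of the `utf-16-be` codec to stand in for a 16-bit representation is an assumption of this sketch.

```python
def code_point(ch):
    """Format a character's code point in the usual U+XXXX style."""
    return f"U+{ord(ch):04X}"


samples = [
    "A",        # same value as in ASCII
    "\u00e9",   # e with an acute accent
    "\u20ac",   # euro sign
]
for ch in samples:
    # Each of these characters takes 2 bytes in a 16-bit encoding.
    print(code_point(ch), len(ch.encode("utf-16-be")), "bytes")

# Number of distinct codes available with 7 bits and with 16 bits.
print(2 ** 7, 2 ** 16)
```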
Revision activities • What is the ASCII value of the following: • a space • the letter b • How many different codes are possible with the following: • ASCII • Unicode
Further reading on text • Chalk et al (2004) pages 11-12 and 257-259. • Tanenbaum (2005) pages 127-131