180 likes | 200 Views
CSCE 121-200 Fall 2012. Prof. Jennifer L. Welch. Simplified Model of a Computer. How data is represented and stored How data is operated on How a program is stored How the program is executed on the data How programs are translated into machine language. How Data is Represented.
E N D
CSCE 121-200Fall 2012 Prof. Jennifer L. Welch
Simplified Model of a Computer • How data is represented and stored • How data is operated on • How a program is stored • How the program is executed on the data • How programs are translated into machine language CSCE 121-200
How Data is Represented • Fundamental unit of computer storage is a bit, data entity that can take on two values, 0 and 1 • Given a sequence of bits (01010001), what does it represent? • Depends on the code • Base 10 integer: 1,010,001 • Base 2 integer: 1*20 + 1*24 + 1*26 = 81 (in base 10) • ASCII: The letter ‘Q’ • Must know the code to decode a sequence • Other codes: • Negative numbers (e.g., 2’s complement) • Numbers with fractional parts (floating point) CSCE 121-200
How Data is Stored • Byte: a group of 8 bits; 28 = 256 possibilities • 00000000, 00000001, …, 11111110, 11111111 • Memory: long sequence of locations, each big enough to hold one byte, numbered 0, 1, 2, 3,… • Address: the number of the location • Contents of a location can change • Use consecutive locations to store longer sequences • e.g., 4 bytes = 1 word ….. 1 1 0 0 1 1 0 0 0 0 1 1 0 0 1 1 0 1 0 0 1 1 0 0 1 1 0 byte 0 byte 1 byte 2 CSCE 121-200
Limitations of Finite Data Encodings • Overflow: number is too large • suppose 1 byte stores integers in base 2, from 0 (00000000) to 255 (11111111). If the byte holds 255, then adding 1 to it results in 0, not 256 • Roundoff error: • insufficient precision (size of word): try to store 1/8, which is .001 in base 2, with only two bits • nonterminating expansions in current base: try to store 1/3 in base 10, which is .333… • nonterminating expansions in every base: irrational numbers such as π CSCE 121-200
Kinds of Storage The computer reads and writes the data stored in it. From fastest to slowest: • cache: super-fast • main memory: random access, equally fast to access any address • disk: random access, but significantly slower than main memory • tape: sequential access, significantly slower than disk CSCE 121-200
How Data is Operated On • Central processing unit (CPU) consists of: • arithmetic/logic unit (ALU): • performs operations on data (e.g., add, multiply) • registers: special memory cells, hold data used by ALU, even faster than cache • control unit: • figures out what the ALU should do next • transfers data between main memory and registers CSCE 121-200
Machine Instructions Goal: add the number stored in address 3 and the number stored in address 6; put the result in address 10. Control unit does the following: • copies data from main memory address 3 into some register, say 1: LOAD 3,1 • copies data in main memory address 6 into some register, say 4: LOAD 6,4 • tells ALU to add the contents of registers 1 and 4, and put result in some register, say 3: ADD 1,4,3 • copies data in register 3 into main memory address 10: STORE 3,10 LOAD, ADD and STORE are machine instructions. How does the control unit know which instruction is next? The program! CSCE 121-200
How a Program is Stored • Program: list of machine instructions using some agreed upon coding convention. • Example: • Program is stored same way data is stored! ADD 1 4 3 0 0 1 0 0 0 0 1 0 1 0 0 0 0 1 1 2nd operand 3rd operand 1st operand opcode CSCE 121-200
How a Program is Executed • The control unit has • instruction register: holds current instruction to be executed • program counter: holds address of next instruction in the program to be fetched from memory • Program counter tells where the computer is in the program. Usually the next instruction to execute is the next instruction in memory • Sometimes we want to JUMP to another instruction (e.g., if or while) • unconditional JUMP: always jump to address given • conditional JUMP: only jump if a certain condition is true (e.g., some register contains 0) CSCE 121-200
Machine Cycle • fetch next instruction, as indicated by the program counter (PC), and increment PC • decode the bit pattern in the instruction register – figure out which circuitry needs to be activated to perform the specified instruction • execute the specified instruction by copying data into registers and activating the ALU to do the right thing • a JUMP may cause the PC to be altered CSCE 121-200
Diagram of Architecture PC: IR: R1: R2: R3: R4: CPU: control unit ALU bus 0 1 2 3 4 5 data … main memory: second instr. third instr. first instr. 95 96 97 98 99 100 … … program CSCE 121-200
Evolution of Programming Languages • Machine languages: all in binary • machine dependent • painful for people • Assembly languages: allow symbols for operators and addresses • still machine dependent • slightly less painful • assembler translates assembly language programs into machine language • High-level languages: such as Fortran, C, Java, C++ • machine independent • easier for people • compiler translates high-level language programs into machine language CSCE 121-200
Compilation Challenges faced by a compiler: • one high-level instruction can correspond to several machine instructions • Ex: x = (y+3)/z; • fancier data structures, such as • arrays: must calculate addresses for references to array elements • structs: similar to arrays, but less regular • fancier control structures than JUMP • while, repeat, if-then-else, case/switch CSCE 121-200
Compilation Challenges faced by a compiler: • functions (a.k.a. procedures, subroutines, methods): must generate machine language code to: • copy input parameter values • save return address • start executing at beginning of function code • copy output parameter values at the end • set PC to return address CSCE 121-200
Compilation Process • Lexical analysis: break up strings of characters into logical components, called tokens, and discard comments, spaces, etc. • Ex: “total = sum + 55.32” contains 5 tokens • Parsing: decide how the tokens are related • Ex: “sum + 55.32” is an arithmetic expression, “total = sum + 55.32” is an assignment statement • Code generation: generate machine instructions for each high-level instruction. • The resulting machine language program, called object code, is written to disk CSCE 121-200
Linking • Linker: combines results of compiling different pieces of the program separately. If pieces refer to each other, these references cannot be resolved during independent compilation. • Combined code is written to disk. main … declares x … invokes p … function p … refers to x … CSCE 121-200
Loading • Loader: when it’s time to run the program, the loader copies the object code from disk into main memory • location in main memory is determined by the operating system, not the programmer • Loader initializes PC to starting location of program and adjusts JUMP addresses • Result is an executable CSCE 121-200