Programming in R COURSE NOTES 2 Hoganson Language Translation
Language Translation A computer works entirely in 0s and 1s, which is hard for humans. So we have created languages that are easier for us (humans) to work with. Human-friendly languages require computer time to translate into computer-executable programs. An ongoing trend since the computer was created: make better human interfaces to the machine, using the ever-increasing power of the computer to do the translation work “behind the scenes”. Think of GUIs and virtual-reality interfaces.
Machine Code Just a taste • Machine code – the bottom line in programming. • Machine code instructions are divided into fields, and the instruction has a specified format. • Simple example:
Machine Code • This instruction has four fields: • Instruction type (two bits) • Operation code (6 bits) • Register operand 1 (4 bits) • Register operand 2 (4 bits) • 16-bit (two-byte) instruction
Machine Code Two bits for instruction type. How many types of instructions are possible within this format? The operation code is 6 bits. How many operations are possible for each instruction type? The register operands are 4 bits each. How many different registers can be indicated with 4 bits? (Similar to addressing.)
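The three questions above follow one rule: a field of n bits can represent 2^n distinct values. A minimal Python sketch of that arithmetic follows; the names `FIELD_BITS` and `distinct_values` are made up for illustration and are not part of any real instruction set.

```python
# Field widths (in bits) of the 16-bit instruction format from the slides.
# The dictionary and function names are illustrative assumptions.
FIELD_BITS = {
    "instruction_type": 2,
    "operation_code": 6,
    "register_operand_1": 4,
    "register_operand_2": 4,
}


def distinct_values(bits):
    """Number of different values an n-bit field can represent (2**n)."""
    return 2 ** bits


for name, bits in FIELD_BITS.items():
    print(f"{name}: {bits} bits -> {distinct_values(bits)} values")

# The four fields together fill one 16-bit (two-byte) instruction.
TOTAL_BITS = sum(FIELD_BITS.values())
print("total bits:", TOTAL_BITS)
```

So this format allows 4 instruction types, 64 operation codes, and 16 addressable registers.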
Machine Code This instruction format is a Register-Register instruction. That means that it takes its inputs from two register operands. The operation is performed on those two data elements, and the result goes back into the register specified by the first register operand.
Machine Code Instruction Do you remember where the result of the addition is stored? • Machine code is not hard, just painful and slow to work with. • Register-Register instruction format is ‘00’ • Op Code to add two registers is ‘010000’ • First operand is register 2: specify ‘0010’ • Second operand is register 4: specify ‘0100’ • Complete instruction in 0s and 1s: • 00 010000 0010 0100
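The field-by-field assembly of this instruction can be sketched with shifts and bitwise OR. This is only an illustration of the 2 + 6 + 4 + 4 bit layout from the slides; the function name `encode_rr` is an assumption, not an existing tool.

```python
def encode_rr(instr_type, opcode, reg1, reg2):
    """Pack four fields into one 16-bit word: 2 + 6 + 4 + 4 bits."""
    if not (0 <= instr_type < 4 and 0 <= opcode < 64):
        raise ValueError("instruction type or op code out of range")
    if not (0 <= reg1 < 16 and 0 <= reg2 < 16):
        raise ValueError("register number out of range")
    # Shift each field to its position, then combine with bitwise OR.
    return (instr_type << 14) | (opcode << 8) | (reg1 << 4) | reg2


# Register-Register format '00', add op code '010000', registers 2 and 4.
word = encode_rr(0b00, 0b010000, 2, 4)
bits = format(word, "016b")
print(bits[:2], bits[2:8], bits[8:12], bits[12:])  # 00 010000 0010 0100
```

As for the question on the slide: the result goes back into the register named by the first operand, here register 2.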
Assembly Language • Working with 0s and 1s is hard – and humans are prone to making errors. • Languages have been created to make programming easier. • Assembly language is the lowest level language. • Uses mnemonics and abbreviations. • Our add two register instruction: • 00 010000 0010 0100 • Can be represented (1 to 1) with an assembly instruction: • ADR R2 R4 • ADdRegisters R2 and R4, result in R2
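Because the mapping is 1 to 1, an assembler for this instruction can be little more than a table lookup. The sketch below assumes a mnemonic table containing only the ADR instruction from the slides; `OPCODES` and `assemble` are hypothetical names.

```python
# Hypothetical mnemonic table: maps a mnemonic to (instruction type, op code).
# Only ADR appears in the slides; a real assembler would have many entries.
OPCODES = {"ADR": ("00", "010000")}


def assemble(line):
    """Translate one assembly line such as 'ADR R2 R4' into bit fields."""
    mnemonic, *operands = line.split()
    instr_type, opcode = OPCODES[mnemonic]
    # 'R2' -> 2 -> '0010': each register number becomes a 4-bit field.
    regs = [format(int(op.lstrip("R")), "04b") for op in operands]
    return " ".join([instr_type, opcode, *regs])


machine_code = assemble("ADR R2 R4")
print(machine_code)  # 00 010000 0010 0100
```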
High-Level Languages Assembly language is a big improvement over machine code. Assembly is translated by an assembler program to 0s and 1s that the computer can work with. More powerful (and human-readable) languages have been created (which must also be translated to 0s and 1s). These are called High-Level Languages: Basic, Fortran, C, C++, C#, R, etc.
High-Level Languages • Our add two register instruction: • 00 010000 0010 0100 • In assembly language: • ADR R2 R4 • ADdRegisters R2 and R4, result in R2 • In a high level language might look like: • Number1 = Number1 + Number2 • Better?
Many-to-1 translation • ADR R2 R4 • High level language might look like: • Sum = Number1 + Number2 • But this high-level language has another type of translation embedded: memory addressing • Number1, Number2, and Sum are data values stored in memory, not registers. • The values for Number1 and Number2 must first be loaded from memory into registers. • Then the add operation can be performed. • Then the result is stored back to memory in Sum. • Additional machine-level instructions are needed to carry out this one high-level language instruction.
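The load, add, and store steps listed above can be simulated in a few lines. This is a toy model, not a real machine: the `memory` contents (7 and 5), the 16-entry register file, and the helper names `load`, `add`, and `store` are all assumptions for illustration.

```python
# Toy model: named memory cells and 16 registers (4-bit register numbers).
memory = {"Number1": 7, "Number2": 5, "Sum": 0}
registers = [0] * 16


def load(reg, name):
    """Copy a value from memory into a register."""
    registers[reg] = memory[name]


def add(reg1, reg2):
    """Register-Register add: the result goes back into the first register."""
    registers[reg1] = registers[reg1] + registers[reg2]


def store(reg, name):
    """Copy a value from a register back to memory."""
    memory[name] = registers[reg]


# One high-level statement, Sum = Number1 + Number2, becomes four steps:
load(2, "Number1")
load(4, "Number2")
add(2, 4)
store(2, "Sum")
print(memory["Sum"])  # 12
```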
High-level Language Translation • High-level language instructions must be translated/converted to machine code before the computer can run them. • This process requires a translation program: • Compiler • Interpreter • (Assembler was used for assembly language) • Languages like C, C++, Cobol, Fortran and Pascal are all compiled languages.
Compiler The compiler takes the high-level language program (as text) as its input and produces the machine code version of the program as its output. It does not change the high-level program; the machine code program is a new file.
Interpreter • Some languages like BASIC and Visual Basic are interpreted languages, not compiled. • The interpreter does not convert the entire program all at once. • Instead, it converts instructions one at a time, and has the computer execute each instruction. • Slower, because the program must be interpreted again every time it is run.
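The one-instruction-at-a-time behavior can be sketched with a toy interpreter. The three-statement mini-language below (assignment, addition, `print`) is invented for this example and is not BASIC or R; `interpret` and `PROGRAM` are hypothetical names.

```python
def interpret(program):
    """Translate and execute a program one line at a time."""
    variables = {}
    output = []

    def value(token):
        # A token is either a known variable name or an integer literal.
        return variables[token] if token in variables else int(token)

    for line in program.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        if tokens[0] == "print" and len(tokens) == 2:
            output.append(value(tokens[1]))
        elif len(tokens) == 3 and tokens[1] == "=":
            variables[tokens[0]] = value(tokens[2])
        elif len(tokens) == 5 and tokens[1] == "=" and tokens[3] == "+":
            variables[tokens[0]] = value(tokens[2]) + value(tokens[4])
        else:
            raise SyntaxError(f"cannot interpret: {line!r}")
    return output


PROGRAM = """
Number1 = 7
Number2 = 5
Sum = Number1 + Number2
print Sum
"""
result = interpret(PROGRAM)
print(result)  # [12]
```

Each run repeats the parsing work for every line, which is the cost the last bullet refers to.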
Virtual Machine • A third and more recent way to translate high-level programs is with a Virtual Machine (or byte-code interpreter). Java is an example. • Separates translation into two steps. • Convert the program to “byte-code” • The “byte-code” is then interpreted by a virtual machine.
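Python's standard implementation happens to use the same two-step scheme, so it can serve as a stand-in for Java here. The sketch below uses the built-in `compile` and `exec` functions and the standard `dis` module; the variable names and values are assumptions for illustration.

```python
import dis

SOURCE = "total = number1 + number2"

# Step 1: convert the source text to byte-code (a Python code object).
code = compile(SOURCE, "<example>", "exec")

# Show the byte-code instructions; the listing differs between Python versions.
dis.dis(code)

# Step 2: the Python virtual machine interprets the byte-code.
namespace = {"number1": 7, "number2": 5}
exec(code, namespace)
print(namespace["total"])  # 12
```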
Virtual Machine The virtual machine/byte-code interpreter makes programs transportable and device-independent. Converted byte-code can move over the internet.
Virtual Machine Each different processor/machine needs its own virtual machine. The virtual machine differs from CPU to CPU because machine codes and operating systems differ.
“R” is • A structured programming language (its object-oriented features are optional; no agents). • With extensions for Big Data: functions and techniques for manipulating large data sets and exploiting parallelism. • An interpreted language. R is an implementation of the “S” language. The R interpreter itself is written mostly in C, and that C code is compiled using a compiler for the platform.
End of Lecture End Of Today’s Lecture.