Mini-Pascal • Compiling Mini-Pascal (MPC) language • Subset of the Pascal programming language • Somewhat similar to the Java and “C” programming languages • There are many differences, however • Differences make it much easier to compile ☺ • We will discuss details of the actual language when it becomes important
Scanning • 1st stage of compiling a program • Code written in a programming language • High-level languages are supposed to resemble English… • …but they don’t. • Contain many features specifically designed for the computer • How often do you use a semi-colon?
Scanning • Raw text is hard for a computer to understand • Much easier for it to work with objects • Scanning converts text into tokens • A token is an object encoding a single text idea • This is a very common problem, not just in compilers
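To make "a token is an object" concrete, here is a minimal Python sketch. The class name `Token`, its fields `kind` and `text`, and the helper `split_words` are all hypothetical names chosen for illustration; they are not part of Mini-Pascal or of any particular compiler. The helper only splits on whitespace, so it is a toy stand-in for a real scanner.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Token:
    """One unit of text, packaged as an object (hypothetical layout)."""
    kind: str  # category label, e.g. "Word" in this toy example
    text: str  # the characters the token stands for


def split_words(source: str) -> List[Token]:
    """Toy scanner: wrap each whitespace-separated chunk in a Token."""
    return [Token("Word", chunk) for chunk in source.split()]


tokens = split_words("x := 13")
for token in tokens:
    print(token.kind, repr(token.text))
```

Later stages can then work with a list of `Token` objects instead of raw characters.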
Scanning • Only considers the token being processed • This stage of compilation only generates tokens • Looks for obvious lexical errors --- text that cannot be legal • Does not track past tokens • Does not worry if text has any real meaning • Understanding meaning occurs later in the process
Lexical analysis for tokens in English • Legal: ? “”, Snap p. 35 crackle < pop ppo quack! • Illegal: ¡ I am excited ! • Illegal: gemütlichkeit façade
Types of Tokens in Mini-Pascal • Operator • All the meaningful symbols in Mini-Pascal: • Numerical: + - * ^ • Comparative: < > <= >= <> == • Separator: ( ) [ ] . ; , • Assignment: := • Spaces are meaningful • “:=” is one token • “: =” is two tokens
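The point about spaces can be sketched with a "longest match first" rule. The Python below is an illustrative sketch, not the course's scanner: the function name `scan_operators` is hypothetical, and the table contains only the symbols listed on the slide, so any other character raises an error here.

```python
# Symbols copied from the slide, sorted so two-character operators are tried first.
OPERATORS = sorted(
    ["+", "-", "*", "^", "<", ">", "<=", ">=", "<>", "==",
     "(", ")", "[", "]", ".", ";", ",", ":="],
    key=len,
    reverse=True,
)


def scan_operators(text):
    """Split text made only of operators and spaces into operator tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # a space ends the current token and is then skipped
            continue
        for op in OPERATORS:
            if text.startswith(op, pos):
                tokens.append(op)
                pos += len(op)
                break
        else:
            raise ValueError(f"no listed operator at position {pos}: {text[pos]!r}")
    return tokens


print(scan_operators("<>"))   # one token
print(scan_operators("< >"))  # two tokens
```

Trying the longer spellings first is what keeps `<>` from being read as `<` followed by `>`; the space in `< >` is what forces the split.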
Types of Tokens in Mini-Pascal • Int • Includes all numbers defined by Mini-Pascal • Mini-Pascal does not include real numbers • Int token includes an uninterrupted series of digits • “1354934573212” is one token • “13 45” is two tokens – for “13” and “45” • “13.65” is three tokens – for “13”, “.”, and “65” • “2,585” is three tokens – for “2”, “,”, and “585”
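The Int rule amounts to "keep reading while the next character is a digit". A minimal Python sketch follows; the function name `scan_int` and its return shape (token text plus the position where scanning stopped) are assumptions made for illustration.

```python
DIGITS = "0123456789"


def scan_int(text, pos):
    """Return (digits, next_pos) for the digit run starting at text[pos], or None."""
    end = pos
    while end < len(text) and text[end] in DIGITS:
        end += 1
    if end == pos:
        return None  # no digit here, so no Int token starts at pos
    return text[pos:end], end


print(scan_int("13.65", 0))  # stops at the "."
print(scan_int("13.65", 3))  # picks up again after the "."
```

Because the loop stops at the first non-digit, a space, a period, or a comma each ends the Int token, which is why “13.65” and “2,585” scan as three tokens.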
Types of Tokens in Mini-Pascal • String • The literal strings in Mini-Pascal • Java strings begin and end with a double quote (“”) • Pascal strings begin and end with a single quote (‘’) • Can include any characters, including letters and numbers, but cannot go across multiple lines • ‘Hi Mom. #1’ is one token with the text “Hi Mom. #1” • Note: The quotes are not included in the token
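The String rule can be sketched as "read from the opening single quote to the closing one, and fail if the line ends first". The Python below is a hedged illustration: `scan_string` is a hypothetical name, and how Mini-Pascal treats a quote character inside a string is not covered on the slide, so this sketch does not attempt it.

```python
def scan_string(text, pos):
    """Scan a single-quoted literal at text[pos]; return (contents, next_pos) or None."""
    if pos >= len(text) or text[pos] != "'":
        return None  # no string literal starts here
    end = pos + 1
    while end < len(text) and text[end] != "'":
        if text[end] == "\n":
            raise ValueError("string literal runs past the end of the line")
        end += 1
    if end == len(text):
        raise ValueError("unterminated string literal")
    # Slice between the quotes: the quotes themselves are not part of the token.
    return text[pos + 1:end], end + 1


print(scan_string("'Hi Mom. #1'", 0))
```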
Types of Tokens in Mini-Pascal • Identifier/Id • Includes keywords (reserved) in Mini-Pascal: and array begin case const div do downto else end for function if mod nil not of or procedure program record repeat then to type until var while • Also potential variable and method names • Begin with a letter, followed by any combination of letters and numbers • Do NOT worry (yet) if it is an actual name
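Identifier scanning follows the same "keep reading" pattern, with one extra table lookup to flag reserved words. In this Python sketch the name `scan_identifier` and the three-part return value are assumptions for illustration. The keyword comparison is exact; whether Mini-Pascal treats `BEGIN` and `begin` alike is not stated on the slide, so the sketch leaves that open.

```python
import string

# Reserved words copied from the slide.
KEYWORDS = frozenset(
    "and array begin case const div do downto else end for function "
    "if mod nil not of or procedure program record repeat then to type "
    "until var while".split()
)

LETTERS = string.ascii_letters
LETTERS_AND_DIGITS = string.ascii_letters + string.digits


def scan_identifier(text, pos):
    """Return (word, is_keyword, next_pos) for the identifier at text[pos], or None."""
    if pos >= len(text) or text[pos] not in LETTERS:
        return None  # identifiers must begin with a letter
    end = pos + 1
    while end < len(text) and text[end] in LETTERS_AND_DIGITS:
        end += 1
    word = text[pos:end]
    return word, word in KEYWORDS, end


print(scan_identifier("begin x1", 0))
print(scan_identifier("begin x1", 6))
```

The scanner only reports whether the word is reserved; checking that `x1` names a declared variable is left to a later stage.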
Other Work While Scanning • Comments • Pascal also includes comments • Begin with either “{” or “(*” • Then include any legal characters, including letters, numbers, spaces, and newlines • End with either “}” or “*)” • { This is a legal comment *) (* and so is this } • There is no comment token --- it is not used in compilation
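Since comments produce no token, a scanner can simply step over them. The Python sketch below follows the slide's rule that either ending closes either beginning; the function name `skip_comment` is hypothetical, and raising an error for a comment that never ends is an assumption rather than something the slide specifies.

```python
def skip_comment(text, pos):
    """If a comment starts at text[pos], return the index just past it; else return pos."""
    if text.startswith("{", pos):
        scan = pos + 1
    elif text.startswith("(*", pos):
        scan = pos + 2
    else:
        return pos  # not a comment, nothing to skip
    while scan < len(text):
        if text[scan] == "}":
            return scan + 1
        if text.startswith("*)", scan):
            return scan + 2
        scan += 1  # any other character, including a newline, is comment text
    raise ValueError("unterminated comment")


source = "{ This is a legal comment *) x"
print(repr(source[skip_comment(source, 0):]))
```

Scanning resumes at the returned index, so the comment text never reaches the rest of the compiler.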