Lecture 2. Basic Java Syntax. Cheng-Chia Chen. Contents. Java Program Structure The lexical structure of Java PL. Java Variables and Data Types Array Java Operators and Expressions Java Statements. Java Program Structure. A program is made up of one or more classes
Lecture 2. Basic Java Syntax Cheng-Chia Chen
Contents • Java Program Structure • The lexical structure of Java PL. • Java Variables and Data Types • Array • Java Operators and Expressions • Java Statements
Java Program Structure
• A program is made up of one or more classes
• A class contains one or more methods and fields
• A method contains program statements
• A Java application always starts executing at the main method

class Lincoln {
    public static void main (String[] args) {
        System.out.println ("Whatever you are, be a good one.");
    } // method main
} // class Lincoln
Java Program Structure
// comments about the class
public class MyProgram {
}
• class header: class modifier (public), class name (MyProgram)
• class body: everything between { and }
• Comments can be added almost anywhere
Java Program Structure
// comments about the class
public class MyProgram {
    // comments about the method
    public static void main (String[] args) {
    }
}
• method header: method modifiers (public static), return type (void), method name (main)
• method body: everything between { and }
2. The lexical structure of a Java program • Multi-layer view of a programming language • Kinds of tokens of Java • Unicode • Lexical Translation process of Java • White Spaces • Comments • Identifiers • Keywords • Literals • Separators • Operators
What is a well-formed program ?
/* proc1: this is a well-formed C procedure */
void copy_chars() {
    char ch;
    while ((ch = getchar()) != EOF) putchar(ch);
}
This procedure has no error.

/* proc2: this is an ill-formed C procedure */
void copy_chars() {
    character c h; int ;
    while((ch = getchar()) ! = EOF) putchar(chs));
}
This procedure have many errors.

• A program is just a character string satisfying some constraints.
• Fine! But it is still hard to explain the difference between proc1 and proc2.
Traditional explanation
• Lexical rules:
  • “This” : pronoun[s]
  • “procedure” : noun[s]
  • “has” : verb[s]; “have” : verb[m]
  • “no”, “many” : adj, adj[m]
  • “error”, “errors” : noun[s], noun[m]
  • “ ” (space) is a separator
  • “.” is a special symbol indicating the end of a sentence
• Syntax rules:
  • pronoun[s] noun[s] -> NP[s]; adj[x] noun[x] -> NP[x]
  • verb[s] NP -> VP[s]
  • NP[s] VP[s] . -> S
Two-Layer view of a [programming] language
• char stream: ‘T’ ‘h’ ‘i’ ‘s’ ‘ ’ ‘p’ … (from “This procedure has no error.”)
• Lexical analysis (lexical rules): char stream -> word/token stream ‘This’ ‘procedure’ …
• Syntax analysis (syntax rules, i.e. grammars): applied to the word/token stream
• Note: Not all characters (e.g. spaces) contribute to tokens
Basic issues about the lexical structure of a PL
1. What is a character ?
  • C => ASCII (0x00~0x7f)
  • java => Unicode (0x0000~0xffff)
2. What are the token delimiters, whose only purpose is to delimit tokens and increase readability ?
  • java => WhiteSpace, comment
3. How many kinds of tokens are there in a PL ?
4. What character strings belong to which tokens ?
5. For java, input element = token + whiteSpace + comment; every input char belongs to one input element.
[Figure: the set of all char strings, partitioned into token classes such as verbs, nouns, advs]
Kinds of tokens in Java
• Token:
  • Identifier
    • used for the name of a package, type, method, (local) variable, parameter, etc.
  • Keyword (reserved word)
    • if, else, while, do, switch, case,
    • class, public, static, final, abstract, …
    • int, boolean, double, long, …
  • Literal
    • used for constant values.
    • 'c', 123, 123L, true, 0.2e23, "s String", …
  • Separator
    • { } ( ) [ ] , ; .
  • Operator
    • +, -, *, /, %, <=, >, …
Unicode
• Java programs are written using the Unicode character set.
  • details about Unicode can be found at http://www.unicode.org
• Each Java char is a 16-bit (two-byte) value, so at most 65536 distinct values can be represented. (Unicode has since grown beyond this range; characters above 0xFFFF are represented in Java by a pair of chars.)
• The first 128 chars are identical to ASCII, and the first 256 chars are identical to Extended ASCII (ISO-8859-1).
• Unicode text can have different encodings (representations): UTF8, UTF16, etc.
• An ASCII character has the same byte representation in ASCII and in UTF8.
• A source program can still be written in the OS's native non-ASCII character system, such as Big5, with the help of preconversion by the java compiler.
• Unicode characters not directly representable in the native character system => use a Unicode escape:
  • eg: ‘⊗’ => \u2297 , ‘A’ => \u0041, …
Unicode, ASCII and ISO-8859-1
  charset      code range       representation
  ASCII        0x00~0x7F        1 byte
  ISO-8859-1   0x00~0xFF        1 byte
  UNICODE      0x0000~0xFFFF    two bytes in UTF16 or UCS2; 1~3 bytes in UTF8
• ASCII and the first 128 chars of UNICODE have the same mapping and the same UTF8 representations.
• ISO-8859-1 and the first code page of UNICODE have the same mapping but different UTF8 representations.
• ISO-8859-1 to UNICODE transformation in UTF16:
  • Simply add one extra 0x00 byte before each ISO-8859-1 character representation (big-endian byte order).
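The byte counts on this slide can be checked with the standard charset API. The following is a minimal sketch; the class name EncodingDemo and its helper methods are made up for illustration, while StandardCharsets and String.getBytes are standard library calls.

```java
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Number of bytes used to encode s in UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes of s in ISO-8859-1 (one byte per character in 0x00..0xFF).
    static byte[] latin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Bytes of s in big-endian UTF-16, without a byte-order mark.
    static byte[] utf16be(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));       // ASCII letter: 1 byte
        System.out.println(utf8Length("\u00e9"));  // Latin-1 letter: 2 bytes in UTF-8
        System.out.println(utf8Length("\u2297"));  // circled times sign: 3 bytes
        byte[] b = utf16be("\u00e9");
        System.out.println(b[0] + " " + (b[1] & 0xff)); // high byte 0, low byte 233
    }
}
```

The last line shows the "extra 0x00 byte" rule: the UTF-16 form of a Latin-1 character is its single ISO-8859-1 byte preceded by a zero byte.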
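Preconversion of non-Unicode source programs
• Non-Unicode java source program (e.g., Big5) -> preconversion -> Unicode (ASCII) char stream -> compile
• javac: preconversion + compile
• The preconversion step can also be done offline using the native2ascii command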
Where Unicode characters are used
• Where may non-ASCII Unicode characters be used ?
  • comments,
  • content of character literals,
  • content of string literals,
  • identifiers.
• All other input elements are formed only from ASCII characters:
  • WhiteSpace,
  • Keywords,
  • Non-character literals,
  • operators.
Lexical translation process used in Java PL
1. Eliminate Unicode escapes: Unicode stream containing Unicode escapes -> Unicode stream
2. Decompose into lines: Unicode stream -> non-line-terminator chars + line terminators (LF, CR)
3. Decompose into input elements: comments, white spaces, tokens
4. Pass on the tokens: input elements -> token stream
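Because Unicode escapes are eliminated before the input is split into tokens, an escape may appear anywhere in a program, even inside an identifier. A small sketch (the class and method names are illustrative only):

```java
class UnicodeEscapeDemo {
    static int answer() {
        // The escape on the next line is translated to the letter 'a' before
        // tokens are formed, so it declares an ordinary variable named abc.
        int \u0061bc = 41;
        return abc + 1;
    }

    static boolean sameLetter() {
        // Both spellings denote the same char value.
        return 'A' == '\u0041';
    }
}
```

Here answer() returns 42 and sameLetter() returns true. Writing code this way is legal but hard to read; the point is only to show the order of the translation steps.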
White Space
• WhiteSpace:
  • the ASCII SP character, also known as "space"
  • the ASCII HT character, also known as "horizontal tab"
  • the ASCII FF character, also known as "form feed"
  • LineTerminator: CR, LF or CR LF
Comments • There are two kinds of comments: • /* text • multi-line comment • … */ • // text end-of-line comment Notes: 1. Comments do not nest. 2. /* and */ have no special meaning in comments that begin with //. 3. // has no special meaning in comments that begin with /* or /**. Ex: /* this comment /* // /** ends here: */ is a single complete comment.
Identifiers • Used for • class name, method name, variables etc. • no predefined meaning except as specified by the programmer • made up of letters, digits, the underscore character (_), and the dollar sign • They cannot begin with a digit • Java is case sensitive, • Total and total are different identifiers
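The identifier rules above can be sketched with two methods of java.lang.Character. This is an illustration, not the compiler's actual check: the class and method names are hypothetical, and reserved words are deliberately not excluded.

```java
class IdentifierCheck {
    // True if s has the shape of a Java identifier: a legal start character
    // followed by legal part characters. Reserved words are NOT rejected here.
    static boolean hasIdentifierShape(String s) {
        if (s.isEmpty() || !Character.isJavaIdentifierStart(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!Character.isJavaIdentifierPart(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

With this sketch, "total", "Total" and "_x$1" are accepted, while "2fast" (starts with a digit) and "a-b" (contains an operator character) are rejected.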
Reserved Words
• The Java reserved words:
  abstract boolean break byte byvalue case cast catch char class const continue
  default do double else extends false final finally float for future generic
  goto if implements import inner instanceof int interface long native new null
  operator outer package private protected public rest return short static super
  switch synchronized this throw throws transient true try var void volatile while
• Reserved words cannot be used as identifiers
• Note: byvalue, cast, future, generic, inner, operator, outer, rest and var come from early Java reserved-word lists and are not keywords in later versions of the language; true, false and null are literals rather than keywords.
Literals
• A literal is the source code representation of a value of a primitive type, the String type, or the null type.
• Literal:
  • IntegerLiteral: 123
  • FloatingPointLiteral: 123.4, 1.2e12, 12f
  • BooleanLiteral: true, false
  • CharacterLiteral: 'a', '\u0123', '\377', …
  • StringLiteral: "a book", "a tab \t and newline \n"
  • NullLiteral: null
Integer Literals
• An integer literal may be expressed in decimal (base 10), hexadecimal (base 16), or octal (base 8):
• IntegerLiteral:
  • DecimalIntegerLiteral
  • HexIntegerLiteral
  • OctalIntegerLiteral
• DecimalIntegerLiteral:
  • 0
  • [1-9] [0-9]* [lL]?   Ex: 123, 123l, 1243L
• HexIntegerLiteral:
  • 0 [xX] [0-9a-fA-F]+ [lL]?   Ex: 0X1aB4, 0x23L, 0xffff
• OctalIntegerLiteral:
  • 0 [0-7]+ [lL]?   Ex: 000, 0377, 01177L
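The three notations denote ordinary int or long values; only the spelling differs. A short sketch using the literals from the slide (the class and constant names are illustrative):

```java
class IntegerLiteralDemo {
    static final int DEC = 123;          // decimal
    static final int HEX = 0X1aB4;       // hexadecimal: 1*4096 + 10*256 + 11*16 + 4
    static final int OCT = 0377;         // octal: 3*64 + 7*8 + 7
    static final long OCT_LONG = 01177L; // octal with long suffix: 512 + 64 + 56 + 7

    public static void main(String[] args) {
        System.out.println(HEX);      // 6836
        System.out.println(OCT);      // 255
        System.out.println(OCT_LONG); // 639
    }
}
```

A leading 0 is easy to overlook: 0377 is 255, not 377.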
Floating-Point Literals
• FloatingPointLiteral:
  • Digits . Digits? Exp? FloatType?
  • . Digits Exp? FloatType?
  • Digits Exp FloatType?
  • Digits Exp? FloatType
• Exp: [eE] [+-]? Digits
• FloatType: [fFdD]
• Digits: [0-9]+
• Ex:
  • 12.34e12d, 1.23E-2 <= cf: 12.34 e12d (a space inside the literal is not allowed)
  • .11, .12E12f
  • 12e-2, 1e12D <= cf: 0x1e12L (a hexadecimal long integer literal, not a floating-point literal)
  • 33e2f, 33D
Boolean Literals • The boolean type has two values, represented by the literals true and false, formed from ASCII letters. • A boolean literal is always of type boolean. • BooleanLiteral: • true • false
Character Literals • expressed as a character or an escape sequence, enclosed in ASCII single quotes. • A character literal is always of type char. • CharacterLiteral: • ' SingleCharacter ' • ' EscapeSequence ' • SingleCharacter: • Any Character but not any of ' \ CR LF • ‘c‘ => compile-time error • It is a compile-time error for a line terminator to appear after the opening ' and before the closing '.
Example:
• The following are examples of char literals:
  • 'a' '%' '\t' '\\' '\''
  • '\u03a9' '\uFFFF' '\177' 'Ω' '⊗'
• Which of the following are correct literals:
  • '\u000a',
  • '\n',
  • '\u000D',
  • '\r'
String Literals • consists of zero or more characters enclosed in double quotes. • Each character may be represented by an escape sequence. • always of type String • StringLiteral: • " StringCharactersopt " • StringCharacters: • StringCharacter • StringCharacters StringCharacter • StringCharacter: • Any Character but not any of " \ CR LF • EscapeSequence
Example:
• "a b c \u000d sd \u000a" // error!! the Unicode escapes become real line terminators
• "This is a two-line
   string" // error! a string literal cannot span multiple lines
• "a b c \r sd \n" // OK
• "" // the empty string
• "\"" // a string containing " alone
• "This is a string" // a string containing 16 chars
• "This is a " + // actually a string-valued
  "two-line strings" // constant expression, formed from two string literals
Escape Sequences for Character and String Literals • Escape sequences allow for the representation of some nongraphic characters, the single quote, double quote, and backslash characters in character and string literals. • EscapeSequence: • \ b /* \u0008: backspace BS */ • \ t /* \u0009: horizontal tab HT */ • \ n /* \u000a: linefeed LF */ • \ f /* \u000c: form feed FF */ • \ r /* \u000d: carriage return CR */ • \ " /* \u0022: double quote " */ • \ ' /* \u0027: single quote ' */ • \ \ /* \u005c: backslash \ */ • OctalEscape/* \u0000 to \u00ff: from octal value */ • OctalEscape:// can represent only the first 256 chars • \ OctalDigit// [0-7] • \ OctalDigit OctalDigit// [0-7][0-7] • \ ZeroToThree OctalDigit OctalDigit// [0-3][0-7][0-7]
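The character codes in the comments above can be confirmed by widening each char to int. The sketch below uses only escape sequences, because a line feed or carriage return written as a Unicode escape inside a literal is replaced by a real line terminator before the literal is scanned and therefore does not compile. Class and member names are illustrative.

```java
class EscapeDemo {
    // Widening a char to int yields its Unicode code.
    static int code(char c) {
        return c;
    }

    // A line break inside a string must be written with an escape sequence.
    static final String TWO_LINES = "first\nsecond";

    public static void main(String[] args) {
        System.out.println(code('\t'));         // 9
        System.out.println(code('\n'));         // 10
        System.out.println(code('\177'));       // 127, octal escape
        System.out.println(code('\377'));       // 255, largest octal escape
        System.out.println(TWO_LINES.length()); // 12: five + one + six chars
    }
}
```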
The Null Literal and separators • The null type has one value, the null reference, represented by the literal null, which is formed from ASCII characters. • A null literal is always of the null type. • NullLiteral: • null Separators • The following nine ASCII characters are the separators (punctuators): • Separator: one of • ( ) { } [ ] ; , .
Operators • The following 37 tokens are the operators, formed from ASCII characters: • Operator: one of • > < == <= >= !=// relational • + - * / %// arithmetic • ! && ||// conditional logical operations • ?:// ternary conditional operation • ++ --// PRE/POST INCR/DECR • & | ^ ~ << >> >>>// Bit and boolean operation • = += -= *= /= &= |=// Assignment • ^= %= <<= >>= >>>=
3. Java Variables and Data Types
Data Types supported by Java:
• primitive types
  • numeric types
    • integer type: byte, char, short, int, long
    • floating-point type: float, double
  • boolean
• reference types
  • class
  • interface
  • array
Notes:
• values of primitive types are not objects; only instances of reference types are objects.
• a value of a reference type is a pointer (reference) to an object (or structure), not the object itself.
Integer types
  name   representation          range
  byte   8-bit 2's complement    -128 ~ 127
  short  16-bit 2's complement   -32768 ~ 32767
  int    32-bit 2's complement   -2147483648 ~ 2147483647
  long   64-bit 2's complement   -2^63 ~ 2^63-1 (19 digits)
  char   16-bit Unicode          '\u0000' ~ '\uffff'
floating-point and boolean types
  name    representation    range
  float   32-bit, IEEE 754  1.40239846e-45 to 3.40282347e+38
  double  64-bit, IEEE 754  4.94065645841246544e-324 to 1.79769313486231570e+308
Representation:
• non-zero value: v = S x M x 2^E
• For float: S is +/- 1, M is a positive integer less than 2^24, and E is an integer in the inclusive range -149 to 104.
• For double: S is +/- 1, M is a positive integer less than 2^53, and E is an integer in the inclusive range -1074 to 971.
• special values defined in IEEE 754:
  • +0, -0, +Infinity, -Infinity, NaN.
• note: 0/0.0 => NaN,
  1/0.0 => Infinity instead of overflow or divide by zero.
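The S x M x 2^E form can be tied to the range endpoints with Math.scalb, which multiplies by a power of two exactly. The sketch below is one way to check the smallest and largest finite values and the special values; the class and method names are illustrative.

```java
class FloatingPointDemo {
    // Smallest positive float: M = 1, E = -149.
    static float minFloat() {
        return Math.scalb(1.0f, -149);
    }

    // Largest finite float: M = 2^24 - 1, E = 104.
    static float maxFloat() {
        return Math.scalb((float) ((1 << 24) - 1), 104);
    }

    // Smallest positive double: M = 1, E = -1074.
    static double minDouble() {
        return Math.scalb(1.0, -1074);
    }

    // Largest finite double: M = 2^53 - 1, E = 971.
    static double maxDouble() {
        return Math.scalb((double) ((1L << 53) - 1), 971);
    }

    // Division by a floating-point zero yields a special value, not an exception.
    static double zeroOverZero() { return 0 / 0.0; }
    static double oneOverZero()  { return 1 / 0.0; }
}
```

The four range methods return Float.MIN_VALUE, Float.MAX_VALUE, Double.MIN_VALUE and Double.MAX_VALUE respectively.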
Variables and variable Declaration
• primitiveType variableName [ = initialValue ];
• Ex:
  • byte aByte = 127;
  • short aShort = 32767;
  • int anInt = 2147483647;
  • long aLong = 9223372036854775807L;
  • float aFloat = 3.40282347E+38F;
  • double aDouble = 1.79769313486231570E+308;
  • char aChar = 'z';
  • boolean aBoolean = true;
• Questions: What happens if
  • short aShort = 32768; // compile-time error: the int constant 32768 does not fit in a short
  • short b = (short) 32768; // ok! but b == -32768
  • int anInt = 2147483648; // compile-time error: integer literal too large for int
  • char aChar = 56; // ok: 56 is an int constant that fits in a char ('8'); an int variable would need a cast
  • char aChar = (char) 56; // ok!
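How the compiler treats constant initializers can be tried out directly. In this sketch (names are illustrative) every line compiles: an int constant that fits in the target type is accepted without a cast, while a value that does not fit, or that comes from a variable, needs an explicit cast.

```java
class NarrowingDemo {
    // 32768 does not fit in a short, so a cast is required; the value wraps around.
    static short wrapped() {
        return (short) 32768;
    }

    // 56 is a constant that fits in a char, so no cast is needed.
    static char fromConstant() {
        char c = 56;
        return c;
    }

    // An int variable always needs a cast, whatever its run-time value.
    static char fromVariable(int code) {
        return (char) code;
    }
}
```

wrapped() returns -32768, and both char methods yield the digit character '8' for code 56.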
Type conversions
• Java allows conversions between values of the various numeric types.
• Values of all primitive types except boolean can be converted to one another.
• Basic types of conversions:
  • Widening conversions:
    • int -> double, float -> double, char -> int, …
    • always safe, except that int -> float, long -> float and long -> double may lose precision.
    • performed automatically by Java
  • Narrowing conversions: double -> int, double -> float, …
    • must use the cast operator ()
    • not always safe.
• Ex:
  • int i = 13; byte b = i; // compiler error
  • short s = 134; // ok: although 134 has type int, it is a constant that fits in a short
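The precision loss in a widening conversion shows up when an integer needs more significant bits than the floating-point type can hold (24 for float, 53 for double). A minimal sketch with made-up method names:

```java
class WideningDemo {
    // Widen an int to float (no cast needed), then narrow it back.
    static int roundTripThroughFloat(int i) {
        float f = i;
        return (int) f;
    }

    // Widen a long to double (no cast needed), then narrow it back.
    static long roundTripThroughDouble(long l) {
        double d = l;
        return (long) d;
    }
}
```

roundTripThroughFloat(16777217) returns 16777216, because 2^24 + 1 has no exact float representation; likewise roundTripThroughDouble(9007199254740993L) returns 9007199254740992L.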
Use cast for narrowing conversion
• Ex:
  • int i = 13;
  • byte b = (byte) i; // force i to be converted to a byte
  • i = (int) 13.456; // force double literal 13.456 to int 13
  • i = (int) -12.6; // i == -12 (truncation toward zero)
• Math.round(), Math.floor(), Math.ceil() perform other kinds of rounding.
• short vs. char:
  • short s = (short) 0xffff; // s == -1; 2's complement
  • char c = '\uffff'; // char is like an unsigned short
  • int i1 = s; // i1 == -1
  • int i2 = c; // i2 == 65535
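The difference between a cast and the Math rounding methods, and between short and char, can be sketched as follows (class and method names are illustrative):

```java
class CastDemo {
    // A cast from double to int truncates toward zero.
    static int truncate(double d) {
        return (int) d;
    }

    // A short is signed: the bit pattern 0xffff widens to -1.
    static int shortToInt() {
        short s = (short) 0xffff;
        return s;
    }

    // A char is unsigned: the same bit pattern widens to 65535.
    static int charToInt() {
        char c = '\uffff';
        return c;
    }
}
```

For -12.6, truncate gives -12, whereas Math.round gives -13, Math.floor gives -13.0 and Math.ceil gives -12.0.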
Operators and Expressions
• Arithmetic operators
  • +, -, *, /, %, - (unary)
• Increment/Decrement operators
  • ++, --
• String Concatenation Operators
  • +
• Comparison operators
  • ==, !=, <, <=, >, >=
• Boolean Operators
  • &&, ||, !, &, |, ^
• Bitwise and shift operators
  • ~, &, |, ^
  • <<, >>, >>>
• Assignment operators
  • =, +=, -=, *=, /=, %=,
  • &=, |=, ^=, <<=, >>=, >>>=
Operators and Expressions • The conditional operator • ?: • The instanceof operator • Special operators: • Object member access(.) • Array element access([]) • Method invocation(()) • Object creation(new) • Type conversion or casting(()).
Arithmetic operators and expressions
  operator  type                     meaning
  -         unary (prefix)           unary negation
  + -       binary                   addition, subtraction
  * / %     binary                   multiplication, division, modulus (remainder after integer division)
  ++ --     unary (prefix, postfix)  increment, decrement (e.g., a++ is equivalent to a = a + 1)
ex:
• "total" + 3 + 4 // = "total34"
• 7/3, 7/3.0f, 7/0 // = 2, 2.3333333f, ArithmeticException
• 7/0.0, 0.0/0.0 // = Infinity, NaN.
• 7 % 3, -7 % 3, 4.3 % 2.1 // = 1, -1, approximately 0.1. x%y = sign(x) * (|x| % |y|).
Example:
class ArithmeticOperators {
    public static void main(String args[]) {
        // Demonstration of arithmetic operators
        int anInt = 10;
        System.out.println( anInt++ );  // prints 10, then anInt becomes 11
        System.out.println( anInt-- );  // prints 11, then anInt becomes 10
        System.out.println( -anInt );   // prints -10
        // We can declare variables at any point!
        int anotherInt = 3;
        System.out.println( anInt / anotherInt );  // prints 3 (integer division)
        System.out.println( anInt % anotherInt );  // prints 1
    }
}
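The less obvious cases from the arithmetic slide (string concatenation order, the sign of the remainder, and integer division by zero) can be sketched like this; the class and method names are illustrative:

```java
class ArithmeticEdgeCases {
    // + associates to the left: once a String is involved, the rest is concatenated.
    static String stringFirst() { return "total" + 3 + 4; }
    static String numbersFirst() { return 3 + 4 + "total"; }

    // The remainder takes the sign of the left operand.
    static int remainder(int x, int y) { return x % y; }

    // Integer division by zero throws; floating-point division does not.
    static boolean intDivisionByZeroThrows() {
        int zero = 0;
        try {
            int q = 7 / zero;
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }
}
```

stringFirst() gives "total34" while numbersFirst() gives "7total"; remainder(-7, 3) is -1 and remainder(7, -3) is 1.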
comparison (or relational) operators
  operator  type    meaning
  >         binary  greater than
  >=        binary  greater than or equal to
  <         binary  less than
  <=        binary  less than or equal to
  ==        binary  equality (i.e., "is equal to")
  !=        binary  inequality (i.e., "is not equal to")
x == y returns true iff
1. same primitive type and same value, or
2. same reference type and refer to the same object or array, or
3. different primitive types but equal after conversion to the wider type.
Note:
1. +0f == -0f; NaN != NaN; NaN != any number
2. <, <=, >, >= apply to numeric types only.
  • b = true > false; // error!
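The equality rules and the floating-point notes can be sketched in a few one-line methods (all names are illustrative):

```java
class EqualityDemo {
    // NaN is not equal to anything, including itself.
    static boolean nanEqualsItself() {
        double nan = Double.NaN;
        return nan == nan;
    }

    // Positive and negative zero compare as equal.
    static boolean zerosEqual() {
        return 0.0f == -0.0f;
    }

    // Different primitive types are compared after conversion to the wider type.
    static boolean mixedTypes() {
        int i = 1;
        double d = 1.0;
        return i == d;
    }

    // For references, == compares identity, not content.
    static boolean distinctObjectsEqual() {
        String a = new String("x");
        String b = new String("x");
        return a == b;
    }
}
```

nanEqualsItself() and distinctObjectsEqual() return false; zerosEqual() and mixedTypes() return true. To compare string contents, use a.equals(b).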
Boolean Operators
  operator  type    meaning
  &&        binary  conditional AND
  ||        binary  conditional OR
  !         unary   logical NOT
  &         binary  logical AND
  |         binary  logical OR
  ^         binary  logical XOR
1. &&, || and ! can be applied to boolean values only.
  => !0, null || true, 1 || 0 // all errors
2. & and | always evaluate both operands; && and || are short-circuit versions of & and |.
  • a[1]=0; if (a[1] == 1 & a[1]++ == 1) { } // a[1]==1
  • a[1]=0; if (a[1] == 1 && a[1]++ == 1) { } // a[1]==0
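The two array examples above can be run as written. In this sketch (class and method names are illustrative) the only difference between the methods is the operator, and the returned value shows whether the right operand, with its side effect, was evaluated.

```java
class ShortCircuitDemo {
    // & evaluates both operands, so the increment always happens.
    static int withLogicalAnd() {
        int[] a = {0, 0};
        if (a[1] == 1 & a[1]++ == 1) { }
        return a[1];
    }

    // && skips the right operand when the left one is false.
    static int withConditionalAnd() {
        int[] a = {0, 0};
        if (a[1] == 1 && a[1]++ == 1) { }
        return a[1];
    }
}
```

withLogicalAnd() returns 1 and withConditionalAnd() returns 0.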
Bitwise and shift operators
• Bitwise operators: ~, &, |, ^
  • byte b = ~12; // ~00001100 == 11110011, -13
  • 10 & 7 // 00001010 & 00000111 = 00000010 or 2.
  • 10 | 7 // 00001010 | 00000111 = 00001111 or 15.
  • 10 ^ 7 // 00001010 ^ 00000111 = 00001101 or 13.
• Shift operators: <<, >> (signed shift right), >>> (unsigned shift right)
  • 10 << 1 // 00001010 << 1 = 00010100 = 20 = 10*2
  • 7 << 3 // 00000111 << 3 = 00111000 = 7 * 8 = 56
  • -1 << 2 // 0xffffffff << 2 = 0xfffffffc = -4 = -1 x 4.
  • 10 >> 1 // = 10/2 = 5
  • 27 >> 3 // = 27/8 = 3.
  • -50 >> 2 // = -13 = -50/4 - 1 (rounds toward negative infinity).
  • -16 >> 2 // = -4 = -16/4.
  • -50 >>> 2 // operands are ints: 0xffffffce >>> 2 = 0x3ffffff3 = 1073741811. The 8-bit picture 11001110 (206) >>> 2 = 00110011 = 51 applies only to (-50 & 0xff) >>> 2.
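Because shift operands are promoted to int, the two right shifts differ only in what fills the vacated high bits of a 32-bit value. A small sketch with illustrative names:

```java
class ShiftDemo {
    // Signed shift: copies of the sign bit are shifted in from the left.
    static int signedShift(int x, int n) {
        return x >> n;
    }

    // Unsigned shift: zeros are shifted in from the left.
    static int unsignedShift(int x, int n) {
        return x >>> n;
    }

    // Mask to the low 8 bits first to get the result of the 8-bit picture.
    static int unsignedShiftLowByte(int x, int n) {
        return (x & 0xff) >>> n;
    }
}
```

For x = -50 and n = 2 these return -13, 1073741811 and 51 respectively.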
Assignment operators • operator type meaning • = binary basic assignment • += binary a += 2 is a shortcut for a = a + 2 • -= binary a -= 2 is a shortcut for a = a - 2 • *= binary a *= 2 is a shortcut for a = a * 2 • /= binary a /= 2 is a shortcut for a = a / 2 • %= binary a %= 2 is a shortcut for a = a % 2 • &= binary a &= 2 is a shortcut for a = a & 2 • |= binary a |= 2 is a shortcut for a = a | 2 • ^= binary a ^= 2 is a shortcut for a = a ^ 2 • <<= binary a <<= 2 is a shortcut for a = a << 2 • >>= binary a >>= 2 is a shortcut for a = a >> 2 • >>>= binary a >>>= 2 is a shortcut for a = a >>> 2
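One detail the shortcut table glosses over: a compound assignment also casts the result back to the type of the left operand. With a byte variable b, b += 5 compiles, whereas b = b + 5 does not, because b + 5 has type int. A minimal sketch (names are illustrative):

```java
class CompoundAssignDemo {
    // b += 5 behaves like b = (byte) (b + 5), so it compiles for a byte
    // and wraps around on overflow.
    static byte addFive(byte b) {
        b += 5;
        return b;
    }
}
```

addFive((byte) 10) returns 15, and addFive((byte) 125) wraps around to -126.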
The Conditional Operator
• syntax:
  • BooleanExpr ? expr1 : expr2
• Ex:
  • int max = (x > y) ? x : y;
  • String shown = (name != null) ? name : "unknown";
The instanceof operator
• Checks whether an object (reference) is an instance of the specified type.
• syntax:
  • o instanceof type
• Examples:
  • "string" instanceof String // true
  • "" instanceof Object // true
  • new int[] {1} instanceof int[] // true
  • new int[] {1} instanceof byte[] // compile-time error: an int[] can never be a byte[]
  • Object o = new int[] {1}; o instanceof byte[] // false
  • new int[] {1} instanceof Object // true
  • null instanceof Object // false
• // use instanceof to check whether it is safe to cast.
  • if (object instanceof Point) { Point p = (Point) object; }
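The check-then-cast idiom can be sketched with types from the standard library in place of Point, so that the example is self-contained. The class and method names are illustrative.

```java
class InstanceofDemo {
    // Test the run-time type first, then cast; the cast cannot fail.
    static String describe(Object o) {
        if (o instanceof String) {
            String s = (String) o;
            return "String of length " + s.length();
        }
        if (o instanceof int[]) {
            int[] a = (int[]) o;
            return "int[] of length " + a.length;
        }
        // null is not an instance of any type, so it also ends up here.
        return "something else";
    }
}
```

describe("abc") returns "String of length 3", describe(new int[] {1}) returns "int[] of length 1", and both describe(new byte[] {1}) and describe(null) return "something else".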
4. Array and array Declaration Syntax: • arrayType arrayName[] ( = new arrayType[size] ); • arrayType[] arrayName ( = new arrayType[size] ); • arrayType[] arrayName = {initValue1, initValue2, ... initValueN};
Array Example
class ArrayDeclaration {
    public static void main(String args[]) {
        // Demonstration of 3 techniques for array declaration
        int arrayA[] = new int[10];
        int[] arrayB = new int[10];
        int[] arrayC = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        // Like C/C++ arrays, Java arrays are indexed from 0
        arrayC[3] = 5;
        System.out.println(arrayC[3]);  // prints 5
        arrayB[4] = 0;
        arrayB[4]++;
        System.out.println(arrayB[4]);  // prints 1
        System.out.println(arrayB[5]);  // prints 0, the default value of an int element
    }
}