1.(5) Describe the working process with a database system. a) database definition

1.(5) Describe the working process with a database system. • a) database definition • By the database definition, we create an entity-relationship diagram for a realistic world problem, which is then transformed into a database schema. • b) populating tables with data • The second phase is the database construction, by which a database is initially populated with data. • c) database manipulation • Once a database is constructed, one can conduct some operation on it, such as data insertion, data deletion and data updating, which changes the database from a state to another state. It is the responsibility of a database system to guarantee that the database goes from one correct state to another. ACS-3902 Yangjun Chen

2.(10) Describe the three-schema architecture of a database system and the main functionality of each level in this architecture. • a) internal level • The first level is the internal level, which possesses an internal schema. It describes the physical storage structure of the database. The internal schema uses a physical data model and describes the complete details of data storage and access paths for the database. • b) conceptual level • The second level is the conceptual level, which has a conceptual schema. It describes the structure of the whole database for a community of users. The conceptual level hides the details of physical structures and concentrates on entity types, data types, relationships, user operations, and constraints. A high level data model or an implementation data model can be used at this level. ACS-3902 Yangjun Chen

c) external level The third level is the so-called external level or the view level, which includes a number of external schemas or user views. It describes the part of the database that a particular user group is interested in and hides the rest of the database from that user group. A high level data model or the implementation data model can be used at this level. ACS-3902 Yangjun Chen

3.(10) Explain the following concepts: • (a) superkey • (b) key • (c) candidate key • (d) primary key • (e) foreign key • key constraints • a superkey is any combination of attributes that uniquely • identify a tuple: t1[superkey]  t2[superkey]. • a key is superkey that has a minimal set of attributes ACS-3902 Yangjun Chen

If a relation schema has more than one key, each of them is called a candidate key. • one candidate key is chosen as the primary key (PK) • foreign key (FK) is defined as follows: • i) Consider two relation schemas R1 and R2; • ii) The attributes in FK in R1 have the same domain(s) as the primary key attributes PK in R2; the attributes FK are said to reference or refer to the relation R2; • iii) A value of FK in a tuple t1 of the current state r(R1) either occurs as a value of PK for some tuple t2 in the current state r(R2) or is null. In the former case, we have t1[FK] = t2[PK], and we say that the tuple t1 references or refers to the tuple t2. ACS-3902 Yangjun Chen

4.(15) Draw an ER-diagram to describe the following real world problem. • (a) A university is organized into faculties. • (b) Each faculty has a unique name, ID and number of professors and a specific professor is chosen as the faculty head. • (c) Each faculty provides a number of courses. • (d) Each course has a unique name and courseID. • (e) Each professor has a name, SIN, address, salary, sex and courses taught by him/her. • (f) Each professor belongs to a faculty and can teach several sections of a course. • (g) Each student has a name, ID, SIN, address, GPA, sex, and major. • (h) Each student can choose one faculty as his/her major faculty and take several courses with certain credit hours. Some of the courses are mandatory and some are optional. ACS-3902 Yangjun Chen

ER-model: name salary ID addr. SIN N belong professor 1 NoProf F-Id sex F-name M startdate 1 teach head faculty 1 N M 1 sections sectionId N provide M choose has 1 course N ID name addr. N SIN take M couresId student major mandatory-optional creditHours name sex birthdate ACS-3902 Yangjun Chen

5.(30) Linear Hashing - collision resolution strategy: chaining - split rule: load factor > 0.7 - initially M = 4 (M: size of the primary area) - hash functions: hi(key) = key mod 2i  M (i = 0, 1, 2, …) - bucket capacity = 2 Trace the insertion process of the following keys into a linear hashing file: 24, 7, 10, 5, 14, 4, 1, 8, 2, 3, 17, 13, 15. ACS-3902 Yangjun Chen

24 5 10 14 7 4 • The first phase – phase0 • when inserting the sixth record we would have • but the load factor 6/8= 0.75 > 0.70 and so bucket 0 must be split (using h1 = Key mod 2M): 24 4 5 10 14 7 n=0 before the split (n is the point to the bucket to be split.) 0 1 2 3 n=1 after the split load factor: 6/10=0.6 no split 0 1 2 3 4 ACS-3902 Yangjun Chen

24 5 1 10 14 7 4 24 5 10 14 7 4 insert(1) 0 1 2 3 4 n=1 load factor: 7/10=0.7 no split 0 1 2 3 4 ACS-3902 Yangjun Chen

insert(8) 24 8 5 1 10 14 7 4 24 5 1 10 14 7 4 0 1 2 3 4 n=1 load factor: 8/10=0.8 split using h1. ACS-3902 Yangjun Chen

insert(2) 24 8 24 8 1 1 10 14 10 14 7 7 4 4 5 0 1 2 3 4 5 n=2 load factor: 8/12=0.66 no split 5 0 1 2 3 4 5 ACS-3902 Yangjun Chen

24 8 1 10 2 7 4 5 14 0 1 2 3 4 5 24 8 1 10 14 7 4 5 n=2 load factor: 9/12=0.75 split using h1. overflow 2 ACS-3902 Yangjun Chen

insert(3) 24 8 1 10 2 7 3 4 5 14 24 8 1 10 2 7 4 5 14 n=3 load factor: 10/14=0.714 split using h1. ACS-3902 Yangjun Chen

insert(17) 24 8 1 10 2 3 4 5 14 7 n=4 The second phase – phase1 n = 0; using h1 = Key mod 2M to insert and h2 = Key mod 4M to split. 8 24 1 10 2 3 4 5 14 7 ACS-3902 Yangjun Chen

insert(13) 8 24 1 17 10 2 3 4 5 14 7 n=0 load factor: 11/16=0.687 no split. 8 24 1 17 10 2 3 4 5 14 7 ACS-3902 Yangjun Chen

8 24 1 17 10 2 3 4 5 13 14 7 n=0 load factor: 12/16=0.75 split bucket 0, using h2. 1 17 10 2 3 4 5 13 14 7 8 24 ACS-3902 Yangjun Chen

insert(15) 1 17 10 2 3 4 5 13 14 7 8 24 1 17 10 2 3 4 5 13 14 7 15 8 24 n=1 load factor: 13/18=0.722 split bucket 1, using h2. 1 17 10 2 3 4 5 13 14 7 15 8 24 ACS-3902 Yangjun Chen

b f a b • 6.(15) Given the following B+-tree, trace the deletion sequence: h, c, • e, f. Here, we assume that each internal node can contain at most two • keys and each leaf node can contain at most two value/point pairs. c c e f h Fig. 1 ACS-3902 Yangjun Chen

c a e b e c e f c a b a b e f remove h: remove c: ACS-3902 Yangjun Chen

e a f b b remove e: c a b f remove f: e a a a a ACS-3902 Yangjun Chen

7.(15) Given the relation schemas shown in Fig. 2, construct expressions • (using relational algebraic operations) to evaluate the following query: • Find the names of employees who works on all the projects • controlled by department ‘Business Computing’. EMPLOYEE fname, minit, lname, ssn, bdate, address, sex, salary, superssn, dno DEPARTMENT Dname, dnumber, mgrssn, mgrstartdate Fig. 2 PROJECT WORKS_ON Pname, pnumber, plocation, dnum Essn pno, hours ACS-3902 Yangjun Chen

: SSNS EMP_PNOS DEPT_P DN dnumber(Dname = ‘Business Computing’(Department)) DEPT_P PNUMBER(DNDN.dnumber = PROJECT.dnumber(PROJECT)) EMP_PNOS  ESSN,PNO(WORK_ON) RESULT  FNAME, LNAME(SSNS * EMPLOYEE) ACS-3902 Yangjun Chen

1.(5) Describe the working process with a database system. a) database definition