Teaching Aid for Bernstein’s Algorithm

CS4221Database Design Teaching Aid for Bernstein’s Algorithm Group P10 Daphne Tay A0071939U Pan Lu A0070083N Tan Wee Kwan A0074053M Yeo YaiHui A0069830Y

Objective • Teaching aid with interactive user interface • Learn in a simple yet efficient manner.

What is Bernstein’s Algorithm • Invented by Philip A. Bernstein in 1976 • 6-steps synthesis algorithm • Obtain 3NF given a relation with FDs

Bernstein’s Algorithm • Eliminate extraneous attributes • Find covering • Partition • Merge equivalent keys • Eliminate transitive dependencies • Construct Relation

Design • Select relation and construct FDs • Step-by-step synthesis and explanation • Additional Features: • 2NF / 3NF / BCNF Checkers • Lossless Checkers • Candidate Key finder • Superfluous Attribute Finder

Implementation

Step 1 • Check the left side of FDs containing more than 1 attribute. • Remove 1 attribute and find closure of remaining attributes. • If right side of FD can be found in closure, it is extraneous. R(A,B,C,D) ABE->CD A->B B found in closure. It is redundant! AB ->CD A+ = ABCD

Step 2 • Split all right sides of FDs into separate FDs. • Remove a FD X->Y. • Find closure with X. • If Y can be found in closure, FD X->Y is redundant and removed.

Step 3 • Merge all FDs with same left sides.

Step 4 • Get closure of all partitions. • Compare closure for properly equivalence. • If present, create J and obtain H’. • Remaining partitions as it is.

Step 5 • Remove a FD X->Y. • Find closure with X. • If Y can be found in closure, FD X->Y is redundant and removed.

Step 6 • Construct relation based on each H. • Obtain keys based on left side of FD.

Shortcomings • Does not guarantee losslessness • Does not find all the keys • Does not remove all superfluous attributes • Set of 3NF relations produced depends on the type of minimal cover found in Step 2 • BCNF relation may contain superfluous attributes

Additional Features • Losslessness Checker • Superfluous Attribute Checker • BCNF Checker • 2NF / 3NF Checkers • Candidate Key Finder

2NF Checker • Naïve way to verify • Check that a FD is a subset of another FD in both left and right for violation. E.g. AB->CD and A->C

3NF Checker • Check if the relation already in 2NF. • If not, then also not in 3NF. • If it is, check for transitive dependencies for violations.

Candidate Key Finder • Classify the attributes to Left, Middle, Right. • Left are definitely part of the candidate keys. • Create combinations of all Left and Middle to derive all possible super keys through closure. • Filter obtain set of candidate keys.

BCNF Checker • Obtain closure of attribute(s) appear on the right side of FD • If closure contains part of left side of that FD, violation.

Lossless Checker • Gather attributes in relations resulted in step 6 • Check against initial set of attributes given before Step 1 • If equivalent, then it is lossless

Superfluous Attributes Finder • Naïve way to check for superfluous attributes • Check based on definitions above

Demo

Limitations • Limited Number of Attributes • 9 attributes at the tool’s maximum • To facilitate efficient learning • Could be improve by adding more attributes • Increase the complexity of possible FDs created by the users • Naming the attributes • Fixed naming: A- I • Users could not key in their own desired names • Real life examples set by users could make the tool more vivid for learning

Limitations • Design of the tool • Could be make more vibrant and attractive • Tool for age around 20 and above • Interesting user interface could be implemented: e.g. graphically • Minimum Covering • Usually Bernstein’s Step 2 will output different results if different coverings of the same set of relational schema and FDs were used • Our algorithm only output one covering at any random time • Tool could be enhanced by using random seeds to select which FDs to start with while looping

Limitations • Finding all candidate keys using Bernstein’s • NP-complete problem • Not all candidate keys can be found using Bernstein’s • Time to solve the problem increase quickly when size of the problem increases • Using any currently known algorithm • Could reach into billions or trillions of years as size grows • Good side of keeping our maximum attributes to be 9: • Able to find the candidate keys more efficiently at smaller sizes

Conclusion • Teaching Tool is coded in Java and runs with Jar File • Run on any platforms with Java Runtime Environment • Main purpose is to help users to understand Bernstein’s Algorithm • Addressing the limitations will make our tool becoming better

Thank you

Q & A

Teaching Aid for Bernstein’s Algorithm