540 likes | 685 Views
An Introduction to Database and Database Designing. R C Goyal Principal Scientist IASRI, New Delhi. Databases Before the Use of Computers. Data kept in books, ledgers, card files, folders, and file cabinets Long response time Labor-intensive Often incomplete or inaccurate.
E N D
An Introduction to Database and Database Designing R C Goyal Principal Scientist IASRI, New Delhi
Databases Before the Use of Computers • Data kept in books, ledgers, card files, folders, and file cabinets • Long response time • Labor-intensive • Often incomplete or inaccurate
Data & Information Dataknown facts that can be recorded and that has implicit meaning Informationprocessed data Meaningless data becomes information when it is processed and presented to the decision maker in a meaningful way
Database • A set of data that is required for a specific purpose or is fundamental to a system, project, enterprise, or business. A formally structured collection of data. • A database may consist of one or more data banks and be geographically distributed among several repositories. • In automated information systems, the database is manipulated using a database management system.
File Mgmt vs. Database Mgmt • File Management • Each data entity is in a separate file • system for creating, retrieving and manipulating files. • Database Management • Same file but data elements are integrated and shared among different files • Program controls the structure of a database and access to data
Database Management for Strategic Advantage • We live in the Information Age • Information used to make organizations more productive and competitive • Databases used to support business operations • Databases used across a range of applictions • Personal, department, enterprise, web
Technical Advantages of Database Management • Reduced data redundancy • Improved data integrity • More program independence • Increased user productivity • Increased security
Disadvantages of DBMS • Cost issues • Implementation & maintenance issues • Security issues • Privacy issues
Features of DBMS • Database Engine • heart of DBMS • stores, retrieves and updates data • enforces business rules • Data dictionary • holds definitions of all of the data tables • Describes the data type, allows DBMS to keep track of data, helps user find data they need
Features of DBMS…Cont. • Query processor • Enables users to store and retrieve data • Use English command that use such words as SELECT, DELETE, MODIFY • Report generator • Formats and prints reports after user uses query processor
Features of DBMS…Cont. • Forms generator • Application generator • Access security • Setup access privileges to protect data from unauthorized access and sabotage • System recovery
Database Development Cycle • Database planning • System Definition • Requirements collection and analysis • Database design • DBMSselection • Application design • Prototyping • Implementation • Data Conversion and loading • Testing • Operational Maintenance
Database Planning Systems Definition Requirements Collection and analysis Database Design Application Design DBMS Selection Implementation Data Conversion and loading Testing Evaluation & Maintenance Prototyping Database Life Cycle Source: http://www.cs/ucf.edu/courses/cgs2545/CH02/index.htm
Current systems evaluation Development of standards Technological feasibility Operational feasibility Economical feasibility Database Planning
Systems definition Requirements Collection and Analysis Metadata Data dictionary • identifying management information requirements, • determining information requirements by functional area, • and establishing hardware and software requirements
Database Design Conceptual design Logical design Physical design
Database design methodology • A structured approach that uses procedures, techniques, tools, and documentation aids to support and facilitate the process of design. • Conceptual database design • Logical database design • Physical database design
Critical Success Factors in Database Design • Work interactively with the users as much as possible. • Follow a structured methodology throughout the data modelling process. • Incorporate structural and integrity considerations into the data models. • Combine conceptualisation, normalisation, and transaction validation techniques into the data modelling methodology.
Design Considerations - Tips • Planning • Tables • Queries • Forms • Reports • Macros • Modules • General
Tips - Planning • Before you start up the computer to build a database, PLAN ON PAPER!!! • Gather all of the paper forms you use to collect your data • Interview the people who will be using the database, from those who conduct the field work to those who analyze the data • Think about what you want to get back out of your database, not just about how to get data in • Steal - see if it’s been done before
Tips - Tables • Normalize, normalize, normalize • Use field properties to help maintain data integrity • When possible limit the possible entries in a field to a lookup list (domain) • Choose your key fields carefully • Index fields that you will search often • Many-to-many relationships in Access • Steal - link to existing tables
Normalization • The process of breaking down large tables into smaller ones by removing all unnecessary or duplicate fields, eliminating redundant data, and making sure that each table represents only one thing.
Why Normalize? • If you don’t normalize, you may run into anomalies (unexpected results) • Deletion anomaly - deleting a record unexpectedly removes a value we wanted to keep • Insertion anomaly - we can’t add a record because we don’t know the value of all of the required fields • Change anomaly - one change must be applied to many records in a single table
Non-normalized Table Typical flat-file table • Contains all data in one record. • Contains repeating fields. • Contains information on more than one thing.
First Normal Form • Eliminate repeating fields within tables. • Create a separate table for each set of related data. • Identify each set of related data with a primary key.
First Normal Form • No Repeating Fields: Class# is now a single field instead of repeating for each class a student takes. • Now multiple records for each student because >1 record is needed to accommodate all classes for each student Non-normalized Table
Second Normal Form • Create separate tables for values that apply to multiple records. • Relate these tables with a foreign key.
Classes: Second Normal Form Students: Eliminate Redundant Data: no longer repeating student name, advisor, and adv-room for each class
Third Normal Form • Eliminate fields that do not depend on the key. • AdvRoom is independent of the key field (Student#). • It does not depend on the student, it depends on the advisor. Students:
Third Normal Form Students: • Eliminate Data Not Dependent on Key: remove AdvName from the Students table and create a separate table describing the advisors with its own primary key (AdvID). • Relate Advisors to Students using AdvID. Advisors:
Deletion Anomalies • If we delete student Rajat, we lose the only records for advisor Goyal
Insertion Anomalies • If we want to add a new student we have to know what class they are in (it’s part of the key)
Change Anomalies • If student Rajat switches to advisor Gupta, we have to change three records instead of just one
Field Properties • Data Type - text, integer, double, boolean, etc. • Field Size - for text fields • Input Mask - a pattern for entering data • Default Values - auto-entered for new records • Validation Rule - limits values entered • Required? - force entry of data
Relationships • One to One • One to Many • Many to One • Many to Many
Many-to-Many Relationship • Exist between two tables when: • for one record in the first table, there can be many corresponding records in the second table and… • for one record in the second table, there can be many corresponding records in the first table
Students Classes ¥ StudentID ClassNumber ¥ Name StudentID Address Subject City InstructorID State Days Zip Time WorkPhone Comments HomePhone Many-to-Many Relationship One student can take many classes, and one class can be taken by many students.
Tips - Queries • Take advantage of action queries to handle batch record operations • Use queries to present calculated values rather than storing the calculated values in your tables • Remember that null never equals another null • Joining two tables on a field that may contain a null value may not give you the results you expect • Searching for duplicate values will not return two records that have a null • Realize that you can link a table to itself
DBMS Selection • Costs • Features and Tools • Underlying model • Portability • DBMS hardware requirements • Organisational requirements
Implementation • The physical realisation of the database and application designs • the detailed model is converted to the appropriate implementation model, the data dictionary is built, the database is populated, application programs are developed and users are trained
Data Conversion and Loading & Testing • Transferring any existing or new data into the new database and converting any existing applications to run on the new database • Finding errors
Database Evaluation • Interviewing and polling users to determine whether any data needs are unmet.
Operational maintenance • preventive maintenance (backup) • corrective maintenance (recovery)1 • adaptive maintenance • regular monitoring & periodical check up
Recent Developments Affecting Database Design and Use • Data Mining (On-Line Analytical Processing) • Drill down from summary data to detailed data • Data Warehouses/Data Marts • Integrates many large databases into one repository • Linking Web Site Applications to Organizational Databases • Users have Web view to organizational database • Improves customer contact and service • Adds security as a concern
Data Warehouses • Data warehouses are used to organize information for quick and effective queries
Data Warehouses and Database • In the data warehouse, data are organized around major subjects • Data in the warehouse are stored as summarized rather than detailed raw data • Data in the data warehouse cover a much longer time frame than in a traditional transaction-oriented database • Data warehouses are organized for fast queries • Data warehouses are usually optimized for answering complex queries, known as OLAP
Data Warehouses and Database • Data warehouses allow for easy access via data-mining software called software • Data warehouses include multiple databases that have been processed so that data are uniformly defined, containing what is referred to as “clean” data • Data warehouses usually contain data from outside sources
Data Mining • Statistical analysis • Decision trees • Neural networks • Fuzzy logic • Clustering
Data Mining Patterns • Data mining patterns that decision makers try to identify include • Associations, patterns that occur together • Sequences, patterns of actions that take place over a period of time • Clustering, patterns that develop among groups of people • Trends, the patterns that are noticed over a period of time
Web Based Databases and XML • Web-based databases are used for sharing data • Extensible markup language (XML) is used to define data used primarily for business data exchange over the Web • An XML document contains only data and the nature of the data • Continue………………………………..