440 likes | 470 Views
Phase 3. Systems Design. Chapter 6 – Data Design. Phase Description. Systems Design is the third of five phases in the systems development life cycle (SDLC)
E N D
Phase 3. Systems Design Chapter 6 – Data Design
Phase Description • Systems Design is the third of five phases in the systems development life cycle (SDLC) • Now you are ready to begin the physical design of the system that will meet the specifications described in the system requirements document • Systems design tasks include data design, user interface design, and system architecture
Data Design Concepts • Before constructing an information system, a systems analyst must understand basic design concepts, including data structures and the characteristics of file-oriented and database systems, including Web-based database design
Data Design Concepts • Data Structures • A framework for organizing and storing data => consists of one/more files/tables • A file or table contains data about people, places or events that interact with the system • File-oriented system • File processing • Database system Figure 6-3 Figure6-2
Data Design Concepts • File processing • Database system
Data Design Concepts • Overview of File Processing • Although less common today, file processing can be more efficient and cost less than a DBMS in certain situations – E.g. large volumes of structured data such as process records in a sequence Figure6-4
Data Design Concepts • Overview of File Processing • Potential problems • Data redundancy • Data integrity • Rigid data structure
Data Design Concepts • Overview of File Processing • Uses various types of files • Master file – relatively permanent data about an entity • Table file – reference data (e.g. tax table, rate tables) • Transaction file – day-to-day business records. Input file that updates master file. • Work file/scratch file – temporary file for a single task • Security file – for backup and recovery purposes • History file – for historical or archiving purposes
Data Design Concepts • Overview of Database Systems • A properly design database system offers a solution to the problems of file processing A database might serve 5 separate systems in a database environment Figure6-5
Data Design Concepts • Overview of Database Systems • A database management system (DBMS) is a collection of tools, features, and interfaces that enables users to add, update, manage, access, and analyze the contents of a database • The main advantage of a DBMS is that it offers timely, interactive, and flexible data access
Data Design Concepts • Overview of Database Systems • Advantages • Scalability – expandable/modifiable/downsizable • Better support for client/server systems • Economy of scale – efficiency of high-volume processing on larger computers • Flexible data sharing • Enterprise-wide application – database administrator (DBA)
Data Design Concepts • Overview of Database Systems • Advantages • Stronger standards – for data names, formats, documentation, etc • Controlled redundancy • Better security – define authorization procedures • Increased programmer productivity • Data independence
DBMS Components • Interface • DML – controls DB operations (storing, retrieving, updating, deleting) • Schema – Defines the DB (fields, tables, relationships) • Physical Data Repository Figure6-6
Web-Based Database Design • Characteristics of Web-Based Design • In a Web-based design, the Internet serves as the front end, or interface for the database management system. Internet technology provides enormous power and flexibility • Web-based systems are popular because they offer ease of access, cost-effectiveness, and worldwide connectivity Figure6-8
Web-Based Database Design • Connecting a Database to the Web • Database must be connected to the Internet or intranet • Middleware • Macromedia’s ColdFusion Figure6-9
Data Design Terminology • Definitions • Entity • Table or file • Field • Attribute • Common field • Record • Tuple Figure6-11
Data Design Terminology • Key Fields • Primary key – unique field or combination of more than one field (Combination key, Composite key, Concatenated key, Multi-valued key) • Candidate key – any field(s) that could be primary key • Nonkey field – not primary or candidate • Foreign key – a field in one table that matches a primary key. • Secondary key – field that can be used to access/retrieve records. Not unique
Data Design Terminology • Referential Integrity • Validity checks can help avoid data input errors • In a relational database, referential integrity means that a foreign key value cannot be entered in one table unless it matches an existing primary key in another table Figure6-13
Data Relationships • A relationship is a logical link between entities based on how they interact • Entity-Relationship Diagrams • One-to-one relationship (1:1) • One-to-many relationship (1:M) • Many-to-many relationship (M:N)
Data Relationships • Entity-Relationship Diagrams • Cardinality • Cardinality notation • Crow’s foot notation • Unified Modeling Language (UML) Figure6-15 Figure6-14 Figure6-17 Figure6-16
Normalization • Normalization • Table design • Involves four stages: unnormalized design, first normal form, second normal form, and third normal form • Most business-related databases must be designed in third normal form
Normalization • Standard Notation Format • Designing tables is easier if you use a standard notation formatto show a table’s structure, fields, and primary key Example: NAME (FIELD 1, FIELD 2, FIELD 3)
Normalization • Repeating Groups and Unnormalized Design • Repeating group • Often occur in manual documents prepared by users • Unnormalized design
Normalization • Repeating Groups and Unnormalized Design
Normalization • First Normal Form • A table is in first normal form (1NF) if it does not contain a repeating group • To convert, you must expand the table’s primary key to include the primary key of the repeating group
Normalization • Second Normal Form • To understand second normal form (2NF), you must understand the concept of functional dependence • Field X is functionally dependenton field Y if the value of field X depends on the value of field Y • Note: 1NF with a single field primary key is automatically in 2NF
Normalization • Second Normal Form • A standard process exists for converting a table from 1NF to 2NF • Create and name a separate table for each field in the existing primary key • Create a new table for each possible combination of the original primary key fields • Study the three tables and place each field with its appropriate primary key Figure6-21
Normalization • Second Normal Form Figure6-21
Normalization • Third Normal Form • 3NF design avoids redundancy and data integrity problems that still can exist in 2NF designs • A table design is in third normal form (3NF) if it is in 2NF and if no nonkey field is dependent on another nonkey field Figure6-22
Normalization • Third Normal Form • To convert the table to 3NF, you must remove all fields from the 2NF table that depend on another nonkey field and place them in a new table that uses the nonkey field as a primary key Figure6-23 Figure6-24 Figure6-25
Steps in Database Design • Create the initial ERD • Assign all data elements to entities • Create 3NF designs for all tables, taking care to identify all primary, secondary, and foreign keys • Verify all data dictionary entries • After creating your final ERD and normalized table designs, you can transform them into a database
Database Models • Hierarchical and Network Databases • In a hierarchical database, data is organized like a family tree or organization chart, with branches representing parent records and child records • A network databaseresembles a hierarchical design, but provides somewhat more flexibility
Database Models • Relational Databases • The relational model was introduced during the 1970s and became popular because it was flexible and powerful • Because all the tables are linked, a user can request data that meets specific conditions • New entities and attributes can be added at any time without restructuring the entire database Figure6-34 Figure6-35
Database Models • Object-Oriented Databases • Many systems developers are using object-oriented database (OODB) design as a natural extension of the object-oriented analysis process • Object data standard • Object Database Management Group (ODMG) • Each object has a unique object identifier Figure6-36 Figure6-37
Data Storage • Data Warehousing • Data warehouse - dimensions • Without a data warehouse it would be difficult for a user to extract data that spans several information systems and time frames • Allows users to retrieve and analyze the data easily Figure6-38
Data Storage • Data Mining • Works best when you have clear, measurable goals • Increase average pages viewed per session. • Increase number of referred customers • Reduce clicks to close • Increase checkouts per visit • Increase average profit per checkout Figure6-39
Data Storage • Logical and Physical Storage • Logical storage • As seen through a user’s eyes • Characters • Date element or data item • Logical record • Physical storage • Hardware-related • Physical record or block • Blocking factor Figure6-40
Data Storage • Data Storage Formats • Binary digits • Bit • Byte • EBCDIC and ASCII • Unicode Figure6-41
Data Storage • Data Storage Formats • Binary • Binary storage format • Integer format • Long integer format • Other binary formats exist for efficient storage of exceedingly long numbers
Data Storage • Selecting a Data Storage Format • In many cases, a user can select a specific data storage format • For example, when using Microsoft Office, you can store documents, spreadsheets, and databases in Unicode-compatible form by using the font called Arial Unicode MS • Best answer is it depends on the situation
Data Storage • Date Fields • Most date formats now are based on the model established by the International Organization for Standardization (ISO) • Can be sorted easily and used in comparisons • Absolute date • Best method depends on how the specific date will be printed, displayed or used in a calculation Figure6-42
Data Control • File and database control must include all measures necessary to ensure that data storage is correct, complete, and secure • A well-designed DBMS must provide built-in control and security features, including subschemas, passwords, encryption, audit trail files, and backup and recovery procedures to maintain data Figure6-43
Data Control • User ID • Password • Backup • Recovery procedures • Audit log files • Audit fields