760 likes | 842 Views
Chapter 12:. Database Administration. (With Modifications). VP Finance. •••••. VP Marketing. VP Production. Accounting. EDP Depart. Once upon a time, the typical IS Organization appeared as:. CEO. Why ???. It made perfect sense:.
E N D
Chapter 12: Database Administration (With Modifications)
VP Finance ••••• VP Marketing VP Production Accounting EDP Depart. Once upon a time, the typical IS Organization appeared as: CEO Why ??? • It made perfect sense: • Information Systems were applied where they were most needed • Accounting Systems • Other standardized, routine applications
VP Finance CIO VP Marketing Systems Development VP Production Database Administration End User Services As information became used for more purposes and across more functions, the IS Organization changed: CEO • Information Systems were applied everywhere • Information Systems were recognized as an Organizational Resource
I am the boss!! Without me, You’re Nothing!! Die, you Egomaniacs!! Basic Definitions: • Data Administrator (DA) • A high-level function that is responsible for the overall management of data resources in an organization: • May be the CIO • Database Administrator (DBA) • A technical function that is responsible for the physical database design and such issues as security enforcement and database performance • Database Steward • A administrative function that is responsible for assuring that organizational applications meet the enterprise goals
Data Administration Functions: • Data Policies • Explicit statement of goals, objectives, and targets • Goal: To Support Cost-Effective Use of the computer environment • Objective: To improve sharing of information across organizational units • Target: Linking of all departmental databases within 2 years • Data Procedures • Written Statement of actions to be taken for a certain activity • “In the event of a database failure, the DBA will: • • • • • Data Standards • Explicit statement of conventions to be followed in data usage • “All table names will be prefaced by their physical location” • “All fields containing age, weight, …. Will contain the data type short”
Data Administration Functions: • Planning • Development of the Organization’s IT Strategy • Must correspond to the Organization’s Business Strategy • E.g., Consider the Difference between UTEP and Harvard • Development of the enterprise model • Top-Down versus Bottom-Up Viewpoint • Development of cost/benefit model • Targets must be measurable • Design of the database environment • Centralized, distributed, Decentralized?? How?? • Develop the data administration plan • A lower-level plan for database implementation, maintenance and growth
Data Administration Functions: • Data Analysis • Define and model data requirements • Define and model business rules • Define operational requirements • Maintain corporate data dictionary • Data Conflict Resolution • Who owns the data? • The department, the business subunit, the corporation? • NOT a trivial question. • Procedures MUST be established in advance
The relationship is like a 3-legged stool: If any leg breaks, the stool collapses Data Administration Functions: • Internal Marketing • Information Systems are political entities • The DA must sell their arguments • Recall the Systems Trinity: • The Manager: The person in charge of the functional department • The System Developer: The person developing the system • The User: The person who will use the system • Recall why systems fail: • Lack of Top management support • Lack of user Acceptance • Bad system Design • It is the DA’s job to make sure that ALL stakeholders are happy
Data Administration Functions: • Managing the Data Repository • Used by the DA to manage the information-processing environment • Contain metadata that describes the organization’s data and data processing resources • Replacing Data Dictionaries (simple data-element documentation tools) • Provides information about: • What users must know what • What automated CASE tools that are used to specify and develop information systems • All Applications that access and manipulate data • DBMS that maintain the repository and update system privileges, passwords, and other information
Database Administration Functions: • Selection of Hardware and Software • Difficult to keep abreast of current technology • Difficult to predict future changes • Emphasis on established off-the-shelf products • Managing Data Security and Privacy • Firewalls • Establishment of user privileges • Complicated by use of distributed systems • Managing Data Integrity • Data consistency • Maintaining data relationships
Database Administration Functions: • Database Backup • We must assume that a database will eventually fail • Establishment of procedures • How often should the data be back-up? • What data should be backed-up more frequently? • Who is responsible for the back-ups? • Database Recovery • Application of proven strategies for reinstallation of database after crash
Shared Administration Activities: • Database Design • DA: Logical Design • DBA: • External Model Design (Subschemas) • Physical Design/Construction • Design Integrity Controls • Database Implementation • DBA: • Establish Security Controls • Supervise Database Loading • Specify Test Procedures • Develop Programming Standards • Establish Back-up/Recovery Procedures • Both: • Specify Access Policies • USER TRAINING
Shared Administration Activities: • Operations and maintenance • DBA: • Monitor database performance • Tune and reorganize databases as needed • Enforce standards and procedures • Both: Support Users • Growth and Change • Both: • Implement Change-Control Procedures • Plan for growth and change • Evaluate new technologies
Data Warehouse Administration: • New function due to increased use of data warehousing • (Massively) Integrated decision support databases from various sources • Emphasis on integration and coordination of data and metadata from multiple databases • Specific Functions 1. Support decision-oriented applications 2. Manage data warehouse (exponential) growth 2. Establish service level agreements
Data Dictionaries and Repositories: • Data Dictionary • Documents data and metadata elements of a database • Systems Catalog • System-generated database that describes all database objects • Information Repository • Stores metadata describing data and data processing resources • Information Repository Dictionary System (IRDS) • A software tool managing and controlling access to the Information Repository
Data Dictionaries and Repositories: • Components of the repository system architecture A schema of the repository information Software that manages the repository objects Where repository objects are stored
Database performance tuning: • DBMS Installation • Setting installation parameters • Memory Usage • Setting cache-levels • Choosing background processes • Input/Output Contention • Deciding who gets what and when • How to distribute heavily accessed files • CPU usage • Monitoring of CPU loads • Application Tuning • Modification of SQL code in applications
Possible locations of data security threats Database Security: • Protection of data against accidental or intentional loss, destruction, or misuse • Increased difficulty due to internet access and client-server technologies
Threats to Data Security: • Accidental Losses • Human Error • Software Failure • Hardware Failure • Theft and Fraud • Establishment of firewalls • Monitoring of activities • Be careful of ‘disgruntled’ employees • Improper data access • Loss of Privacy (Personal data) • Loss of Confidentiality (Corporate data)
Threats to Data Security: • Loss of data integrity • Data may be compromised due to database crashes • Improper recovery can be costly • Loss of Availability • Through Sabotage/Data Misplacement • Viruses/Worms
Managing Data Security: • Data Integrity Controls: • Default Values Entered • Minimization of user data entry • Domain Restrictions • Only certain values can be entered • Probability Checks • Echoing of input to user for confirmation • Self-checking routines • E.g., Check-digits
Managing Data Security: • Views and Subschemas: • Views are not only useful, but can also restrict user access to data • Recall our Physician/Patient Database View: CREATE VIEW drugs_given AS SELECT physname, patient.name, illness.name, prescription.drugcode FROM physician, patient, treatment, illness, prescription WHERE physician.physid = patient.physid AND patient.patid = treatment.patid AND treatment.illcode = illness.illcode AND treatment.drugcode = prescription.drugcode ORDER BY physname; • The user might be restricted from using the view • The user might be restricted from seeing the view’s code (And hence seeing the physical relationships)
Authorization Matrix SQL Privileges Subject Tables Object Tables Managing Data Security: • Authorization Rules: • Rules to Restrict Access
Managing Data Security: • Statistical Databases: The Conceptual Model • Only the datasets with common attributes and their statistics are made available • No data manipulation language is allowed to merge and intersect populations 2. Query Restriction • Query-set Size controls (large only) • Number of over-lapping entities among successive queries • Auditing User Queries • Clustering individual entities in mutually exclusive subsets
Managing Data Security: • Statistical Databases: 3. Output Perturbation • Queries made on actual data • Output ‘perturbed’ so that statistical characteristics remain but individual data is ‘non-sensical’ 4. Data Perturbation • The entire database is first ‘perturbed’ • All statistical relations are maintained in the perturbed dataset • User allowed to make all queries on the perturbed data set (individual data entities show no relationship to the real data)
Managing Data Security: • Authentication Schemes: • Problem: Passwords are flawed • Users Share them • Sometimes easy to determine • User write them down and they get copied • Automatic logon scripts make it unnecessary to enter them manually • Unencrypted passwords travel the internet • Goal: Verify User Identity
Managing Data Security: • Authentication Schemes: • Potential Solutions: • Randomly Assigned Passwords • Forced Password Changes • Secondary Passwords • Biometric Devices • Thumbprint • Hand Geometry • Retinal Scan • Voice Recognition • Facial Recognition • Future: • Body Odor • Multi-attribute
Managing Data Security: • Encryption (“The Second Oldest Profession”): • The earliest recorded use of cryptography is 1900 BC in Egypt. • The scribes who sketched the hieroglyphs telling the story of the life of Khnumhotep II in the town of Menet Khufu used a substitution cipher to encrypt the names and titles of individuals in the story.
“Now is the time for all good people to come to the aid…” “KLT FP QEB QFJB CLO XII DLLA MBLMIB QI ZLJB QI QEB XFA …” Managing Data Security: • Encryption (“The Second Oldest Profession”): • Substitution Ciphers • The Original symbols are substituted for other symbols • Plain Text: ABCDEFGHIJKLMNOPQRSTUVWXYZ Cipher Text: XYZABCDEFGHIJKLMNOPQRSTUVW
Phil Zimmerman Managing Data Security: • Encryption: • Public/Private Keys • Pretty Good Privacy (PGP) • Should the Government have the right to a “Master Key”? • Target of 3-year investigation that he violated export laws
Database Recovery: • Mechanisms for restoring a database quickly and accurately after loss of damage • Recovery Facilities/Components: 1. Back-up Facilities • Periodic back-up copies of the entire database 2. Journalizing Facilities • To maintain audit trails of transactions and logs of database changes 3. Checkpoint Facilities • When the DBMS temporarily halts all activities and synchronizes all files and journals 4. Recovery Manager • A DBMS component that restores the database to a correct condition and restarts processing activities
Current Database Transaction Log DB Change Log Database Backup Database Recovery: • Ongoing Facilities: Backup Facility: Automatic periodic duplication of entire Database • Before and after images of records that have been changed DBMS Journalizing Facility: Logging of Transactions and Database Changes • Logging of every transaction along with timestamps
DBMS Current Database Transaction Log DB Change Log Database Backup Database Recovery: • Periodic/On Demand Facilities: Checkpoint Facility: The processing is stopped and database synchronized Recovery Manager: Upon crash, the database is rebuilt using the Database backup, DB Change log, and Transaction Log
Database Recovery: • Back-up Facilities: • How long between backup (hourly, daily, weekly) is a policy determined by the DA • Frequent back-ups increase reliability BUT each takes some time • Back-ups should be stored off-site • Approaches: • Cold Backup • Database shut down during back-up • More secure BUT transactions delayed • Hot Backup • Selected portion of database is shut down during back-up • Not as disruptive BUT more complicated
Database Recovery: • Journalizing Facilities: • Every transaction is stored to the transaction log as well as the database • Transaction Log • Record of essential data for each transaction processed against the database • Database Change Log • Before-Images of records (before transaction) • After-Images of records (After modification) • Needed for: • Transaction Audits • Database Recovery
Current Database Transaction Log DB Change Log Database Recovery: (Recap) • Journalizing Facilities: DBMS Transaction Effect of transaction added to current database Copy of record affected by transaction stored Copy of transaction stored (In case of database failure) • Before transaction • After transaction
Current Database Transaction Log Database Recovery: • Checkpoint Facilities: • At some specified point in time (by the DA) the DBMS refuses all transactions (The system is in a Quiet state) • The database and the transaction logs are synchronized DBMS Transaction
Database Recovery: • Recovery Manager: • Module of DBMS that restores the database to a ‘correct’ position when a failure occurs Why do databases Fail? • Aborted Transactions • The transaction terminates abnormally due to human error, input of invalid data, loss of transmission, hardware failure, deadlock, etc. • Incorrect Data • Incorrect, but valid, data entered • E.g., incorrect account number, customer payment • System Failure • E.g., Power loss, operator error, systems software failure • The database is NOT damaged • Database Destruction • The database is lost, destroyed, or can not be read • Often due to disk failure
Database Recovery: • Recovery and Restart Procedures • Switch • 2 mirror-image databases maintained • All transactions stored/updated in both databases • Upon failure, the database is ‘switched’ for the mirror image • Generally stored across distributed databases • Fastest/most secure • Expensive • Does not protect against power failures or catastrophes
New Database Transaction Log Database Backup Database Recovery: • Recovery and Restart Procedures • Restore/Run • The previous transactions are reprocessed (up to the point of the failure) against the backup copy of the database • The most recent copy of the database is mounted and the latest transactions rerun • Simple/Cheap • May take considerable time to reprocess • Resequencing errors may occur
DB Change Log (Using only Before Images) New Database Database Backup Current Database Database Recovery: • Recovery and Restart Procedures • Backward Recovery (Rollback) • Unwanted changes are undone through the use of Before images (in the Database Change Log)
DB Change Log (Using only After Images) New Database Database Backup Database Backup Database Recovery: • Recovery and Restart Procedures • Forward Recovery (Rollforward) • After images (in the Database Change Log) are applied to the Database Backup
Database Recovery: What Strategy should be applied? • That depends on the type of failure • Aborted Transactions • Preferred: Rollback • Alternative: Rollforward (To a state just prior to the abort) • Incorrect Data • Preferred: First correct data (if possible) then rollback and rollforward with corrected data • Alternative: Compensating transactions (debit then re-credit) • System Failure (Database intact) • Preferred: Switch • Alternatives: (1) Rollback (2) Restart from Checkpoint • Database Destruction • Preferred: Suicide (unless you can Switch) • Alternatives: (1) Rollforward (2) Reprocess transactions
Transaction Management: • Transaction: • A logical unit of work that must be either entirely completed or aborted • No intermediate states are acceptable. • Most real-world database transactions are formed by two or more database requests. • A database request is the equivalent of a single SQL statement in an application program or transaction • A transaction that changes the contents of the database must alter the database from one consistent database state to another. • To ensure consistency of the database, every transaction must begin with the database in a known consistent state.
Transaction Management: • Transaction Properties: • Atomicity • A transaction is a SINGLE (indivisible), invisible, logical unit of work • A database request and ALL related operations MUST be completed • If ALL requirements are not, the transaction is aborted • Durability • A transaction must be PERMANENT • When a transaction is completed, it has reached (and must remain) in a permanent state • Once in a permanent state, it can not be lost • Even if the database fails, the transaction remains
Transaction Management: • Transaction Properties: • Serializability • Each concurrent transaction is treated as thought they were received and executed in a serial (one after the other) fashion • This is true even in a multi-user or distributed database • If transactions do occur simultaneously, one is assigned precedence over the other • Isolation • Data/Information provided/updated by a transaction can not be used by another (later transaction) until the first transaction is complete (i.e., accepted)
03 01 part 03 02 01 02 part Paper Paper Erasers Erasers Pens Pens descrip descrip 276 1000 onhand 500 1000 475 onhand 276 Table Inventory Table Inventory Transaction Management: • Suppose that we wish to withdraw items from inventory Bye! IF we sell 25 Erasers: 1. Find the part Number 2. Read the number onhand 3. If the number onhand is < 25, ABORT the transaction 4. If the number onhand is >= 25, calculate the new number onhand quantity 500 - 25 = 475 5. Enter (update) the new number onhand quantity (The DBMS will update the Transaction log and Database Change Log)
part 03 02 03 part 02 01 01 Pens Paper Erasers descrip descrip Erasers Paper Pens 1000 276 onhand 276 500 475 onhand 1000 Table Inventory Table Inventory Transaction Management: • The SQL Commands needed are (sort-of) straight-forward: SELECTonhand FROMinventory WHEREpart = 02; UPDATEinventory SETonhand = 475 ; OR Maybe SELECTonhand FROMinventory WHEREdescrip = ‘Erasers’; UPDATEinventory SETonhand = onhand - 25 ; COMMIT;
Transaction Management: Why did you saysort-of ?? • Notice we didn’t check to see if there were 25 Erasers available • If there were not, we could not complete the transaction How do we do that ?? • That is why we are going to learn SQL/PL (Structured QueryLanguage/Programming Language) • Stay Tuned
Transaction Management: • Of course, even simple transactions are sometimes problematic: • Suppose that Dr. Mary Smith (physid: ‘123456789’) Transfers all her patients to Dr. Von Bulow (physid: ‘374659201’) • The command: UPDATEpatient SETpatient.physid = ‘374659201’ WHEREpatient.physid = ‘123456789’; • Will NOT be accepted unless we first enter the command: INSERT INTOpatient VALUES(‘374659201’, ‘Von Bulow, Klaus’, ……);