380 likes | 395 Views
Topics in Database Administration. What is database administration? What are the tasks involved in establishing, creating, implementing and maintaining a database? How is the quality of data addressed? What are the tasks involved in: Database transaction integrity? Database backup/recovery?
E N D
Topics in Database Administration • What is database administration? • What are the tasks involved in establishing, creating, implementing and maintaining a database? • How is the quality of data addressed? • What are the tasks involved in: • Database transaction integrity? • Database backup/recovery? • Database security? • DBMS efficiency?
Tasks in “data” administration • Planning. • Assist in the development of the IS architecture. • Develop enterprise data model. • Assist in database design for application development. • Policies and procedures. • Establish metadata. • Assesses and controls the quality of the data. • Establish policies towards data ownership. • Internal marketing. • Explain the concept of data as a shared resource. • Resolve conflicts between organizational areas.
More detail: Data Resource Management Tasks • Assess organizational data strategy. • Evaluate data level of data integration. • Evaluate quality of data. • Create and maintain metadata. • Create and maintain data models. • Identify stakeholders and ensure ongoing participation in data management. • Evaluate privacy of data. Determine ethical use of data collection and access. • Make effective use of data resources through business intelligence capabilities.
Tasks in “database” administration • Planning. • Guide the selection of hardware and DBMS software. • Choose appropriate tool set for database administration, development and maintenance. • Procedures and policies. • Establish security policies. • Access. • Update. • Establish application development policies. • Use of views. • Use of indexes. • Establish backup and recovery policies. • Enforce policies.
More detail: Technical Management Tasks • Install and update the DBMS. • Create the database(s). • Establish tablespace. • Create users. • Manage data dictionary. • Populate the database(s). • Ensure data integrity. • Triggers and stored procedures. • Referential integrity. • Additional data constructs implemented through SQL. • Perform data backup and recovery. • Monitor and tune performance.
Three of the critical technical tasks of a DBA • Database backup and recovery. • Anticipating the event of a database failure. • Establishing & enforcing database backup & recovery procedures. • Database security and integrity. • Defining security requirements based on data and application needs. • Establishing and enforcing security procedures. • Establishing and enforcing data integrity procedures. • DBMS optimization. • Tracking current response time. • Modifying DBMS parameters to improve response time.
Database backup and recovery • What is backup and recovery? • Backup: A method of storing data from a database in a format that can be used to rebuild the database if necessary. • Recovery: Mechanisms for restoring a database quickly and accurately after loss or damage. • Why have backup and recovery? • Human error. • Hardware failure. • Incorrect or invalid data. • Program errors. • Viruses. • Natural catastrophes.
Backup and recovery are based on transactions • A transaction is one or more database actions (SQL statements) that are treated as a single unit of work. • If the transaction is successful, then the transaction is committed. • If the transaction is not successful, then the transaction is rolled back or aborted. Imagine a database with this structure
Accepting an order for a product INSERT INTO order_tbl VALUES (123,’27-apr-06’, 765, ‘net30’); INSERT INTO orderline_tbl VALUES (123,6812,10, 34.99); UPDATE product_tbl SET qoh = qoh - 10 WHERE prod_no = 6812;
DBMS’s have methods to control transactions • Databases that support transactions provide specific commands for starting, committing, and rolling back transactions. • Begin transaction. • End transaction. • Commit. • Rollback. • Autocommit. • Transaction throughput: The number of transactions processed per time interval. This is a measure of transaction processing performance.
Transactions have properties • Atomicity: A transaction is an indivisible unit of work. • Almost all languages that interface with a DBMS have a way to signify the start and end of a transaction. • Within the start and end are one or more SQL commands. • Consistency: The transaction moves a database from one state of consistency, through inconsistency to another state of consistency. • Isolation: A transaction executes in isolation from other transactions. • This is also referred to as the “serializability” of transactions. • A transaction can affect another transaction, so the transaction must complete as though it is isolated in order to be accurate. • Durability: Once a transaction is committed, its effects on the database are durable, or permanent. No subsequent actions or failures to the database can cause that transaction to be lost.
Transaction boundary decision • Division of work into transactions. • Objectives: • Minimize transaction duration. • Ensure transaction isolation. • Constraint: enforcement of important integrity constraints.
Transaction boundary choices. • One transaction for the entire form? • One transaction for the main form and one transaction for all subform records? • One transaction for the main form and separate transactions for each subform record?
Backup is conducted in 3 processes • Backup: A DBMS software utility provides a way to do a complete, full or incremental backup of the database in a consistent state. • Complete: entire database. • Full: all rows of specified tables. • Incremental: rows that have changed since the last full backup. • Journalize: A DBMS software utility provides an audit trail of changes to the database. • Transaction log: contains all data used to process changes against the database. • Database change log: contains a before-image and an after-image of each row modified by a database transaction. • Checkpoint: A DBMS software utility that periodically suspends all transaction processing and synchronizes files within the database. • Some databases, such as Oracle, do not actually halt processing. They simply write checkpoint information to files.
Recovery methods • A DBMS has a utility to recover the database. Usually referred to as the Recovery Manager. • The method of recovery depends on the type of failure. • Recovery Manager usually has the following options: • Switch: Switches to a replica of the database on a different storage device. • Requires that a mirror image of the database is stored. • Can be expensive. • Assumption is a storage failure, not a failure in transaction integrity, occurred. • Restore/Rerun: Reprocesses the transactions for a given time period against a correct version of the database. • Assumption is that a failure in transaction integrity has occurred. • Can be very time-consuming.
Two common methods of restore/rerun • Backward recovery. Also called “rollback” recovery. Used to undo unwanted changes to the database. • Imagine that the current database is inaccurate. • Before-images are applied to the current database to return it to a prior state of consistency. • Used to back out changes that are unwanted. • Forward recovery. Also called “rollforward” recovery. Used to recover accurate transactions and apply them to the database. • Imagine that the current database is inaccurate. This database must be replaced with a prior, consistent version of the database before forward recovery can begin. • After-images are applied to a past version of the database. Does not require that all transactions are applied - just takes the most recent after-images.
Issues in database backup and recovery • Cost. • Media. • Computer overhead (processor, memory, disk) to create journalizing files, control files, checkpoint files, etc. • Personnel to supervise and tune. • Time. • Can result in regularly scheduled downtime. • Can make the system slower.
Potential problems with shared databases • Concurrency control is the process of managing concurrent operations against a database in order to maintain data integrity. • Potential problems with shared databases are: • Lost Update. • Uncommitted Dependency. “Dirty Read” • Incorrect Summary.
Lost update SR: Seats Remaining
Methods of concurrency control • Scheduler. The scheduler establishes the order in which the operations within concurrent transactions are executed. • The scheduler interleaves the execution of database operations to ensure serializability. • Some schedulers have the ability to analyze transaction content. • Locking: Fundamental tool of concurrency control. • Obtain lock before accessing an item. • Wait if a conflicting lock is held. • Shared lock: conflicts with exclusive locks • Exclusive lock: conflicts with all other kinds of locks • Concurrency control manager maintains the lock table
More about locking • Locking. A lock guarantees exclusive use of a data item to a current transaction. • Locking can be performed programmatically or left to the DBMS. • Granularity of locking depends on the DBMS. • Database level. • Table level. • Page level. • Row level. • Column level. • Can have shared or exclusive locks.
Problem in locking • Deadlock: An impasse that occurs when two or more transactions have locked a common resource and each is waiting for the other to finish.
Database security Database Security: Protection of the data against accidental or intentional loss, destruction, or misuse. Threats to database security include the list below. • Accidental losses attributable to: • Human error. • Software failure. • Hardware failure. • Theft and fraud. • Improper data access: • Loss of privacy (personal data). • Loss of confidentiality (corporate data). • Loss of data integrity. • Loss of availability.
DBMS security features • Views (frequently referred to as subschemas). • Authorization rules. • Controls incorporated in the DBMS. • Restrict access to specific data. • Restrict actions that can be taken. • User-defined procedures. • Trigger an authorization procedure which asks additional identification questions. • Written in a standard programming language or proprietary language. • Encryption. • Authentication schemes. • Biometric devices.
Oracle’s approach to security • Multi-user database systems, such as Oracle, include security features that control how a database is accessed and used. For example, security mechanisms: • Prevent unauthorized database access. • Prevent unauthorized access to schema objects. • Control system resource usage (such as CPU time or disk usage). • Audit user actions. • Associated with each database user is a schema. • A schema is a logical collection of database objects (tables, views, sequences, synonyms, indexes, clusters, procedures, functions, packages, and database links). • By default, each database user creates and has access to all objects in the corresponding schema.
Oracle’s security mechanisms • The Oracle server provides discretionary access control, which is a means of restricting access to information based on privileges. • The appropriate privilege must be assigned to a user in order for that user to access a schema object. • Appropriately privileged users can grant other users privileges at their discretion; for this reason, this type of security is called "discretionary".
DBMS query optimization • DBMS’s are differentiated by their query optimizers. • A query optimizer is a component of a DBMS. • You do not have the choice of how queries are implemented on the physical database, the query optimizer assumes this responsibility. • You can sometimes “help” and thus possibly improve the optimization process.
Query optimizer methods • Rule based. • Looks at syntax. • Parses query and executes in the order written according to the rules pre-established by the person who wrote the query optimizer. • Cost based. • Looks at syntax. • Looks at statistical data about the database. • Parses query and executes based on the written and the information about the current and historical data of the database. • Choose. • Uses the rule based method for tables which have not been used/analyzed in the past. • Uses the cost based method for tables which have been previously analyzed.
Helping the query optimizer • Know which type of optimizer is used by the DBMS. • Order the query in a way that eliminates more rows with the first “where” condition. • Provide comments or suggestions. • Use joins instead of nested or correlated sub-queries. • Use single table conditions instead of joins. • Use queries that perform grouping with aggregate calculations only when really necessary. • Avoid using the “<>” or not equal condition.
Becoming a DBA • Experience. • Application programming with a database. • Some DBA tasks. • Systems programming with a database. • Education. • Training classes with a specific DBMS. • For example, Oracle offers a ten-class sequence for DBA’s. • Master’s degree? • Certification. • Specific to DBMS. • Offered by DBMS vendor. • Formal programs offered for Oracle and SQL Server; informal programs for other DBMS types.