590 likes | 607 Views
Learn about database security measures, threats and countermeasures, and the management of transactions and concurrency controls.
E N D
Week 14November 28 Database Security Transaction Management and Concurrency Controls Distributed Database Data Warehouse, Data Marts and MMDBMS OODBMS
Database Security • The protection of the database against threats using both technical and administrative controls • Database security aims to minimize losses caused by anticipated events in a cost-effective manner without unduly constraining the users • Threats: • Theft and fraud • Loss of confidentiality • Loss of privacy • Loss of integrity • Loss of availability Organization Policy Controls (objectives for system) Database Security Organizational Resource
Threats • Any situation or event, whether intentional or unintentional, that will adversely affect a system and consequently the organization. • Tangible losses (hardware, software, data) • Intangible losses (credibility, confidentiality) Countermeasures and Contingency Plans
Threats and Countermeasures • Initiate countermeasures to overcome threats • Consider the types of threat and their impact on the organization • Cost-effectiveness • Frequency • Severity
Threats and Countermeasures • Objective is to achieve a balance between a reasonable secure operation, which does not unduly hinder users, and the costs of maintaining it. • Risks are independent of the countermeasures Secured Operations Costs Countermeasures Risks
Countermeasures • Computer-based vs. Non-computer-based Implemented through the operating system and/or DBMS Management policies and procedures
Computer-Based Controls • Computer-based controls • Authorization • Views • Backup (and recovery) • Journaling • Checkpointing • Integrity • Encryption • Associated procedures
Computer-based Control: Authorization or Access Controls • Granting privileges which enables users and applications to legitimately have access to a system or object (table, view, application, procedure, etc.) • Authentication ensures the user is who s/he claims her/himself to be • Layers of access or penetration into a system • Ownership and privileges • Access to database(s) • Manipulation and definition of data
Authorization and Authentication O/S User Operating System DBMS User DBMS Table Database Database Grants Table Objects and Privileges
Computer-based Control: Views • Virtual relation to support a user’s particular needs • Restricts access and actions • Created upon demand of the user Base Relations Virtual relation
Computer-based Control:Tables SQL> grant select, update, delete on comp_products to scott; Grant succeeded. SQL> revoke delete on comp_products from scott; Revoke succeeded. Privilege User name Table GRANT privilege ON table TO user; REVOKE privilege ON table FROM user;
Transaction Management What is a “transaction?” • An action or series of actions, carried out by a single user or application program which reads or updates (changes) the contents of the database • Retrievals • Updates (modifications) • Insertions • Deletions
What is a “transaction?” • Characteristics • Atomicity (entirety of action) • Consistency (from one consistent state to another) • Isolation (independent of other transactions) • Durability (permanence)
Transaction Management • Provide a means for maintaining the integrity of the database • Importance: • In a multi-user environment, the order of transactions actions must be maintained through concurrency control • In the event of a failure or destruction of data, data must be reconstructed through database recover Data Integrity
Concurrency Control • The process of managing simultaneous operations on the database without having them interfere with one another • Potential problems: • Lost update problem (one update overrides another) • Uncommitted dependency problem (intermediate results of one update viewed by another before it has been committed) • Inconsistent analysis problem (data retrieved by one user updated by another before the end of the retrievals) • Nonrepeatable read (retrieval results cannot be repeated)
Concurrency Control • Serializability - scheduling transactions to maximize concurrency and parallelism, yet preventing them from interfering with one another and maintaining consistency • Serial schedule - non-interleaved transactions T1 T2 T3 ... Tn • Nonserial schedule - interleaved transactions T1T3 ... TnT2 T4 T5 T6 ... Tn+1 Scheduler must resolve conflict Conflict
Concurrency Control: Locking and Timestamping • Locking • Prevents simultaneous access or update of the same data • Timestamping • Ordering (prioritizing) transactions by their timestamp
Concurrency Controls: Locking • Locking methods – lock denies other users from accessing the data while user accessing them • Shared vs. exclusive lock • “Deadly embrace” or deadlock – when a user has a lock on one data item and awaits another, and a second user awaits the data item locked by the first user and has a lock on the data item sought by the first Account balance (locked) Credit limit (locked) Credit limit (waiting) Account balance (waiting)
Concurrency Controls: Timestamping • All transactions assigned a timestamp (unique identifier that indicates its relative starting time) • Smaller (older) timestamps are given priority • Conflicts resolved through rollbacks and restarts • Transaction rolled back (to its beginning) and restarted (reassigned a newer timestamp)
Timestamping • Problems • A younger transaction writes a data item before an older transaction accesses it • An older transaction needs to write a data item already accessed by a younger transaction • An older transaction needs to write a data item already written by a younger transaction • Resolved through roll backs and restarts
DDBMS DDBMS DDBMS Global Data Dictionary Global Data Dictionary Global Data Dictionary Local DBMS Local DBMS Local DBMS Database Database Database Site 1 Site 1 Site 1 Distributed Databases • Distributed database:A logically interrelated collection of shared data, physically distributed over a computer network Network Transparency Geographically Distributed Site 1 Site 3 DDBMS – software system that permits the management of the distributed database and makes the distribution transparent to the user. Heterogeneous vs. Homogenous Site 2
DDBMS Architecture Data Communications DDBMS DDBMS Global external schema Global conceptual schema Global Data Dictionary Global Data Dictionary Local DBMS Local DBMS Local external schema Local conceptual schema Local internal schema Database Database Site 1 Site 2
Data Allocation • Centralized • Partitioned (fragmented) • Vertical (by columns) • Horizontal (by rows) • Mixed (by columns and rows) • Complete replication • Selective replication (hybrid) • Combination of partitioning, replication and centralization Distributed
Advantages to Distributing • Reflects organizational (distributed) structure • Improved shareability and local autonomy • Improved availability • Improved reliability • Improved performance • Economics • Modular growth • Integration • Remaining competitive
Disadvantages to Distributing • Complexity • Cost • Security • Integrity control more difficult • Lack of standards • Lack of experience • Database design more complex
Considerations for Fragmenting • Usage • Fragmenting by subsets • Efficiency • Store data where they are used most frequently • Parallelism • Parallel execution of a query (divided into subqueries) simultaneously • Security • Store data away from site that do not require them
Disadvantages to Fragmenting • Performance • Increased retrieval time • Integrity • Difficult to maintain across multiple sites • What happens when two users need to update the same data?
Transparency • Distribution • Users perceive the database as a single logical entity • Fragmentation transparency • Location transparency • Replication transparency • Local mapping transparency • Transaction • All distributed transactions maintain the distributed database’s integrity and consistency • Concurrency transparency • Failure transparency The user should NOT be aware of where the data reside or are allocated
Transparency • Performance • DDBMS must perform as if it were a centralized DBMS • DBMS • Hides the knowledge that the local DBMS may be different (applicable to heterogeneous DDBMS)
Robert Anthony’s Taxonomy of Managerial Information Requirements Aggregate Infrequent Quite old Information Requirements External Future Wide Low StrategicPlanning Management Control Currency Frequency of Use Time Horizon Source Scope Required Accuracy Level of Aggregation Operational Control High Information requirements change between levels of management Internal Detailed Historical Well defined Very frequent Highly current
Robert Anthony’s Taxonomy of Managerial Information Requirements Aggregate Infrequent Quite old Information Requirements External Future Wide Low StrategicPlanning Management Control Currency Frequency of Use Time Horizon Source Scope Required Accuracy Level of Aggregation Transaction-based databases Operational Control • Relational (Oracle, DB2, SQL7 • Hierarchical (IMS) • Network (Image) High Internal Detailed Historical Well defined Very frequent Highly current
Data Warehousing • A subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management’s decision-making process. Ad hoc queries External data Tools Decision-making Time-variant Internal data (within the organization) • Report generators • EIS • OLAP • Data mining Information Summarized data Competitive or Strategic Advantage
Data Warehousing Characteristics • Subject-oriented - Organized around the major business subjects or entities, such as customers, order or products • Integrated - Operational (internal) data and external data are integrated into the data warehouse to provide a single unified database for decision support • Time-variant - Use time stamps to represent historical data. Data warehouses consist of a long series of snapshots, each of which represents operational data captured at a point in time • Nonvolatile - New data are appended, rather than replaced, so that historical data are preserved
Data Warehouse Warehouse Manager External sources Outflow Highly summarized data Inflow Upflow Outflow Query Manager Meta-flow Load Manager Lightly summarized data Meta-data Outflow Detailed data Warehouse Manager End-user tools Downflow Archive/backup data
Data Warehouse Data Mart • A subset of a data warehouse that supports the requirements of a particular department or business function • End-user tools: • Reporting • EIS • OLAP • Data mining Summarized data Relational database Oracle9i Extraction Multi-dimension database Summarized data Oracle Express
Implementation • Build data warehouse first • Build data marts first • Build both in parallel Data Warehouse Architecture Developed and implemented in parallel Data Marts
Multi-dimensional Database (MDDBMS) Products Sales medium (e.g., retail, Internet, mail order) Geographic locations Time is an implied dimension
Multi-dimensional Database (MDDBMS) For example… Computers Printers Products Scanners Retail Mail Cameras Internet Sales medium Oregon Nevada California Geographic locations
Multi-dimensional Database (MDDBMS)Working with Two Dimensions Internet Q1 April ‘95 Electronics ‘96 Q2 Mail Order Audio May Total Revenue Receivers ‘97 Speakers Q3 ‘98 June Speakers Retail Repeated for each quarter CD/DVD ‘99 Repeated for each medium Visual Q4 Entertainment Repeated for each year
Multi-dimensional Database (MDDBMS)Working with Three Dimensions Internet Q1 ‘95 USA Electronics ‘96 N. America Q2 Mail Order Audio Total Revenue Receivers ‘97 Europe Speakers Q3 ‘98 Speakers Retail Aisa CD/DVD ‘99 Visual Q4 Entertainment
Time dimension Retail sales dimension Dimensions Oracle Express
Distribution channels dimension Retail sales dimension
Data Warehousing Configuration: Star Schema Which sales mode is becoming more effective for certain products in particular regions? What products sold well in different regions of the country through e-commerce (list by quarters)? (Sales medium) Dimension Table Dimension Table Dimension Table Fact Table (Product line) (Geographic divisions) Dimension Table Which sales staff produced the highest level of sales for a particular product line in California? What this the growth rate for the past 5 years in retail sales of a particular product line by region? (Sales staff) Time is an implied dimension
OID Method Method Data Method Method OODBMS Foreign key Attribute-3 Attribute-2 Attribute-5 Message Key VS. Entities
Object-Oriented Concepts Methods (function) determine the behavior of the object • Object Identifier • System generated • Unique • Invariant • Independent of attribute values • Invisible to the user OID Method Method Data Message Method Method • External call to the object • Activates a method • Attributes or instance variables • Simple • Complex • Reference
Relational vs. Object-Relational Relational Table Relational View Object Table Built-in Data Types Tables Views Built-in Data Types Abstract Data Types Object Views Object Tables David A. Anstey, 1997
Data Types • Built-in • Character (char, varchar2) • Number (integer, decimal, number) • Date • Raw and long raw • RowID • LOB (CLOB, BLOB)
ADTs (Abstract Data Types) • User-defined data types • Composed of simple or built-in data types • Types: object types and collection (aggregate) types Object type Table ADT Built-in Built-in Built-in
New Data Type: VARRAY • Single dimension arrays of fixed lengths SQL> create or replace type contact_addresses as varray(4) of varchar2(30); 2 / Type created. SQL> create or replace type contact_zip_codes as varray(4) of char(8); 2 / Type created.
Object Types • Three components: • Name - unique identifier of the object • Attributes - describes the object through built-in and abstract data types • Method - dictates the behavior of the object SQL> create type students as object 2 (student_ID char(9), 3 student_information personal_information); 4 / Type created.