250 likes | 349 Views
Database Environments Pfleeger’s overview – Ch 6. CSc 250 – Part 1. Database Environments Pfleeger’s overview – Ch 6. Many differences between OS and database (DB) environments Modern DB systems have comprehensive development and processing, and
E N D
Database EnvironmentsPfleeger’s overview – Ch 6 CSc 250 – Part 1
Database Environments Pfleeger’s overview – Ch 6 Many differences between OS and database (DB) environments Modern DB systems have comprehensive development and processing, and the comprehensive set of services + environment is a “DBMS” Stores, manages and processes data/info based on a data model, Most common one now is relational model (assume that here) DB model specifies: a) representation of data b) operations c) integrities (aka constraints) where operations must preserve existing/defined integrities -- in OS, users can represent + process arbitrary data/info, if desired For our purposes, DBMS is software only – no special-purpose hardware -- think of DBMS as a huge application running on OS Each DBMS user accesses a “schema” – logical description of the data entities and data values to be manipulated - entity occurrences are physically stored in table rows (aka “records” in 3GL) - each row is identified by a unique key value, and a row contains a fixed number of column values (field values)
Database Environments Pfleeger’s overview – Ch 6 Like OS, administration done by DataBase Administration (DBA) For a large DB, this can be a complex job, done by >= 1 administrator -- Administration can be very involved at a large site, just as big a job as OS sys admin DBA responsibility can be divided and categorized, unlike absolute unix root MAC and DAC are supported in situations where they are appropriate: Ex: - must authenticate a client database connection - user U can grant access rights on table T, that is owned by U to other user(s) Several prominent relational DBMS vendors, some platform specific (Access, SQL Server, DB2) and others are platform independent (Oracle, CA, etc.), open: mySQL Will use Oracle as representative, since it is among most comprehensive, and scales up well for huge DBs
Database Environments Pfleeger’s overview – Ch 6 DBMS implements model operations with a query language, specially-suited for data retrieval and manipulation; international standard SQL dialects have evolved (92/99) Here, will only consider elementary SQL queries, with form: SELECT ItemList Column Names or expressions (to be read) FROM TableList Table Names WHERE condition condition is arbitrary combination of boolean- valued expressions that a result row must satisfy When a user creates a table (Table Name a legal DB (not OS) identifier), every row stored will occupy physical storage that is managed by DBMS, not OS (a table is analogous to an OS file containing records) Specially-designed, and application-specific sections of >=1 table(s) can be collected together into a view -- Modeling user’s viewpoints and data access needs best done using views
Database Environments Pfleeger’s overview – Ch 6 A view is just a definition of a combination of data in >=1 table(s) No physical storage associated with a view Normally, views are not updated, just SELECT applied to a given view Given the power and relative simplicity of DB query languages, there are some unique security problems in DB that do not have solutions: * Inference problem – in a DB with sensitive + public data: it might be possible to issue a combination of individually permitted queries whose final result reveals sensitive data * Multilevel DB with differentiated security – the security of a single column value might be different from the security of other column values of the same row or from the same column in other rows In general, security should be implemented for every row/column data value
Database Environments Pfleeger’s overview – Ch 6 Pfleeger (pp. 314-15) Many DBMS security issues similar to OS Physical, Logical and Value integrity Outages, failures should not corrupt DB structure or its data DB queries and accesses must not violate any enabled model constraints Ex: disallow INSERT of a row with a duplicate key value into a table Values entered into a DB must always (except during an actual create/update) obey range + format constraints; almost all values managed by the DB are typed values (Except for special objects like LOBs, images, etc. that might be referenceable from DB, but not managed by the DBMS) Access to items can be specified with very high granularity, unlike OS objects; e.g. simple, rigid unix file permissions apply to whole file Views into tables provide user access flexibility, because any designated DB user or collection of DB users can be provided with customized access to a view Also, table T can be private, but a view VT on T can be made SELECTed by user(s)
Database Environments Pfleeger’s overview – Ch 6 Can audit any significant accesses and operations on a DB, as with an OS and identity of user whose actions are audited is based on DB user authentication (more on DB user authentication in next slides set) Along with flexible, data access, via queries, data value integrity is enforced regardless of the kind of query Define an object type “x” by creating a DB table “X”: in Oracle by: create table XName … command Occurrences of t will be rows generated and stored in T via: SQL INSERT adds a row to T Bulk load DBMS tool used to translate OS file records into T rows (Even during bulk load, the DBMS should enforce constraints/integrities)
Database Environments Pfleeger’s overview – Ch 6 Protection features of the OS for DBMS DBMS ultimately relies on underlying OS for security of I/O Ex: Oracle DBWR process calls OS platform I/O services when making actual DB changes; any OS flaws for I/O, storage become DBMS vulnerabilities Recovery from any kind of failure has always been a functional requirement for DBMSs; unlike OSs, historically, that accepted possible loss of data The unit of OS processing is a process The unit of DB processing is a transaction In general, a transaction “t” has the form: begin_t q1; q2; Each qi is an arbitrary SQL query or appl. statement : qn; end_t Each t has atomic execution: either all qj applied OK or t fails and is recovered
Database Environments Pfleeger’s overview – Ch 6 Pfleeger’s description (pp. 321-323) of t implementation sketches one way (it is not the only way) to recover failed t - There are 2 “phases” associated with processing each t: INTENTION phase – gather and save/store info about what the t does UPDATE phase - t changes eventually applied, whether t fails or not Recovery is designed to be as automatic as possible, GOAL: to guarantee no loss of data updates except for those t that were “early” in their execution Recovery is a defense for the non-availability threat associated with t failure There is also much in DBMS that relates to efficient performance, because in large-scale DBs, there can be many concurrent users, so query response time is another aspect of availability Permanent changes phase – DBMS eventually writes the changes specified by the t to the physical DB, whether t failed or not
DBA functions relevant to security - summary • Setting up a database server • User accounts • DB “objects” • Backup and Recovery
Setting up a Database server • DB Server setup is the responsibility of the DBA team • Crucial to have a good platform (cheap does not pay!!! For performance, security reasons …) • Hardware selection (cluster vs. single host), HA configuration ? • Raw partitions & RAID 0+1 (stripping and mirroring) + Logical Volume manager (SAN?) • GbEthernet or higher, how many interfaces? • A properly tuned database is ONLY CPU bound
Oracle (Ora) - Setting up a Database server • You have to plan for the database and for the instance – can run n>=1 instances simultaneously on same server • Database • Sizing the tablespaces (TS) is VERY important Tablespace is logical space, and each tablespace has n>=1 physical OS files to store objects associated with this tablespace • General versus Dedicated • Data growing rate (TS can be created/dropped online) • Locally managed vs. Dictionary managed • Size and rate/duration of the transactions • redo logs and rollback segments (for the INTENSION phase) • Size of the sorting operations • size of the temporary tablespaces
SQL> CREATE DATABASE DEVDB CHARACTER SET "WE8ISO8859P9“ DATAFILE '/ORADB/dbs1/devdb/system01.dbf' size 250M LOGFILE group 1 ( '/ORADB/dbs1/devdb/log1.dbf' ) size 100M, group 2 ( '/ORADB/dbs2/devdb/log2.dbf' ) size 100M ; @$ORACLE_HOME/rdbms/admin/catalog @$ORACLE_HOME/rdbms/admin/catproc @$ORACLE_HOME/rdbms/admin/catrep Setting up a Database Server • Instance – can have n>=1 instances of DB simultaneously • Dedicated vs. Multi Threaded Server connections • Number of processes • Store procedures (java, pin pl/sql code in memory) • Sort area size • Many more… • Backup policy • Has to be agreed with the users • Several options (exp/imp, hot/cold backups) • But do not assume that all critical services backed up
User accounts • Any user can request an account in the central development and production DB handled by IT • But for physics use, there will be dedicated servers • Accounts have space quota on one or more tablespaces • In this account, the user will create new database applications (tables, indexes, types …) • Users can also see/play with data that is on remote databases from the local account • Performance issues must be taken into account due to network access SQL> create user cms_anode_board identified by yupyup default tablespace data01 temporary tablespace temp01 quota 100M on data01 quota unlimited on indx01; SQL> grant designer to cms_anode_board;
Security(1/2) DBMS architecture has catalog (aka data dictionary “DD”) that is a central repository of “data about the DB data” • The DD (in tablespace SYSTEM) can not be modified directly • For everything that you can do in the database there is an associated database privilege – definitely Least Privilege • A DBA can group system privileges into (named) roles and assign them to user accounts • A user can also create roles to assign access privileges (on his/her objects) to another users • By default, you can not see/modify other user’s data
Security(2/2) • Security can be enhanced in several ways • Data itself can be stored encrypted • Password can expire (like in the normal OS) • Secure the communication with the DB using DES, Triple-DES, RSA RC4... • User authentication with Kerberos, SSL, Cybersafe, Biometrix… • Although the default is everything in clear • We can switch to security mode immediately
From here on: selected topics … • Vast libraries of scripts for many purposes – - tuning - DB admin - security careful about Trojan Horse, etc. possibility • Callouts and externals – - access/reference non-Oracle managed items: sockets, BLOBs, etc • Distributed Oracle - at least the possibility of security issues is here to consider: i.e. n>1 cpu, server interconnect, local + shared mass storage (Not covered in this course: potential distributed DB security issues)
DB Tuning – client(1/2) • The ability to make your queries faster • 90% of the times all the tuning is done at the user code level • Most common tuning cases are due to • Missing index or wrong Query Plan chosen by the Optimizer Tuning can be critical for large DB sites because of the concurrency environment; Hopefully, DBA install of DB used proper configuration parameters
DB Tuning – client(2/2) • You can always see the Query Plan before you run the query • If the optimizer chooses the wrong options you can force a different choice via Hints • Detailed statistics about execution of your queries for more advanced tuning • Can easily identify hotspots • Remedies • Knowing your data & query patterns is essential • Review/test your application carefully before declaring it production Reducing query times from hours to minutes (or even seconds) is not unusual
DB Tuning - server • The database itself has many parameters that can be tweaked to improve the performance • The role of the DBA is to understand the impact of those parameters in the general performance of the db. • In the past, you could not modified them without stopping the database • Now many can be modified online • VERY IMPORTANT: A vast library of tuning scripts (and 3rd party tools) is available
Useful features (1/4) • Partitioning • Data is separated physically according to a key although logically, we see it as single unit • Increased performance, improved data management • Bitmap index • Very useful for low cardinality data • Even for high cardinality if we apply a function that reduces it • Function based index • Index is not created on the data but on the result of applying a function (native or external)
Useful features (2/4) • Server side processing (pl/sql, java) i.e.: DB applications • Closer to data, code reusability • Database can do many more things for you than just storing/retrieving data • Resumable statements • Transaction freezes while the error condition is fixed • User code does not need to be modified to profit • Limited set of errors are trapped in 9i (initial version) • Java VM inside the database kernel • Java compiler to speed up the performance • JSP,Servlets,EJB, Java stored procedures
Useful features (3/4) • External procedures • Extend Oracle function libraries with your own • Using C for the moment • External table api allows any external source (ex a socket) to look like a normal table • XML • XSQL, XSL, XML Parsers for C/C++/Java/PLSQL • XML native data type
Useful features (4/4) • Parallel Query • Query/analyze large amounts of data • Inter parallelism (SMP machines – tested) • Intra parallelism (for clusters - not tested) • Transportable tablespaces • Allows you to move data from one database to another very easily and fast (almost plug&play)