540 likes | 571 Views
CS6302 Database Management Systems. Unit 1-Introduction to DBMS Session 1. File – Collection of logical information/data It contains one or more data/records Example : A student record in a file Name, Roll No, Gender, Department, Year of Joining, Year of pass out, Mode of Admission, etc
E N D
CS6302 Database Management Systems Unit 1-Introduction to DBMSSession 1
File – Collection of logical information/data • It contains one or more data/records • Example : A student record in a file • Name, Roll No, Gender, Department, Year of Joining, Year of pass out, Mode of Admission, etc • Example : An employee record in a file • Employee ID, Name, Designation, Experience, Salary, etc. File Systems Organization
Master Data – does not changes with time • Transaction Data – changes from time to time • Example : Tendulkar, India (Master Data) • Example: Innings Played, Not Outs, Runs Average (Transaction Data) Classification of data
Computer files contain information in electronic format, that are easy to store and retrieve and manipulate • Manual files store the information in paper while computer files in terms of bits and bytes • The process of looking up for an information/data/ record in a file is called “Search” Computer VS manual files
Four types in which file systems are organised • Sequential • Pointer • Indexed • Direct File systems organization
Simplest method of organization of files • Files/records are stored one after another • Two ways of storing • Pile file method • Sorted file method Sequential organization
Records are stored one after other as inserted • New records are placed at the end of the file • For modification or deletion the record is searched in memory blocks Pile file method
Records are sorted each time when inserted in the system (ascending/descending) • Sorting may be based on primary key or any other column (eg. Roll no) • Record is inserted at the end and then sorted Sorted file method
Simple than other file organizations • Not much effort to store the data • Fast and efficient • Good in case of report generation or calculations • Stored in magnetic tapes that are very cheap • Sorting is the only drawback in this method • Time consuming, high data redundancy advantages
Random access or relative file organization • Records stored in Direct Access Storage Device (DASD) like hard disks • File is viewed as numbered sequence of blocks/records • Blocks/records are taken as key for accessing the desired information • Allows random blocks to be read/written • Often used in accessing larger database • Hash function generates the address in which the data is stored. Direct access file organization
Immediate access of desired records • No sorting effort • Faster update of several files • Useful in online transactions advantages
Data may be accidentally erased or updated • Backup is required • Expensive as data are stored in hard disks • Less efficient to sequential in use of storage space disadvantages
Synthesis of sequential and direct methods • Records are stored randomly on DASD by primary key • Using the primary key records are sorted • For each primary key, an index value is generated and mapped with the record • Data can be accessed sequentially/randomly using index Index file organization
Multiple keys • Access in random and sequence ways • Retrieval of record is faster, if the proper primary key is selected • More flexible since any column can be used as key field • Supports range and partial retrieval of data • Eg : Names starting with “Ar” advantages
Extra cost to maintain index is required • Extra space is required to store the index in the disk • Files have to be restructured to maintain the sequence • When records are deleted, memory space has to be released, if not, performance of the system slows down disadvantages
CS6302 Database Management Systems Unit 1-PURPOSE OF DATABASE SYSTEMSSession 2
Programmers wrote application programs to meet the new needs • So, the system uses more files and programs • File processing system supports traditional or conventional operations • Stores permanent records in various files, needs programs to extract records, add records to the files File processing system
Data redundancy and inconsistency • Difficulty in accessing data • Data isolation • Integrity problems • Atomicity problems • Concurrent-access anomalies • Security problems Disadvantages of file processing system/ purpose of a database system
Same information/record/data is duplicated in several locations • Ex: A student record like address and mobile number is maintained in institute office file, is seen also in the student record maintained in the department of the institute • Leads to higher storage space and access cost • Data inconsistency – a change in the record would reflect only in that corresponding one file, not in the other files. • Ex: Change of address updated in office file, but not in the department file Data redundancy and inconsistency
When a new need arises there are two options • Extract the information manually (or) • Programmer writes the necessary application • Ex: to get the list of names that starts with letter ‘A’ in an institute Difficulty in accessing data
Data is scattered in various files and in different formats • Ex. A statistical report can be of a text file/document file/portable document file • Writing the application programs to retrieve the information from different formats of such files is challenging and difficult task Data isolation
Data stored in the database must satisfy some conditions/constraints/checks • Ex: Bank requires the customers to maintain a minimum balance of Rs.5000 • Developers apply these constraints in the system by adding appropriate code in the application. • When new constraints are added, its difficult to change the programs to apply them Integrity problems
A transaction or transfer should happen entirely or not at all Atomic • Account holder X has Rs.5000 balance and a program is executed to transfer Rs. 1000 to another account holder Y. • The system fails during transaction and Rs.1000 was debited from X’s account but was not credited to Y’s account • Results in inconsistent state of a database. atomicity
Deals with the problem of multiple users accessing the system simultaneously to update the data. • Ex. If there are 40 students registered for a course. Two programs are executed for one admission simultaneously to read the value as 40, and both would write back 41. • Leading to an incorrect increase as 1 instead of 2 Concurrent-access anomalies
Enforcing security constraints is difficult, in order to access data from a database system. • Ex: In an educational institution, a finance officer has to see only the part of the database that contains financial information, and not any other information of other section/department. Security problems
CS6302 Database Management Systems Unit 1-database system terminologies and characteristicsSession 3
Data • facts that are recorded and have an implicit meaning Ex: Name, Address • Database • Collection of related data • Database Management System • Collection of programs enables the users to create and maintain a database • A general purpose software system facilitates the process of defining, constructing and manipulating databases for applications Database system terminologies
Defining the database – specifies the datatype, structure and constraints of the data to be stored • Constructing the database – storing the data on some storage medium controlled by DBMS • Manipulating a database – to retrieve specific data, updating the database to reflect the changes and generate reports Database system terminologies
In a database environment.. • Primary resource database • Secondary resource DBMS and related software • DBA – Database Administrator • responsible for authorizing access to the database • for coordinating and monitoring its use • Acquiring software and hardware resources when needed Database system terminologies
Database designers • for identifying the data to be stored in the DB • Choosing appropriate structures to represent and data • Interact with group of users to understand the requirements and develop a view of the DB • Views are later analysed and integrated with views of other user group Database system terminologies
End Users • Require access to the database for querying, updating and generate reports • Four categories of end users • Casual : occasionally access the database for pulling different information, different time. • Naive or parametric: constantly querying and updating the database using standard queries called canned transactions • Sophisticated: Engineers, scientists, business analysts use the facilities of DBMS to implement their applications to meet their complex requirements • Stand alone: maintain personal databases by using readymade package programs. Ex: Tally Database system terminologies
System Analysts • Determine the requirements of the end users and develop specification for canned transaction to meet the needs • Target audience are : Naive & Parametric end users • Application Programmers • Implement specifications are programs, test, debug, document and maintain the canned transactions • System analysts and application programmers are called as software engineers! Database system terminologies
Self describing nature of a database system • Insulation between Programs and Data, and Data Abstraction • Support of Multiple Views of the Data • Sharing of Data and Multiuser Transaction Processing Database characteristics
Self Describing nature of a database • Catalog: Information like structure of the file, type and storage format of each data and constraints on the data are stored • Metadata: Information stored in the catalog Database characteristics
Insulation between Programs and Data, and Data Abstraction • Program-Data Independence: The property where the structure of the data file is stored in DBMS catalog separately from the access programs • Ex: explain with a C file concept • Operation: called as function, has two parts • Interface: Operation name and its datatypes • Implementation: Specified separately, can be changed without affecting the interface Database characteristics
Program operation independence: Programs can operate on data by using operations, without minding how they are implemented • Data abstraction: Feature that allows both program data independence and program operation independence. • Conceptual Representation: Does not include how the data is stored or how the operations are implemented • Data model: type of data abstraction that is used to provide conceptual representation • Uses logical concepts and their interrelationships to understand the storage concepts but hides storage and implementation details! Database characteristics
Support of Multiple Views of the Data • View: Subset of the database or virtual data • Ex: A user interested in viewing the qualification details of the student before joining the course • Another user interested in viewing the performance of the student after joining the course • Both the set of data are stored in one single database Database characteristics
Sharing of Data and Multiuser Transaction Processing • Concurrency Control: to ensure multiple users trying to update same data such that result is correct one • Such facilities are used in online transaction processing (OLTP) applications Database characteristics
CS6302 Database Management Systems Unit 1 - Data Models – Types of Data modelsSession 4
Major purpose of a database system is to provide users an abstract view of the data • For a system to be usable, it must retrieve data efficiently • Developers hide the complexity from users through several levels of abstractions, in order to simplify the users’ interactions with the system View of data
View of data VIEW LEVEL VIEW 1 VIEW 2 .................. VIEW N LOGICAL LEVEL PHYSICAL LEVEL
PHYSICAL LEVEL • Internal level • Lowest level of abstraction • Physical representation of a data • Deals with how a data is stored • Describes complex low level data structure in detail • Blocks of consecutive storage location hidden from developers • DBA may be aware about certain details View of data
LOGICAL LEVEL • Conceptual level • What data are stored in a database and its relationship between them • DBA decides what information has to be stored in the database View of data
VIEW LEVEL • External level • Highest level of abstraction • Describes part of the entire database • Variety of information is present in database • Provides many views for the same database View of data
INSTANCES • Collection of information stored in a database at a particular time • SCHEMA • Overall design of a database, changed rarely • Partitioned according to levels of abstraction • Physical schema : describes database design at physical level • Logical schema : describes database design at logical level • Sub schemas : database description at view level View of data
PHYSICAL DATA INDEPENDENCE • Application programs do not depend on physical schema and need not be rewritten if the physical schema changes View of data
A collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints • A way to describe the design of a database at physical level, logical level and view level Data models
Relational Model • Entity Relationship Model • Object Based Data Model • Semistructured Data Model • Network Model • Hierarchical Model Data models categories