Geog 357: Data models and DBMS. 1. Geographic Decision Making.
Ways of storing digital data
• File structures
  • simple
  • ordered sequential
  • indexed
• Data models
• Databases
  • hierarchical
  • network
  • relational
File structures
• Basic terms
  • record: the data items related to a single logical entity (e.g. a student record); a row in a table
  • field: a place for a data item in a record (e.g. the first-name field in a student record); a column in a table
  • file: a sequence of records of the same type; the table
File structures
A file: “STUDENT”

ID  Last   First  Grade
3   Smith  Jane   A
1   Wood   Bob    C
2   Kent   Chuck  B
4   Boone  Dan    B

Each column is a field; each row is a record.
File structures
• Simple list
  • list of entries in which the order of entry into the list determines the order of the list

ID  Last   First  Grade
3   Smith  Jane   A
1   Wood   Bob    C
2   Kent   Chuck  B
4   Boone  Dan    B
File structures
• Searching a simple list means going through each record until the search is satisfied (linear search), which is inefficient

ID  Last   First  Grade
3   Smith  Jane   A
1   Wood   Bob    C
2   Kent   Chuck  B
4   Boone  Dan    B
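A minimal sketch of the linear search described above, using the STUDENT records from the slide. The function name `linear_search` and the dictionary layout are illustrative assumptions, not part of the original slides. The comparison counter shows why the method is slow: in the worst case every record is read.

```python
# The STUDENT file in entry order (a simple list).
STUDENTS = [
    {"id": 3, "last": "Smith", "first": "Jane", "grade": "A"},
    {"id": 1, "last": "Wood", "first": "Bob", "grade": "C"},
    {"id": 2, "last": "Kent", "first": "Chuck", "grade": "B"},
    {"id": 4, "last": "Boone", "first": "Dan", "grade": "B"},
]


def linear_search(records, field, value):
    """Return (record, comparisons); record is None if nothing matches."""
    comparisons = 0
    for record in records:
        comparisons += 1
        if record[field] == value:
            return record, comparisons
    return None, comparisons


record, comparisons = linear_search(STUDENTS, "last", "Boone")
print(record["first"], comparisons)  # Dan 4
```

Boone is the last entry, so finding him takes as many comparisons as there are records.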
File structures
• Ordered sequential files
  • list of entries ordered in some way (e.g. numerically or alphabetically)

ID  Last   First  Grade
1   Wood   Bob    C
2   Kent   Chuck  B
3   Smith  Jane   A
4   Boone  Dan    B
File structures
• Search of an ordered sequential list can use a binary search method, but only for the ordered field

ID  Last   First  Grade
1   Wood   Bob    C
2   Kent   Chuck  B
3   Smith  Jane   A
4   Boone  Dan    B
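The binary search mentioned above can be sketched as follows, assuming the file is held in memory as a list sorted by ID. The names `SORTED_BY_ID` and `binary_search` are illustrative. The last call shows the restriction from the slide: searching on a field the file is not ordered by can miss a record that exists.

```python
# The STUDENT file as an ordered sequential file, sorted on the ID field.
SORTED_BY_ID = [
    {"id": 1, "last": "Wood", "first": "Bob", "grade": "C"},
    {"id": 2, "last": "Kent", "first": "Chuck", "grade": "B"},
    {"id": 3, "last": "Smith", "first": "Jane", "grade": "A"},
    {"id": 4, "last": "Boone", "first": "Dan", "grade": "B"},
]


def binary_search(records, key_field, value):
    """Halve the search range each step; valid only if records are sorted on key_field."""
    low, high = 0, len(records) - 1
    while low <= high:
        mid = (low + high) // 2
        key = records[mid][key_field]
        if key == value:
            return records[mid]
        if key < value:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return None


print(binary_search(SORTED_BY_ID, "id", 3)["last"])    # Smith
print(binary_search(SORTED_BY_ID, "last", "Boone"))    # None: file is not ordered by last name
```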
File structures
• Indexes provide a reference to records based on an index field, which is ordered

Index (ordered on Last):

Last   Pointer
Boone  → record 4
Kent   → record 2
Smith  → record 3
Wood   → record 1

File:

ID  Last   First  Grade
1   Wood   Bob    C
2   Kent   Chuck  B
3   Smith  Jane   A
4   Boone  Dan    B
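One way to picture an index is as a separate, sorted list of (key, pointer) pairs kept beside the file. The sketch below makes that assumption and uses list positions as pointers; `build_index` and `lookup` are hypothetical names. Because the index is ordered, it can be binary-searched (here with the standard `bisect` module) even though the file itself is ordered on a different field.

```python
from bisect import bisect_left

# The file stays ordered by ID; the index provides ordered access by last name.
FILE = [
    {"id": 1, "last": "Wood", "first": "Bob", "grade": "C"},
    {"id": 2, "last": "Kent", "first": "Chuck", "grade": "B"},
    {"id": 3, "last": "Smith", "first": "Jane", "grade": "A"},
    {"id": 4, "last": "Boone", "first": "Dan", "grade": "B"},
]


def build_index(records, field):
    """Return a sorted list of (key, pointer) pairs; the pointer is the record's position."""
    return sorted((record[field], position) for position, record in enumerate(records))


def lookup(index, records, value):
    """Binary-search the index, then follow the pointer into the file."""
    i = bisect_left(index, (value,))  # first entry whose key is >= value
    if i < len(index) and index[i][0] == value:
        return records[index[i][1]]
    return None


LAST_NAME_INDEX = build_index(FILE, "last")
print(LAST_NAME_INDEX)  # [('Boone', 3), ('Kent', 1), ('Smith', 2), ('Wood', 0)]
print(lookup(LAST_NAME_INDEX, FILE, "Smith")["first"])  # Jane
```

If several records share a key, this sketch returns only the first match.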
Data models
• A data model is a particular way of conceptually organizing multiple data files in a database
  • hierarchical
  • network
  • relational
Hierarchical data model
• Parent-child relationship (one-to-one or one-to-many) among data

[Diagram: hierarchy with nodes Class, Student, Instructor, Department, Grade, ID]
Hierarchical data model
• Advantages
  • easy to search
  • can add new branches easily
• Disadvantages
  • must establish the types of search prior to development of the hierarchical structure
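The trade-off above can be illustrated with nested dictionaries. The layout (department, then class, then student and grade) is an assumption made for this sketch and is not taken from the slide's diagram; the class "Geog 101" and Smith's second grade are made-up sample data. Following the predefined parent-child path is a direct lookup, while a query the hierarchy was not designed for has to visit every branch.

```python
# Hypothetical hierarchy: department -> class -> student -> grade.
TREE = {
    "Geography": {
        "Geog 357": {"Smith": "A", "Wood": "C"},
        "Geog 101": {"Kent": "B", "Smith": "B"},  # made-up class for illustration
    },
}


def grade_in_class(tree, department, class_name, student):
    """Anticipated search: walk top-down along the parent-child path."""
    return tree[department][class_name][student]


def classes_for_student(tree, student):
    """Unanticipated search: no path starts at the student, so scan every branch."""
    return [
        class_name
        for classes in tree.values()
        for class_name, students in classes.items()
        if student in students
    ]


print(grade_in_class(TREE, "Geography", "Geog 357", "Smith"))  # A
print(classes_for_student(TREE, "Smith"))  # ['Geog 357', 'Geog 101']
```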
Network data model
• One-to-one, one-to-many, many-to-one, or many-to-many relationships possible

[Diagram: network with nodes Class, Student, Instructor, Department, Grade, ID]
Network data model
• Advantages
  • flexible, fast, efficient
• Disadvantages
  • complex
  • restructuring can be difficult because all the pointers must be changed
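A rough sketch of why restructuring a network model is awkward, assuming a many-to-many link between students and classes stored as pointers in both directions. The names `students`, `classes`, `enroll` and `drop_class` and the sample data are hypothetical. Either side can be reached directly, which is the flexibility; the cost is that removing one record means repairing every pointer that refers to it.

```python
# Pointers in both directions: student -> classes taken, class -> students enrolled.
students = {"Smith": set(), "Kent": set()}
classes = {"Geog 357": set(), "Geog 101": set()}  # "Geog 101" is made-up sample data


def enroll(student, class_name):
    """Create the many-to-many link by updating both pointer sets."""
    students[student].add(class_name)
    classes[class_name].add(student)


def drop_class(class_name):
    """Remove a class record and fix every student pointer that referred to it."""
    for student in classes.pop(class_name):
        students[student].discard(class_name)


enroll("Smith", "Geog 357")
enroll("Smith", "Geog 101")
enroll("Kent", "Geog 357")
print(sorted(classes["Geog 357"]))  # ['Kent', 'Smith']

drop_class("Geog 101")
print(sorted(students["Smith"]))  # ['Geog 357']
```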
Data models
• Hierarchical and network data models have generally been replaced by the relational data model
• Relational databases (and their derivatives) dominate the (non-GIS) database market: Oracle, Informix
Databases
• A database is a collection of data files that is structured (organized) to facilitate data storage, manipulation, and retrieval.
• A database management system (DBMS) is a software package that performs these database functions.
Why Databases?
• Shift from computation to information
  • Focus on the way to structure information
• Datasets increasing in diversity and volume
  • Digital libraries, interactive video, e-commerce
  • ... need for DBMS exploding
• DBMS encompasses most of information technology
  • OS, languages, theory, multimedia, logic, web
Database - Definition
• A very large, integrated collection of data.
• A shared collection of logically related data designed to meet the information needs of an organization
• Models real-world enterprise
  • Entities (e.g., students, courses)
  • Relationships (e.g., Madonna is taking CS564)
Database - Definition
• Three key elements of database definition:
  • Shared
  • Interrelated
  • Predefined applications
• Side notes:
  • Database is NOT the real world
  • Database is an abstraction
  • Database is not the same as information
  • Data become information only when they are used to provide answers to queries
Database Management System (DBMS)
• DBMS: A software system that enables users to define, create, and maintain the database and that provides controlled access to this database.
• Provides a layer between user application programs and the data
• Data Definition Language (DDL)
• Data Manipulation Language (DML)
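To make the DDL/DML distinction concrete, here is a small sketch using Python's standard `sqlite3` module as a stand-in DBMS (an assumption; the slides do not name a specific system for this). The table and column names mirror the STUDENT file from the earlier slides. The `CREATE TABLE` statement is DDL because it defines the structure; `INSERT`, `UPDATE` and `SELECT` are DML because they work on the data.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")

# DDL: define the structure of the data.
conn.execute(
    """
    CREATE TABLE student (
        id    INTEGER PRIMARY KEY,
        last  TEXT NOT NULL,
        first TEXT NOT NULL,
        grade TEXT
    )
    """
)

# DML: insert, update and query the data.
conn.executemany(
    "INSERT INTO student (id, last, first, grade) VALUES (?, ?, ?, ?)",
    [
        (3, "Smith", "Jane", "A"),
        (1, "Wood", "Bob", "C"),
        (2, "Kent", "Chuck", "B"),
        (4, "Boone", "Dan", "B"),
    ],
)
conn.execute("UPDATE student SET grade = ? WHERE id = ?", ("B", 1))
rows = conn.execute(
    "SELECT last, grade FROM student WHERE grade = ? ORDER BY last", ("B",)
).fetchall()
print(rows)  # [('Boone', 'B'), ('Kent', 'B'), ('Wood', 'B')]
```

The application never touches the storage format; it only issues statements to the DBMS, which is the "layer between user application programs and the data".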
Problems with File-based Systems
• The same data are stored in multiple places, which leads to inconsistencies
• A special program must be written for each user question
• Data can be corrupted if the system crashes while a change is being made
• It is hard for user programs to share data or to evolve
Advantages of Database Approach
• Control of data redundancy
  • Have a central repository of all data and their descriptions
  • Same information stored only once
• Data integrity
• Controlled access to database
• Data independence
• Concurrent access
• Crash recovery
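Data integrity and recovery can be sketched with a transaction, again assuming Python's standard `sqlite3` module as the DBMS. The `CHECK` constraint and the sample rows are illustrative. The second statement in the transaction violates the primary key, so the DBMS rolls back the whole unit of work and the earlier update does not survive. This is the same mechanism a DBMS relies on to avoid half-finished changes after a failure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        id    INTEGER PRIMARY KEY,
        last  TEXT NOT NULL,
        grade TEXT CHECK (grade IN ('A', 'B', 'C', 'D', 'F'))
    )
    """
)
conn.execute("INSERT INTO student VALUES (1, 'Wood', 'C')")
conn.commit()

try:
    # Used as a context manager, the connection commits on success
    # and rolls back if an exception is raised inside the block.
    with conn:
        conn.execute("UPDATE student SET grade = 'B' WHERE id = 1")
        conn.execute("INSERT INTO student VALUES (1, 'Kent', 'B')")  # duplicate ID
except sqlite3.IntegrityError:
    pass  # the DBMS rejected the change; nothing was half-applied

grade = conn.execute("SELECT grade FROM student WHERE id = 1").fetchone()[0]
print(grade)  # C
```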
Disadvantages of DBMS
• Complexity
• Cost of DBMS software, hardware and data conversion
• Performance
• Higher impact of a failure

When NOT to use DBMS?
• No data sharing
• Small scale
• Real-time constraints
Roles in the Database Environment
• Data Administrator (DA)
• Database Administrator (DBA)
• Database Designers (Logical and Physical)
• Application Programmers
• End Users (naive and sophisticated)
Summary
• Databases are collections of inter-related data.
• A DBMS is used to maintain and query large datasets.
• Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security.
• The advantages and disadvantages of DBMSs.
• The personnel involved in the DBMS environment.
• Database management is one of the broadest, most important areas in IST.