Databases (364-1-1901)
Session 1: Introduction
Course Staff
Lecturer: Dr. Adir Even (אדיר אבן)
• Email: adireven@bgu.ac.il
• Room 255
Teaching assistants:
• Yuval Zak (יובל זק), email: zaky@post.bgu.ac.il
• Yotam Rotholz (יותם רוטהולץ), email: yotrot123@gmail.com
• Rachel Abu (רחל עבו), email: rashela@post.bgu.ac.il
Exercise grader:
• Gal Hever (גל חבר), email: galhev@post.bgu.ac.il
Office hours with the course staff can be arranged by email.
Extended office hours will be announced ahead of the project submission deadlines and the exam.
Databases, Session 1, Introduction
Grade Structure
Read the syllabus and the project description carefully; both have been uploaded to the course site on Moodle.
• Final exam: 70%
  • Dates for the first and second sittings (Moed A and Moed B) will be announced ahead of the exam period
  • Passing the exam is mandatory, with a grade of at least 56
• Applied project: 30%
  • Part A: mandatory submission with a passing grade, Part B: 15%, Part C: 15%
  • Teams of 3 students; submission guidelines and deadlines will be published on the course site
  • A passing grade in Parts A and B of the project is required as a condition for submitting the subsequent part
  • The course staff reserves the right to give different grades to team members, according to their level of involvement and knowledge
  • A grade penalty applies to late submission without prior approval, as detailed in the syllabus
• Homework: adequate submission and a passing grade are mandatory
  • All 4 homework assignments given during the semester must be submitted
  • Submission is in groups matching the project teams; guidelines and submission dates will be published on Moodle
  • Adequate submission of the homework is a prerequisite for grading the project
  • A grade penalty applies to late submission without prior approval, as detailed in the syllabus
It is strictly forbidden to share assignments or parts of assignments between teams, to copy passages from work submitted in previous semesters, or to use the services of outside parties to carry out the assignments!
• Violating these rules will result in a grade of 0 on the assignment and a complaint filed with the disciplinary committee
Tentative Lecture Schedule (changes are possible, follow the announcements!)
Submission deadlines for the various assignments will be published later on the course site.
• Exam dates (changes are possible; check for updates before the exam)
  • First sitting (Moed A): 14/10/15
  • The review session (#13) will take place in the days before the exam. The date will be announced later
  • Second sitting (Moed B): 29/10/15
Learning Software Tools
The course requires the following software tools, and training in their use will be provided:
• Power Designer – database design
• SQL-Server – database implementation and management
• QlikView – implementing reports and tools for presenting and analyzing data
The training is at a basic level. Students are expected to acquire advanced knowledge and skills in these tools through self-study, with the help of suitable sources.
The course assignments will require several additional software tools:
• Word, Power-Point, Excel, Visio – writing assignments, drawing diagrams
• Optional tools, for enriching Part C of the project: VBA, Tableau
The knowledge needed to use the additional tools must be acquired through self-study.
Introduction
• What is data?
• Why do we need to manage it?
• Some fundamental terms and concepts
Let’s start with a simple example…
Data – Abstracted Representation of Real-World Entities
Attribute: Value
• Picture: [image]
• Name: Bashar al-Assad
• Role: President of Syria
• Birth Year: 1965
• Height: 1.89 m
• Languages: Arabic, English, French
• Spouse: Asma Assad
• Children: 3
• Education: Medicine, Eye Doctor
• Years in Office: 2000 - Current
We created here a data record (רשומת נתונים)
Abstraction (הפשטה): we described a real-life entity as a set of attributes (שדות, מאפיינים) and their associated values (ערכים)
A Data-Item (Datum, ערך, יחידת נתונים) – The Basic Building-Block
• First Name: Joe
• Last Name: Smith
• Born in: Boston
• Year: 1967
• Children: 3
• Etc.
Data-Item {a, e, v}: A value ‘v’ selected from the value domain attached to attribute ‘a’ of entity ‘e’ that represents a real-world object.
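The {a, e, v} definition can be sketched in Python as follows. This is only an illustration: the `VALUE_DOMAINS` table, its range limits, and the `DataItem` class are hypothetical names introduced here, not part of the course material.

```python
from dataclasses import dataclass

# Hypothetical value domains (an assumption for this sketch): each attribute
# name maps to a predicate that decides whether a value belongs to the domain.
VALUE_DOMAINS = {
    "First Name": lambda v: isinstance(v, str) and v != "",
    "Born in": lambda v: isinstance(v, str) and v != "",
    "Year": lambda v: isinstance(v, int) and 1900 <= v <= 2100,
    "Children": lambda v: isinstance(v, int) and v >= 0,
}


@dataclass(frozen=True)
class DataItem:
    """One data-item {a, e, v}: a value for one attribute of one entity."""
    attribute: str  # a
    entity: str     # e
    value: object   # v

    def __post_init__(self):
        # Reject values that fall outside the attribute's value domain.
        in_domain = VALUE_DOMAINS.get(self.attribute)
        if in_domain is None:
            raise KeyError(f"no value domain defined for {self.attribute!r}")
        if not in_domain(self.value):
            raise ValueError(
                f"{self.value!r} is outside the domain of {self.attribute!r}"
            )


# The 'Year: 1967' item from the slide, attached to the entity "Joe Smith".
item = DataItem(attribute="Year", entity="Joe Smith", value=1967)
```

Creating `DataItem("Children", "Joe Smith", -1)` raises `ValueError`, because a negative count is outside the assumed domain.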
Entities (ישויות) are Represented by Multiple Attributes (מאפיינים, שדות)
Physical:
• A person: First Name, Last Name, Birth Year, Height,…
• A product: Name, Price, Weight, Size,…
Conceptual:
• A Job: Company, Department, Title, Salary,…
• An Academic Title: University, Faculty, Degree, Year,…
Different Stakeholders May Seek Different Data Attributes
What is a “Student”?
• Stakeholders: University, Employer, Lecturer
• Attributes shown in the figure: Tuition Paid, Salary Expectations, Address, Phone, Parents’ Income, Military Service, Degree, Dormitories Use, ID, Leadership, Verbal Skills, Name, Faculty, Courses, Writing Skills, Grades, Software Skills, Homework Quality, Class Participation
A Data-Record (Tuple, רשומה)
A Data-Record: A collection of data-items that represent a set of attributes of the same entity instance.
A record may contain many attributes, with different value-domain types
• First Name: Joe
• Last Name: Smith
• Birth Year: 1967
• Marital Status: Married
• Children: 2
• Etc.
A Dataset (קובץ נתונים, טבלה)
A Dataset: A collection of data records, each representing an instance of the same entity
Tabular datasets: datasets in which all records have the same attribute structure
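As a rough sketch of these definitions, a record can be modeled as a Python dict and a dataset as a list of such dicts. The second record and the `is_tabular` helper are hypothetical, added here only to illustrate the "same attribute structure" condition.

```python
# Each record is a dict of attribute -> value describing one entity instance.
records = [
    {"First Name": "Joe", "Last Name": "Smith", "Birth Year": 1967,
     "Marital Status": "Married", "Children": 2},
    # Hypothetical second record, invented for this illustration.
    {"First Name": "Ann", "Last Name": "Lee", "Birth Year": 1972,
     "Marital Status": "Single", "Children": 0},
]


def is_tabular(dataset):
    """Return True when every record has exactly the same set of attributes."""
    if not dataset:
        return True
    expected = set(dataset[0])
    return all(set(record) == expected for record in dataset)
```

With the two records above `is_tabular(records)` is true; appending a record that carries only a `"First Name"` makes the dataset non-tabular.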
Key Attributes, Identifiers (שדות מפתח, מזהים)
Each record in a dataset must be unique: no repetitions are permitted
In a tabular dataset, uniqueness is guaranteed by defining a primary key
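A minimal sketch of how a primary key enforces uniqueness, using Python's built-in `sqlite3` module as a stand-in (the course itself uses SQL-Server). The `students` table, its columns, and the sample rows are assumptions made for this example.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students ("
    " student_id INTEGER PRIMARY KEY,"
    " first_name TEXT NOT NULL,"
    " last_name TEXT NOT NULL)"
)
conn.execute("INSERT INTO students VALUES (1, 'Joe', 'Smith')")


def try_insert(row):
    """Insert a row; return False if the database rejects it."""
    try:
        conn.execute("INSERT INTO students VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# Same name but a new key value: accepted, the records are still distinct.
accepted_new = try_insert((2, "Joe", "Smith"))
# Key value 1 already exists: rejected by the primary-key constraint.
accepted_dup = try_insert((1, "Dana", "Levi"))
```

Two students may share a name, but not a key value: the second call fails with an integrity error, so the table keeps exactly two rows.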
A Database (בסיס נתונים)
A Database: A collection of datasets with meaningful relationships between them
A database describes a “mini world”: a specific part of the real world that we want to focus on
[Figure: a database made up of the datasets Students, Faculties, Courses, Lecturers]
Relationships (קשרים)
Relationships: objects in a database that define how tables are interlinked
In the tabular model, a relationship is represented by a foreign-key attribute that points to a primary key in another table
[Figure: a Foreign Key in STUDENTS pointing to the Primary Key of FACULTIES]
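The STUDENTS-to-FACULTIES relationship could look roughly like this, again using Python's `sqlite3` as a stand-in for the course's DBMS. The column names and sample rows are assumptions introduced for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE faculties ("
    " faculty_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE students ("
    " student_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    # Foreign key: each student points to one existing faculty.
    " faculty_id INTEGER NOT NULL REFERENCES faculties(faculty_id))"
)

conn.execute("INSERT INTO faculties VALUES (10, 'Engineering')")
conn.execute("INSERT INTO students VALUES (1, 'Joe Smith', 10)")

# Following the relationship: join each student to its faculty.
rows = conn.execute(
    "SELECT s.name, f.name FROM students s"
    " JOIN faculties f ON s.faculty_id = f.faculty_id"
).fetchall()

# A student that points to a non-existent faculty is rejected.
try:
    conn.execute("INSERT INTO students VALUES (2, 'Ann Lee', 99)")
    dangling_allowed = True
except sqlite3.IntegrityError:
    dangling_allowed = False
```

The join returns the student together with the faculty name, and the insert that references faculty 99 fails because no such primary-key value exists.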
An Organizational Database Collection (מערך נתונים ארגוני)
An organization will typically have multiple databases, each addressing a specific “mini world”, e.g., a department or a business domain
Some data exchange may (or may not) exist between databases
[Figure: example databases: Engineering Faculty, Accounting, Public Relations, Students Registration]
The Data Hierarchy
Why Do We Care about Data?
• Data is a critical foundation in Information Systems (IS)
• Data is the raw material from which information and knowledge are built
Data is a critical resource for the modern organization
• Running business operations
• Re-engineering processes
• Analyzing performance
• Supporting managerial decisions
Data and Computers
Does data reside only on computers?
• Not necessarily: a shopping list, a phone book, etc.
However, only computerized IT permits:
• Fast data collection, transfer, and processing
• Cheap storage of large data volumes
• Efficient data usage, sharing, and exchange
01100010101000111001010100010101001010101…
It is our role to organize and structure the data
• Computers can only “understand” a sequence of ‘0’ and ‘1’
• Only we can give the 0-1 sequence some meaning
  • First Name: Joe
  • Last Name: Smith
  • Birth Year: 1967
  • Marital Status: Married
  • Children: 2
  • Etc.
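To illustrate that the same 0-1 sequence only gains meaning through the interpretation we choose, here is a small Python sketch. The bit string below is an example constructed for this purpose (it is not the sequence shown on the slide), and both helper functions are hypothetical.

```python
# 24 bits chosen for this example: the ASCII codes of 'J', 'o', 'e'.
bits = "010010100110111101100101"


def as_text(bit_string):
    """Interpret the bits as 8-bit ASCII characters."""
    data = int(bit_string, 2).to_bytes(len(bit_string) // 8, "big")
    return data.decode("ascii")


def as_number(bit_string):
    """Interpret the very same bits as one unsigned integer."""
    return int(bit_string, 2)


first_name = as_text(bits)   # the text interpretation
as_integer = as_number(bits)  # the numeric interpretation
```

Read as text the bits spell a first name; read as an integer they are simply a large number. The stored bits are identical in both cases, and only the chosen structure differs.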
Database Management Systems (DBMS)
DBMS: a software package (programs, modules) for managing data, which permits
• Definition and construction of new databases
• Populating them with data
• Using the data by multiple users simultaneously
• Administration, maintenance and enhancement of databases
Timeline of data-management technologies:
• 1950’s: File Processing
• 1960’s: Early DBMS
• 1970’s – 1980’s: Relational DBMS (RDBMS)
• 1990’s – 2000’s: Object-DBMS, OLAP Cubes, XML
• 2010’s: “BIG DATA”, No-SQL
DBMS Generations
Early DBMS (Legacy Systems):
• VSAM files, Hierarchical, Network
// Outdated technologies, we will not discuss them in this course
Relational DBMS (RDBMS):
• Storing data in a set of inter-related tables
• The most common DBMS technology today: Oracle, SQL-Server, My-SQL
// The main focus of this course
Newer Database Management Technologies and Concepts:
• Object DBMS
• XML
• OLAP Cubes
• “Big Data”, No-SQL
• … and others
// We will review some of these at a high level towards the end of the course
The Purpose of Designing a Database
• The problem: data resources tend to be
  • Complex
  • Fast-growing
  • Disorganized and unusable
• Our target is to ensure that the data resources become
  • Usable
  • Well organized
  • Manageable
The Risk - Data Explosion
Organizations today collect huge amounts of data
• Emerging technologies: E-Commerce, Wireless Communication, Mobile Computing, Smartphones
• Prediction: worldwide data volumes are expected to exceed the 30 Zb mark by 2020
  • x44 in 10 years!
  • (Tucci J., the CEO of EMC, 2009)
Units of data volume:
• Byte: 8 Binary Bits
• Kilobyte (Kb): 10^3 Bytes
• Megabyte (Mb): 10^6 Bytes
• Gigabyte (Gb): 10^9 Bytes
• Terabyte (Tb): 10^12 Bytes
• Petabyte (Pb): 10^15 Bytes
• Exabyte (Eb): 10^18 Bytes
• Zettabyte (Zb): 10^21 Bytes
The Risk: with no appropriate management, data resources might quickly “explode” and become unusable
• Example: Click-Stream Data in a commercial website (e.g., www.amazon.com)
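The unit table and the growth figure can be checked with a few lines of Python. The `UNIT_EXPONENTS` table and `to_bytes` helper are hypothetical names; the annual growth factor is derived here from the quoted "x44 in 10 years" under the assumption of steady compound growth, and is not a figure stated in the source.

```python
# Powers of ten for each unit, as listed on the slide.
UNIT_EXPONENTS = {
    "Kb": 3, "Mb": 6, "Gb": 9, "Tb": 12, "Pb": 15, "Eb": 18, "Zb": 21,
}


def to_bytes(amount, unit):
    """Convert an amount in the given unit to a number of bytes."""
    return amount * 10 ** UNIT_EXPONENTS[unit]


# The predicted 30 Zb, expressed in bytes and in terabytes.
predicted_bytes = to_bytes(30, "Zb")
predicted_terabytes = predicted_bytes // to_bytes(1, "Tb")

# Assumption: growth is compounded evenly over the 10 years, so the
# yearly factor is the 10th root of 44.
annual_growth_factor = 44 ** (1 / 10)
```

Under that assumption, a 44-fold increase over a decade corresponds to data volumes growing by a bit under one half every year.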
Structured vs. Unstructured Data
• Structured data: a collection of data that follows a certain predefined structure
• Semi-structured data: follows some structure, but not necessarily strictly
• Unstructured data: no particular structure of records and/or attributes
Examples shown on the slide: Charts, Tables, Free-Text Documents, Software Objects, Pictures, Serialized Data Streams, …
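One way to get a feel for the three categories is to try reading the same fact, a birth year, from each kind of data. The sketch below is illustrative only: the schema, the JSON document, and the free-text sentence are all invented for this example.

```python
import json
import re

# Structured: values sit at fixed positions defined by a schema.
SCHEMA = ("first_name", "last_name", "birth_year")
structured_row = ("Joe", "Smith", 1967)
birth_year_structured = structured_row[SCHEMA.index("birth_year")]

# Semi-structured: fields are labeled, but any of them may be missing
# or nested, so the reader has to cope with absent keys.
semi_structured = json.loads(
    '{"first_name": "Joe", "last_name": "Smith", "born": {"city": "Boston"}}'
)
birth_year_semi = semi_structured.get("birth_year")  # absent in this document

# Unstructured: no attributes at all; the value must be dug out of the
# text, here with a naive pattern for a four-digit year.
free_text = "Joe Smith was born in Boston in 1967."
match = re.search(r"\b(19|20)\d{2}\b", free_text)
birth_year_text = int(match.group(0)) if match else None
```

The structured lookup is direct, the semi-structured one has to handle a missing field, and the unstructured one depends on a fragile text pattern, which is part of why structure makes data easier to manage.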
The Database Lifecycle
We will dedicate the majority of our time to the design phase:
• Conceptual design: a model driven by business requirements
• Logical design: a model that can be interpreted by a computer
• Physical design: a model built with a specific DBMS technology
However, we will dedicate some discussion to the other lifecycle phases as well…
Design is not a Science - It is a Skill!
The design of a database (and of IS in general) is complex:
• Many “moving parts”: hardware, networks, user interfaces, back-end processes, administration, security, …
• Many stakeholders with different goals
• Rapidly shifting requirements
A design problem may have a few right solutions
• …and many wrong or sub-optimal solutions
We cannot always find “the ultimate solution”. However, we should recognize which solutions are superior to others, and which are wrong
In this course (and in other IS-design courses), we do not learn formulas and proofs. We learn useful methodologies, tools, and techniques that can aid better design
You become a good designer mainly through experience and practice!
The Scope of Database Design – A “Mini-World”
When designing a database, our scope is a certain “mini world”: the organization, department, and/or business processes that our database aims to support
Information Systems and Business Processes
An organization (firm) can be seen as a “system” that lies within a certain business environment
[Figure: the Organization within its Environment]
External Business Processes (interaction with the environment):
• Sales
• Purchasing materials
• Seeking information on competitors
• Etc.
Internal Business Processes (within the organization):
• Paying salaries
• Production
• Managing inventories
• Etc.
Information Systems (IS) are designed to support business processes
• Accordingly, the design of databases should be driven by business-process needs
EIAR, Session 2
Supporting Business Processes with Websites
Since the 1990’s, the Internet has become a major IS platform for supporting business processes
• Links to a huge variety of information resources
• Interaction with remote customers
• Standardized interface
• Relatively simple and fast software development
Recently, we have been witnessing a major transition to smartphone applications
The Internet supports different business-process types:
• Distribution of information
  • Wikipedia, Ynet, YouTube
• Selling products and services (E-Commerce)
  • Amazon, Zap, Travelocity
• Social networks, recommendations
  • Facebook, LinkedIn, Twitter
Business Processes Example
Our demo example: Fandango (www.fandango.com)
• What is the company’s core business?
• What is the main service that the firm provides? How does it make money?
• What business process types does the website support?
  • Information?
  • E-Commerce?
  • Social Network?
Ticket Purchase at Fandango – A Key Business Process
Flowchart: Begin → Define Location/Date → Check for Available Tickets and Shows → Select a Show and the Number of Tickets → Get Confirmation → Another show? (Yes: repeat; No: End)
• Self-learning of flowcharts:
  • http://en.wikipedia.org/wiki/Flowchart
• Software that supports flowchart shapes: Word, Power-Point, Excel, Visio
Information needs drive data requirements
Flowchart: Begin → Define Location/Date → Check for Available Tickets and Shows → Select a Show and the Number of Tickets → Get Confirmation → Another show? (Yes: repeat; No: End)
Data required to support the process:
• Customers: Name, Address, Phone number, Preferred payment method, …
• Movies: Name, Genre, Actors, Length, …
• Locations: State, City, Zip Code, …
• Theaters: Name, Address, Number of Screens, …
• Etc.
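The entity lists above can be turned into a first, rough table layout. The sketch below uses Python's `sqlite3` purely for illustration. The surrogate `*_id` keys, the column types, the choice of zip code as the key of `locations`, and the link from theaters to locations are all assumptions; none of them reflect Fandango's actual database design.

```python
import sqlite3

# Hypothetical schema derived from the attribute lists on the slide.
SCHEMA = """
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,      -- assumed surrogate key
    name TEXT NOT NULL,
    address TEXT,
    phone_number TEXT,
    preferred_payment_method TEXT
);
CREATE TABLE movies (
    movie_id INTEGER PRIMARY KEY,         -- assumed surrogate key
    name TEXT NOT NULL,
    genre TEXT,
    actors TEXT,
    length_minutes INTEGER
);
CREATE TABLE locations (
    zip_code TEXT PRIMARY KEY,            -- assumption: zip code identifies a location
    state TEXT NOT NULL,
    city TEXT NOT NULL
);
CREATE TABLE theaters (
    theater_id INTEGER PRIMARY KEY,       -- assumed surrogate key
    name TEXT NOT NULL,
    address TEXT,
    number_of_screens INTEGER,
    zip_code TEXT REFERENCES locations(zip_code)  -- assumed relationship
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# List the tables that were created, in alphabetical order.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
```

A fuller design would also need tables for shows and ticket orders in order to support the "check availability" and "get confirmation" steps of the process; they are left out because the slide does not list their attributes.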
The Course Project
The project’s goal: implementing a prototype of a database that supports key business processes in a commercial website
• A detailed project description is posted on Moodle
Project Selection: by the end of this week, each team has to select a project from the list
• Team: 3 members
• The number of teams per website is limited to 3, on a “first come, first served” basis
• The selection will start on Monday (3.8) at 12:00, via Moodle
Once your selection has been approved, there is no changing it…
• Make sure to understand the website well before making the selection
Part 1: has to be submitted soon!
• A template with guidelines for Part 1 has been posted on Moodle