CS 179 Database Project
Instructor: Dr Eamonn Keogh
Computer Science & Engineering Department, 318 EBII
University of California - Riverside, Riverside, CA 92521
eamonn@cs.ucr.edu
Class web page: www.cs.ucr.edu/~eamonn/cs179
Administration I
• Class Meeting Times
• Class Activities:
  Discussion: M 03:10 p.m. - 04:00 p.m., SPR 2339
  Lab: F 02:10 p.m. - 05:00 p.m., ENGR2 129
• (first 15 minutes rule)
• We will not meet every week. You are obliged to view the class web page every Monday morning to check for announcements.
• You are 100% responsible for any announcements/changes I might post to the web page.
Administration II
Presentation of Final Project: You will need to give a short group presentation in the last two weeks (details later). You must show up to one of these final presentation sessions or take a failing grade (the only exception is if you work out an alternate plan with me by the end of week 4). Note that the sessions may go very late!! You must be prepared to stay for the entire session. Sign-ups for time slots will be made available on a first-come, first-served basis later in the quarter.
Administration III
• Groups:
  • Groups may be of size 2 or 3.
  • Only one person who did not get an A or B in CS 166 may be in a group. (I may make exceptions if the numbers require it.)
  • If you need to be a “group” of one, talk to me after class.
  • You should take your responsibility to your group seriously.
  • In most cases I expect that everyone in the group will get the same grade, but I reserve the right to give different grades where warranted.
Administration IV
• Grading:
  • Project binder: 90%
  • Presentation (including demonstration of project): 10%
• Your project binder (exhaustive details in class handouts) is a document in which you prove to me (or any reader) that you solved the problem given to you using a good design process.
• It must be in the format explicitly stated in the handouts.
• Your presentation is your chance to review and highlight the quality of your work.
Administration V
Office Hours: I am normally in my office 6-7 days a week. You may visit me any time. If you wish to be 100% certain I am there, you may make an appointment by email with at least 24 hours' notice. (Note that if you make an appointment and then fail to keep it or show up late, the grade for your entire group will suffer.) If you email me, you must include “CS179” in the subject heading and note your group name (e.g., CS179-smith-jones-zoe) in the body.
Administration VI
Important: If a member of your group commits an act of academic dishonesty, all members of the group will receive a failing grade! Don’t know the exact definition of academic dishonesty? It is your job to find out! (This is true in general, not just for this class.)
http://www.cs.ucr.edu/content/students/index.php?choice=academdis
http://cnas.ucr.edu/~cnas/student/dishonesty.pdf
There are certain rules which must be followed in this class. They are made clear on the handouts; follow them or get a written exception from me.
If you write:
  In order to handle spatial data efficiently, as required in computer aided design, we decided to use an R-tree. We implemented it...
then everyone in your group gets a failing grade, because the opening words are Guttman's and they appear without quotation marks or a citation.
Instead you should write:
  It was noted by Guttman [12] that “In order to handle spatial data efficiently, as required in computer aided…
Databases are applications which store data in a format which supports querying.
• Imagine we have a database of restaurants in California. The database should probably be able to support queries like…
  • Return a list of all vegetarian restaurants.
  • Return the phone number of Marios Pizza on 123 Spruce St.
  • Return the restaurants that have a 4-star or higher rating.
• However, there are many reasonable queries that most off-the-shelf database systems do not support…
  • Return a list of all restaurants within 5 miles of my house.
  • Return (in order of distance) the 3 pizza restaurants nearest to UCR.
[Figures: Nearest neighbor query; Range query]
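The two query types named above can be sketched in a few lines of Python. This is only an illustration of what the queries return, not the recommended implementation: it scans every record, whereas an R-tree would avoid visiting most of them. The record layout `(name, x, y)`, the function names, and the sample points are all assumptions made for the example, with coordinates taken as miles on a flat plane.

```python
import math
from typing import List, Tuple

# Hypothetical record layout: (name, x, y), with x and y in miles on a flat plane.
Place = Tuple[str, float, float]


def distance(place: Place, x: float, y: float) -> float:
    """Straight-line distance from a place to the query point (x, y)."""
    return math.hypot(place[1] - x, place[2] - y)


def range_query(places: List[Place], x: float, y: float, radius: float) -> List[Place]:
    """Return every place within `radius` of (x, y), in input order."""
    return [p for p in places if distance(p, x, y) <= radius]


def nearest_neighbors(places: List[Place], x: float, y: float, k: int) -> List[Place]:
    """Return the k places closest to (x, y), nearest first."""
    return sorted(places, key=lambda p: distance(p, x, y))[:k]


# Made-up sample data, for illustration only.
PLACES: List[Place] = [
    ("A", 1.0, 1.0),
    ("B", 4.0, 3.0),
    ("C", 6.0, 8.0),
    ("D", 0.0, 2.0),
]

if __name__ == "__main__":
    print(range_query(PLACES, 0.0, 0.0, 6.0))
    print(nearest_neighbors(PLACES, 0.0, 0.0, 3))
```

A spatial index would keep the same two function signatures and change only how the candidates are found, which is one reason to separate the query interface from the search structure in the design.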
Your project is to build a database that supports spatial queries, as well as classic database queries. Although you could do this from scratch, I highly recommend that you do this by building some code that sits on top of an off-the-shelf database (e.g., Microsoft Access, Oracle, FoxPro, PostgreSQL). I also highly recommend that you do this by implementing an R-tree.
In some sense the sentence above, “Your project is to build a database that…”, is misleading. I won’t be grading the quality of your database directly. Your project is really to demonstrate your ability to design medium- to large-scale software.
[Diagram: three components]
• User Interface
• Spatial Search Engine (probably R-Tree)
• Classic Database
[Diagram: User Interface example 1, over Spatial Search Engine (probably R-Tree) and Classic Database]
  Prompt: Enter an address and we will find the location of the nearest Californian university
  Input: 221 Baker Street, Riverside
  Options: Exclude Religious Schools; Exclude Cal States
  Result: The nearest university is CSUSB. Click here for admissions information
[Diagram: User Interface example 2, over Spatial Search Engine (probably R-Tree) and Classic Database]
  Prompt: Click on the map and we will find the location of the nearest Californian university
  Options: Exclude Religious Schools; Exclude Cal States
  Result: The nearest university is CSUSB. Click here for admissions information
[Diagram: User Interface example 3, over Spatial Search Engine (probably R-Tree) and Classic Database]
  Prompt: Choose a location and we will find the location of the nearest Californian university
  Choices: LAX; Golden Gate Bridge; Balboa Park, SD; Ontario Mills
  Options: Exclude Religious Schools; Exclude Cal States
  Result: The nearest university is CSUSB. Click here for admissions information
[Diagram: User Interface example 4, over Spatial Search Engine (probably R-Tree) and Classic Database]
  Prompt: The GPS unit tells me you are in UCR, Riverside, California. Do you want to know the location of the nearest university?
  Options: Exclude Religious Schools; Exclude Cal States
  Result: The nearest university is CSUSB. Click here for admissions information
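The layering in these mock-ups (classic attribute filters handled by the database, nearest-neighbor search handled by code sitting on top of it) might be sketched as follows. Everything here is an assumption for illustration: the table name `university`, its columns, the made-up rows, and the function names are hypothetical, SQLite stands in for whichever off-the-shelf database a group chooses, and a linear scan stands in for the R-tree.

```python
import math
import sqlite3
from typing import Optional


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical schema and made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE university ("
        "name TEXT, x REAL, y REAL, religious INTEGER, cal_state INTEGER)"
    )
    conn.executemany(
        "INSERT INTO university VALUES (?, ?, ?, ?, ?)",
        [
            ("University A", 1.0, 1.0, 0, 1),
            ("University B", 2.0, 2.0, 1, 0),
            ("University C", 5.0, 5.0, 0, 0),
        ],
    )
    return conn


def nearest_university(
    conn: sqlite3.Connection,
    x: float,
    y: float,
    exclude_religious: bool = False,
    exclude_cal_state: bool = False,
) -> Optional[str]:
    """Classic filtering happens in SQL; the spatial step happens in Python."""
    sql = "SELECT name, x, y FROM university WHERE 1 = 1"
    if exclude_religious:
        sql += " AND religious = 0"
    if exclude_cal_state:
        sql += " AND cal_state = 0"
    rows = conn.execute(sql).fetchall()
    if not rows:
        return None  # every candidate was filtered out
    # Linear scan over the survivors; an R-tree would replace this step.
    return min(rows, key=lambda r: math.hypot(r[1] - x, r[2] - y))[0]


if __name__ == "__main__":
    db = build_demo_db()
    print(nearest_university(db, 0.0, 0.0))
    print(nearest_university(db, 0.0, 0.0, exclude_cal_state=True))
```

Whether the filters run before or after the spatial search is a design decision worth documenting; this sketch filters first only because that is the simplest arrangement to show.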
To begin, you must come up with an application area which has a spatial element (e.g., restaurants in Orange County, California brown bear sightings, locations of car crashes in Riverside). You must write a two-page description of the problem, in the first person. The project description should begin by informally explaining the domain from the customer’s perspective (“As a restaurant critic… ”). Then explain the utility of the database for the customer (“The database will allow me to … …it will also help me…”).
After I approve the project description, I (and/or our TA) will assume the role of the customer (I may add some requirements). Thereafter, any time you have a question about what the customer wants, you must come to see me. If you make an assumption, and it is the wrong assumption, you will have to redo your work or take a major grade penalty.
How am I going to get the spatial locations of 500 places?
• The web.
• A GPS unit.
• Use a grid overlay.
  • If you use a grid overlay you must do it very carefully, and document the process.
• Note that treating the problem as existing on a Euclidean plane is actually incorrect. Since the locations are on a sphere, there will be an inherent error in the distances reported. This effect would not show up in an area the size of Riverside, but would show up for an area the size of California. However, you may ignore it in this project.
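To see why the flat-plane error depends on the size of the area, the sketch below compares a great-circle distance (the haversine formula on a sphere) with a flat-plane distance computed from a local projection. It is only an illustration: the mean Earth radius of 6371 km, the function names, and the sample coordinates are assumptions, and the coordinates are arbitrary points rather than surveyed locations.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; the Earth is not a perfect sphere


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def planar_km(lat1: float, lon1: float, lat2: float, lon2: float, ref_lat: float) -> float:
    """Flat-plane distance: project onto a plane scaled at latitude `ref_lat`."""
    x = EARTH_RADIUS_KM * math.radians(lon2 - lon1) * math.cos(math.radians(ref_lat))
    y = EARTH_RADIUS_KM * math.radians(lat2 - lat1)
    return math.hypot(x, y)


def relative_error(lat1: float, lon1: float, lat2: float, lon2: float, ref_lat: float) -> float:
    """Relative difference between the flat-plane and great-circle distances."""
    true_km = haversine_km(lat1, lon1, lat2, lon2)
    return abs(planar_km(lat1, lon1, lat2, lon2, ref_lat) - true_km) / true_km


if __name__ == "__main__":
    # Two points a fraction of a degree apart (roughly city scale).
    print(relative_error(34.0, -117.4, 34.05, -117.3, ref_lat=34.0))
    # Two points several degrees apart (roughly state scale).
    print(relative_error(34.0, -117.0, 40.0, -123.0, ref_lat=34.0))
```

Running it shows a much smaller relative error for the nearby pair than for the distant pair, which matches the point on the slide: the flat-plane assumption is harmless at city scale and noticeable at state scale.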
Important Reminder
Do not leave here today thinking… “How am I going to code this R-tree thing?” or “What language should I use?”
Leave here thinking… “How is our group going to elicit the problem, design, build and test this piece of software? What is the best design process to use? How are we going to convince the professor (with the contents of our project binder) that we used a high-quality process to solve this problem?”
In particular, you probably want to spend a few weeks researching the design process before you even consider the particular application problem.