350 likes | 513 Views
Haas Career Management Center Database. IEOR 115 Ken Goldberg Final Presentation December, 10, 2011. Samuel Lin | AlliZheng | Tao Luo | Yorke Lee | Carney Lu Justin Pao | Butian Li | AdibKashem. Overview. Background.
E N D
Haas Career Management Center Database IEOR 115 Ken Goldberg Final Presentation December, 10, 2011 Samuel Lin | AlliZheng | Tao Luo | Yorke Lee | Carney Lu Justin Pao | Butian Li | AdibKashem
Background Haas School of Business Career Management Group • Provides candidates access to employers • Organizes on campus career events • Handles 1000+ electronic job postings per year • Ranked #4 by BusinessWeek survey • Maintains relationships with hiring managers and recruiters to track industry firm nights, career panels, corporate visits, alumni mixers and job fairs. • Plans workshops, panels, networking events and receptions
Background Existing Database Support Excel • Data stored in unrelated Excel spreadsheets • Custom reports formulated manually by tracking across many spreadsheets • Extremely tedious and time consuming • Possibility of Human Errors
Project Objective Clients Expectation • Ability to build customized reports instantly • Ability to search for records with specific properties • Ability to come up with intelligent recommendations for events, job listings etc. for students. • Reduce time and stress involved in data entry and management • Reduce human errors
Project Objective Database Objective • To provide ranking metrics to magazines such as US News • To provide all kinds of employment reports including, but not limited to: • Top Full-Time Employers • Class-wise Full-Time Employment Report • Employment By Industry/ Function • Advance functions • Salary Forecast • Class preferences based to career inclination • Search alumni nearby
Relational Schema Person(pid,fname, lname, mi, street_address, apt_no, zip_code4, salutation, gender) a. Student(sid,pid1, date_of_admission, date_of_graduation, department7) b. Professor(faculty_id,pid1) c. Supervisor(pid1, cid2) d. Staff(SSN,pid1) e. Advisor(SSN, pid1, department7) f. Philanthrophist(pid1) g. Potential_applicant(pid1, sourced_through) h. Subscriber(pid1) 2. Company (cid, name, address, liaison_name, liaison_tel, liaison_email)2a. Branch (cid2,branch no., address)3. University (uid, name, university_location4, liaison_name,liaison_tel, liaison_email)4. Location (zipcode, country, state, city, latitude, longitude)5. Job_Listing (listingid, title, description, job_location4, uploaded_by_company2, salary_range) 6. Job_Offer (sid1, company_id2, position, job_location4, salary, bonus, stock_options, date, accepted_or_declined) 7. Department (did, name, head_of_department1b)8. Course_instance (ccn, course_no8a, semester, instructor1b) 8a. Course(course_no, title, department7)
Relational Schema 9. Magazine_Publication (mid, name, address, liaison_name, liaison_tel, liaison_email)10. Data_Request (drid, date, magazine_or_publication9, data_requested, handled_by_staff1d, deadline, fulfillment_status)11. Ranking (rid, magazine_or_publication9, date, ranking_details)12. Social_Media_Account (acc_id, profile_link, staff_in_charge1d, acc_email, acc_password)13. Social_Media_Subscription(pid1, social_media_acc_id12, profile_email)14. Announcement (aid, date, title, description, posted_by_staff1d, posted_on_account12)15. Event (eid, created_by_staff1d, held_at_venue17, caterer_id16, title, date) a. Workshop (eid15, advisor_in_charge1e) b. Recruitment_Event (eid15, advisor_in_charge1e) c. Alumni_Mixer (eid15) d. Outreach_Event (eid15, hosted_by_university3) 16. Caterer (caterer_id, name, address, liaison_name, liaison_tel, liaison_email)17. Venue (vid, building, room, contact_name, contact_tel, contact_email)18. Donation (donation_id, donation_date, donation_amount, donated_by1, received_by1dspecial_notes)
Relational Schema • These are junction tables to help describe the n to nrelationships • 19. Working_Experience(sid1a, company_worked_for2, type_of_employment, start_date, end_date)20. Studied_At(sid1a, school_attended3, degree_type, field, start_date, end_date, GPA)21. Supervisor_Recommendation(sid1a, supervisor_pid1, cid2, date, letter)22. Professor_Recommendation(sid1a, professor_pid1, date, letter)23. Event_Attendance(eid15, attendee_pid1)24. Professor_In_Department(prof_pid1b, department7 )25. Event_Invite(eid15, cid2)26. Advisor_Hosts_Event(pid1,eid15) 27. Venue_Booking(staff_in_charge1d, booking_for_event15, date_reserved, venue_booked17)28. Caterer_Order(staff_in_charge1d, caterer_id16, date_of_order)29. Applies_For(sid, listingid, date, application) • 30. Student Takes Course in Semester Course_Taken(sid1, ccn8,grade) • 31. Professor Teaches Course in Semester Course_Taught(prof_pid1b, ccn8) • 32. Email_Address(pid1, email_address) • 33. Contact_Number(pid1, contact_no) • 34. Student_In_Department(sid1, department7) • 35. Advisor_In_Department(advisor_id1e, department7)
Normalization Unnormalized Relation Person (pid,fname, lname, mi, street_address, apt_no, city, state, country, zip_code, contact_no, email, salutation, gender) 1a. Student(sid,pid1, date_of_admission, date_of_graduation, department) 1e. Advisor(SSN, pid1, department7) 1NF Person (pid,fname, lname, mi, street_address, apt_no, zip_code4, salutation, gender) 1a. Student(sid,pid1, date_of_admission, date_of_graduation) 1e.Advisor(advisor_id,SSN, pid1) 32. Email_Address(pid1, email_address) 33. Contact_Number(pid1, contact_no) 24. Student_In_Department(sid1, department7) 25. Advisor_In_Department(advisor_id1e, department7)
Normalization Unnormalized Relation 1. Person(pid,fname, lname, mi, street_address, apt_no, city, state, country, zip_code, contact_no, email, salutation, gender) 2NF • Person (pid,fname, lname, mi, street_address, apt_no, zip_code4, salutation, gender) • 4. Location (zipcode, country, state, city, latitude, longitude) 3NF (Not Implemented) Location (zipcode, country, state, city) City_Location(Longitude,Latitude, City)
Normalization Unnormalized Relation 8. Course (course_id, title, department7, semester, instructor1b) BCNF 8. Course_instance (ccn, course_no8a, semester, instructor1b) 8a. Course(course_no, title, department7)
Queries Query 1: Salary forecast • Description: • Given a selected industry, predict the expected average salary for a new job position within that industry over the next X years • Useful to help job-seeking graduates easily visualize the salary trends for each industry
Queries Java Code: • Connects to Access database and executes the sql query: select o.year_offered, avg(o.salary) from [Job Offer] as o, Company as c, Department as i where o.status = 'Accepted' and o.cid = c.cid and c.iid = <iid> group by o.year_offered order by o.year_offered ASC; • This extracts the average salary for each year for the given industry. • The Java code performs a linear regression in order to determine the linear relationship between the year and expected salary, then creates a new table and inserts those predicted values (over the next X years) into the database
Queries Demo
Queries Query 2: Nearby alumni • Description: • Given a selected city, select all the alumni who are currently working within X km from that city • Used for current students to seek out seniors in the region (for advice, meet-ups), or to invite alumni to attend events that are being held in a certain city
Queries VBA Code: To retrieve all alumni working in nearby cities: SELECT s.SID, c.city, p.First_Name, p.Last_Name, o.position FROM [Job Offer] AS o, nearbyCities AS c, Student AS s, Person AS p WHERE o.sid=s.SID And o.job_location=c.LID And p.pid=s.pid; Contains function needed to determine distance between two geographical coordinates Iterates through location database and determines which cities are within x km of given city using the haversine formula: Creates and populates NearbyCities table with list of cities
Queries Demo
Queries Query 3: Donation Capacity • Description • Ranks alumni by their donation capacity • The donation capacity of each alumni is calculated by a weighted combination of several factors: salary, years out of school, donation history • The weights of these factors were researched and simplified from research papers) to result in the following formula: • Donation_Capacity = 0.26 * salary/10000 + 0.17 * years_out_of_school + 0.64 * sum_of_previous_donations/5000
Queries Retrieve the last recorded salary of each student SQL: SELECT s.sid, o.salary FROM student AS s, [Job Offer] AS o WHERE o.sid = s.sid and o.status = 'Accepted' and o.[date accepted] in ( select max(o2.[date accepted]) from [Job Offer] o2 where s.sid = o2.sid ); Retrieve the sum of all donations made by each student SQL: SELECT s.pid, Sum(d.donation_amount) AS total_donations FROM student AS s, donation AS d WHERE d.donated_by=s.pid GROUP BY s.pid;
Queries SQL: SELECT DISTINCT s.sid, p.First_name, p.Last_Name, ((0.26*ls.salary)/10000+(0.64*ds.total_donations)/5000+0.17*((Date()-s.[date of graduation])/365)) AS Donation_Capacity FROM person AS p, student AS s, donation AS d, (SELECT s2.pid, Sum(d2.donation_amount) AS total_donations FROM student AS s2, donation AS d2 WHERE d2.donated_by=s2.pid GROUP BY s2.pid) AS ds, (SELECT s3.sid, o3.salary FROM student AS s3, [Job Offer] AS o3 WHERE o3.sid = s3.sid and o3.status = 'Accepted' and o3.date accepted in ( select max(o2.[date accepted]) from [Job Offer] o2 where s3.sid = o2.sid )) AS ls WHERE s.sid=ls.sid And s.pid=p.pid And s.pid=ds.pid ORDER BY ((0.26*ls.salary)/10000+(0.64*ds.total_donations)/5000+0.17*((Date()-s.[date of graduation])/365)) DESC;
Queries Query 3: Donation Capacity
Queries Query 4: Magazine Influence • Description • Rank each business magazine/publication rankings by their influence on the number of job offers our graduates get • Help the career management center prioritize data requests from each magazine/publication • Perform a linear regression to obtain the coefficient of determination for each magazine
Queries Query 4: Magazine Influence • Java Code • Connects to Access database and executes the dynamically created SQL query for each magazine in the table: SELECT Count(*) AS Offer_Count, r.ranking, o.year_offered FROM [Job Offer] AS o, ranking AS r, [Magazine Publication] AS mp WHERE (((r.magazine_or_publication)=mp.mid) And ((o.year_offered)=r.year) And ((mp.mid)=<selected mid>)) GROUP BY o.year_offered, r.ranking; • This extracts a pairing of the number of offers received by graduates together with how highly we were ranked by that magazine that year. With these data pairs • Performs a regression to calculate the correlation factor between number of job offers received and our ranking in the magazine • Writes these correlation factors back into the database under a table called Influence
Queries Query 4: Magazine Influence SQL Code to retrieve magazines sorted by influence: SELECT i.mid, m.name, i.influence_factor FROM [Magazine Publication] AS m, Influence AS i WHERE m.mid=i.mid ORDER BY i.influence_factor DESC;
Queries Query 5: Course Recommendations • Description • Suggest recommended courses a student should take if he wants to enter a specific industry • Ranks each class by the number of times they were enrolled in by students who end up receiving offers from companies in a particular industry • SQL • SELECT c.course_id, c.title, Count(*) AS course_count • FROM course AS c, student AS s, [Job Offer] AS o, course_taken AS ct, company AS com • WHERE o.sid=s.sid And ct.sid=s.sid And ct.course_id=c.course_id And o.cid=com.cid And com.iid= <selected industry> • GROUP BY c.course_id, c.title • ORDER BY Count(*) DESC;
Queries Query 5: Course Recommendations
Q and As to ask